An Enterprise Application Architecture is the coordinating paradigm through which an organization’s IT assets inter-relate to form the computing infrastructure and support the business goals.
The choice of architecture impacts a range of issues including the cost of maintenance, the cost of development, security, and the availability of timely information.
Architecture in the physical world often conforms to a style. For the most part we know what to expect when we hear a building described as ‘Tudor’ or ‘Victorian.’ All this really means is that the builder and architect have agreed to incorporate certain features into their design which are representative of a given school of thought for which we have a label. Similarly, there are schools of thought in Enterprise Architecture which when followed produce equally distinctive architectural results. This paper is an overview of the more prominent of these schools.
The Default Architecture
Imagine an enterprise – and for many of us this might be quite easy – in which no unifying discipline is applied to the application development and design process. Applications are created as needs are discovered, their implementation is directed by that part of the organization providing the budget, and their scope is constrained by the organizational unit within which they are conceived. The applications themselves perhaps go through a rigorous and well understood development process which ends, often as an afterthought, with an integration task in which the shiny new application is plugged into the rest of the enterprise.
Imagine, further, that we are now looking at this enterprise after this approach to application development has been practiced for years, and perhaps decades. What we will see is an evolutionary enterprise architecture in which the impact of non-technical issues, such as the personalities of the managers and the distribution of budgets between lines of business, is clearly visible in the legacy fossil record. The architecture will be a collection of seemingly arbitrarily defined and scoped applications tightly coupled to each other through hand crafted point-to-point interfaces. The problems with this architecture become evident as the enterprise grows. In theory, if every application has to share data with every other application, then the number of interfaces will approximate n(n-1)/2, where n is the number of applications.
This is an O(n^2) growth in complexity, and it becomes a point of failure as the enterprise infrastructure grows and n becomes large. In reality, of course, not all of the applications share data with all the other applications, so in practice the growth is usually slower than this quadratic worst case, but it is nonetheless a problem. John Zachman, creator of the Zachman framework, describes what occurs when we create applications in this manner as ‘post integration.’ The difficulty with post-integration, as he points out, is with semantic consistency. It becomes increasingly difficult to make sure that what we mean by a piece of data in one application is what we mean by that same piece of data in a different system which has received the data through one or more interfaces.
Controlling, and indeed just discovering, the semantics of the data is a difficult undertaking with this architecture, and without a clear definition of the semantics virtually nothing can be done with confidence. Poorly controlled semantics, of course, are exacerbated by another characteristic of the default architecture, which is uncontrolled data replication. Our multiple applications each have their own database, with their own copies of data, which they interpret in their own way. The replicated copies of data, in turn, rapidly become out of sync with each other, leading to an environment in which data meaning, currency and validity are all uncertain.
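The worst-case interface count discussed above can be checked with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and it assumes the theoretical case in which every application shares data with every other application.

```python
from itertools import combinations


def point_to_point_interfaces(n: int) -> int:
    """Worst-case interface count when each of n applications talks to every other."""
    return n * (n - 1) // 2


# Cross-check the closed form by enumerating every pair of applications.
for n in (2, 5, 10, 50, 100):
    pairs = sum(1 for _ in combinations(range(n), 2))
    assert pairs == point_to_point_interfaces(n)
    print(f"{n:>3} applications -> {point_to_point_interfaces(n):>4} interfaces")
```

Doubling the number of applications roughly quadruples the number of interfaces, which is why the problem stays hidden in a small enterprise and dominates in a large one.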
A fundamental problem with the default architecture is application coupling, by which we mean that a change to any application will have a scope of effect beyond that application. The enterprise applications are all tangled together, a bit like a ball of string. This means that changes that should logically be simple, localized and cheap end up as complex, broad ranging and expensive. The problems with the default architecture are manageable in small enterprises. It is only as the enterprise becomes larger that they become impossible to control. This paper is an overview of the successive approaches which enterprises have employed in response to this issue.
The Integrated Database Architecture
With the advent of the scalable relational database in the mid-1980s, enterprises saw a possible solution to the complexity created by the default architecture. The theory was quite compelling: the enterprise should have a single, large database which implements an enterprise-wide conceptual data model. The semantics of the data would be centrally defined and under tight control. All of the application logic would operate on the single data store, and its central definition.
Consequently there would be no growth in the number of interfaces, because there would be no need to have interfaces. There would be no data consistency or replication problems, because there would only be one copy of the data. Constraints in the database would require data to be collected completely by all parts of the application logic, and to be consistent. This was a seemingly perfect solution which most enterprises embraced enthusiastically.
The problems with the integrated database architecture did not appear immediately. The primary problem, in fact, is one of change over time. The integrated database is a great point in time solution. The problem is that once we have all of our applications based on this single database we have created, in programming terminology, a single giant ‘global variable,’ which, if changed in any way, has a potential effect everywhere. In other words the integrated database gives us tremendous data integration at the cost of extremely tight application coupling.
If, for example, we wish to change the logic of an application, perhaps to send our customers birthday cards, we are changing the same data structure – the database – which we are using to run our mission critical systems, and we are potentially also having to change those mission critical systems even though they do not care at all about birthday cards. So, at the end of the day, we are more likely than not to decide that we won’t change the mission critical systems, for reasons of risk and cost, and that we will rather forgo the birthday card function. The tightly coupled integrated database, then, has a flexibility problem.
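The birthday card scenario can be reproduced in miniature. The sketch below uses Python's built-in sqlite3 module as a stand-in for the enterprise database; the table, the column names and the billing function are all hypothetical, chosen only to show how a schema change made for one application can break another that shares the same database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")


def billing_add_customer(conn, customer_id, name):
    """Hypothetical mission-critical code that assumes the table has exactly two columns."""
    conn.execute("INSERT INTO customer VALUES (?, ?)", (customer_id, name))


billing_add_customer(conn, 1, "Ada")  # works against the original schema

# The birthday card feature changes the shared schema...
conn.execute("ALTER TABLE customer ADD COLUMN birthday TEXT")

# ...and the billing code, which does not care about birthdays, now fails.
try:
    billing_add_customer(conn, 2, "Grace")
    billing_broken = False
except sqlite3.OperationalError as exc:
    billing_broken = True
    print("billing failed:", exc)
```

Real systems are usually written more defensively than this, but the dependency is the same in kind: every application is coupled to every detail of the one shared schema.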
Business processes change over time, and each change potentially impacts all of our applications. This means that making any single change is disproportionately expensive, and tends to be resisted, producing a non-responsive IT support infrastructure. When the business cannot change its core systems cost effectively, enterprising business users and IT managers will typically conclude that the obvious answer is to build ‘their own’ little system in parallel to the integrated database and then build an interface. This can often look like a constellation of Microsoft Access databases circling a mainframe, and performing both pre and post processing to support the business’ actual processes. In due course the peripheral applications become as important as the integrated database applications, and eventually the integrated database architecture begins to look like the default architecture, with one or more anomalously large applications. In the end, the integrated database architecture fails because it cannot inexpensively accommodate rapidly changing business processes, which are a hallmark of the modern enterprise.
The Distributed Object Architecture
The arrival of Object Orientation in the early 1990s heralded yet another approach to enterprise architecture. This approach said, in effect, that the problem with the integrated database architecture was one of programming. In order to create an application a developer would have to understand this large complex schema, and would then create logic to manipulate it. This logic, or behavior, defined for the data does, in effect, define the semantics of that data. Having developers re-define logic each time for core bits of functionality creates a ‘semantic drift,’ where the actually implemented behavior, from application to application, is inconsistent.
The distributed object architecture is a discipline which requires the enterprise to create an object representation of its core concepts, such as Customer, Order, and so on. When developers create an application they do so by invoking this predefined behavior, thus ensuring semantic equivalence between applications. The distributed object architecture is attractive in so far as the object analysis process extends logically from the Semantic Model, and leads to a centrally defined and controlled definition of data semantics and process. It is clearly an improvement over straight procedural logic sitting on top of a global database schema, as in the Integrated Database architecture, but it is only an incremental improvement.
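As a minimal sketch of the idea, consider a shared Customer object whose behavior is defined once and invoked by every application. The class, its fields and the 365-day rule are assumptions made for illustration; a real distributed object would be reached through remote invocation rather than a local import.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    """Hypothetical enterprise-wide Customer object."""
    name: str
    last_order: date

    def is_active(self, today: date) -> bool:
        # The single, centrally defined meaning of "active":
        # the customer has ordered within the last 365 days.
        return (today - self.last_order).days <= 365


# Two applications reuse the predefined behavior instead of re-deriving it.
def marketing_should_mail(customer: Customer, today: date) -> bool:
    return customer.is_active(today)


def support_priority(customer: Customer, today: date) -> str:
    return "standard" if customer.is_active(today) else "lapsed"
```

Because both applications call is_active rather than re-implementing it, they cannot drift apart on what an active customer is. They are, however, both coupled to the one shared class.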
The core limitations of the Integrated Database architecture, namely tight coupling and inflexibility, live on in the Distributed Object Architecture, which is not a surprise given that this approach is really nothing more than an object veneer over the integrated database. The distributed object architecture is implemented in a variety of technologies, including Enterprise JavaBeans, Microsoft DCOM, and the Common Object Request Broker Architecture (CORBA).
The Message Bus Architecture
Flexibility has become one of the most important qualities of enterprise application architectures. ‘Flexibility’ is the capacity to change elements of the architecture at acceptable cost. The key to creating a flexible architecture is to decouple the independent pieces from one another, such that a change to one of the pieces does not unnecessarily require a change in any of the other independent pieces.
This capability is what has been missing from the prior architectures, and this is the primary contribution of the Message Bus Architecture. The message bus architecture returns us to an environment of independent applications maintaining their own databases. We add to this (typically) an ‘integration broker’, which is broadly responsible for communicating data between applications. The data communicated in this way is referred to as a message. By introducing the message broker as an intermediary, we are able to decouple applications from one another. Semantic consistency is enforced by representing the enterprise conceptual data model as a message model, or a centrally controlled message schema.
The n(n-1)/2 point-to-point interfaces are replaced by n interfaces – one from each application to the broker. We necessarily introduce a degree of data replication, but we control the replication through a change notification mechanism provided by the broker, typically in the form of publish/subscribe messages.
This controlled replication manages the data consistency issues, while at the same time creating a degree of ‘runtime decoupling,’ which allows the independent applications to operate even though other parts of the infrastructure may be unavailable. In this environment applications can be implemented in any technology and using whatever database schema they choose.
Their obligation to the enterprise is to generate a set of defined messages conforming to the message model, and to process incoming messages. They are free to change as and when they wish, as long as they continue to support their message contract. This is what is meant by decoupling, and this is where flexibility originates.
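The message contract described above can be sketched with a small in-memory broker. Everything here is an assumption made for illustration: the Broker class, the CustomerChanged message type and its fields are hypothetical, and a real integration broker would add persistence, queuing and delivery guarantees that this sketch omits.

```python
from collections import defaultdict

# Hypothetical message model: message type -> the exact set of required fields.
MESSAGE_MODEL = {"CustomerChanged": {"customer_id", "name"}}


class Broker:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, message_type, handler):
        self._subscribers[message_type].append(handler)

    def publish(self, message_type, payload):
        # Enforce the centrally controlled message schema.
        required = MESSAGE_MODEL.get(message_type)
        if required is None or set(payload) != required:
            raise ValueError(f"{message_type} does not conform to the message model")
        for handler in self._subscribers[message_type]:
            handler(dict(payload))  # each subscriber receives its own copy


# Two independent applications keep their own replicas in their own format.
billing_copy, shipping_copy = {}, {}
broker = Broker()
broker.subscribe("CustomerChanged",
                 lambda m: billing_copy.update({m["customer_id"]: m["name"]}))
broker.subscribe("CustomerChanged",
                 lambda m: shipping_copy.update({m["customer_id"]: m["name"].upper()}))

broker.publish("CustomerChanged", {"customer_id": 7, "name": "Ada"})
print(billing_copy, shipping_copy)
```

The publisher knows nothing about its subscribers, and each subscriber stores the data however it likes. Either side can be rewritten freely as long as the CustomerChanged contract is honored.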
In a Message Bus architecture the message routing function, that is to say the logic which controls where a message is delivered, can be centralized in the broker without a loss of decoupling. When this is done, it becomes possible to see the message broker as a business process management (BPM) tool, and as a means of implementing enterprise wide workflow through the addition of rules. When fully supported by the applications, the Message Bus Architecture allows the implementation of the ‘real time enterprise,’ in which all business events, regardless of origin, appear on the message bus and can be consumed by any interested application.
This can become especially interesting when events generated by external business partners reach the internal message bus, and vice-versa. The Message Bus Architecture requires careful implementation to provide true decoupling and flexibility. It is quite possible to create a network of point-to-point logical interfaces over the single technology interface to the broker. This occurs when applications create ad-hoc messages for every integration case; the solution here is to proactively architect the message model and ensure that it is not circumvented at the application level. The Message Bus Architecture is not, however, a complete solution.
At the technology level it is usually implemented with proprietary technology for the message broker, which is expensive to buy, and requires scarce and equally expensive personnel to use. The distributed nature of the solution necessarily creates multiple points of failure – which can be mitigated through careful design to maximize runtime decoupling – and one central point of failure in the integration broker. Performance issues are also a potential problem; poor application partitioning can create excessively high volumes of messages, and some use cases can be impacted through high network latency. The Message Bus Architecture is a viable solution, but it is not a trivial implementation.
The Service Oriented Architecture
The Service Oriented Architecture is a refinement on the Message Bus Architecture. The advance with this architecture is the realization that many large granularity functions are automated in the enterprise in multiple places.
Many of our applications will do reporting, most applications will implement a user interface, most will concern themselves with security, and most will implement some form of business logic, and so on. The Service Oriented Architecture posits that the applications should be refactored and these pieces of functionality should be removed from the applications and implemented as a single ‘service’ which can be invoked at runtime. So, for example, reporting becomes a responsibility of the Information Delivery service, which might be implemented through a data warehouse, the user interface might be delegated to a portal service, the security functions will be implemented by an authentication and authorization service, and business logic perhaps by a business rules service.
The likely candidates for service orientation tend to be business neutral, in part because these functions appear repeatedly across the application inventory. The trade off of creating a service is that we have potentially created runtime-coupling between the service and the invoking application, and consequently created a point of failure. The benefit is the reduction of redundant functionality, and its central control and unification.
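The trade-off can be made concrete with a small sketch in which two applications delegate authentication to one shared service. The class names, the credential store and the availability flag are hypothetical and exist only to illustrate the point; the service is invoked in-process here, whereas a real one would be called over the network.

```python
class AuthService:
    """Hypothetical shared authentication service."""

    def __init__(self):
        self._users = {"ada": "s3cret"}  # illustrative credential store
        self.available = True

    def authenticate(self, user, password):
        if not self.available:
            raise ConnectionError("authentication service unavailable")
        return self._users.get(user) == password


class Application:
    """An application that no longer implements its own security logic."""

    def __init__(self, name, auth):
        self.name = name
        self._auth = auth

    def login(self, user, password):
        return "welcome" if self._auth.authenticate(user, password) else "denied"


auth = AuthService()
orders = Application("orders", auth)
reports = Application("reports", auth)

print(orders.login("ada", "s3cret"))
print(reports.login("ada", "wrong"))

# Runtime coupling: when the service is down, every invoking application is affected.
auth.available = False
try:
    orders.login("ada", "s3cret")
    outage_seen = False
except ConnectionError:
    outage_seen = True
```

The security rules now live in exactly one place, which is the benefit. The cost is visible in the last few lines: an outage in the service becomes an outage in every application that invokes it.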
Taken to a logical extreme, Service Orientation will allow applications to divest themselves of the responsibility for security, business logic, workflow and presentation, leaving very little beyond data store and configuration. Service Orientation can be implemented in a messaging environment using a broker; however, this is not a requirement. Much of the current literature confuses the implementation technology with the concept, especially where the implementation technology is Web services.
The Orchestrated Web-Service Architecture
The latest technology trend is Web services. Web services are positioned to become the open standards based implementation of the Message Bus Architecture. Where applications currently communicate with the Message Bus using a vendor proprietary adaptor we will have a standard Web service interface instead.
Where the Message Bus Architecture performs message routing using a proprietary extensional routing – or orchestration – tool, or using intensional publish/subscribe logic, the Web service architecture will use the corresponding Web service standard, at present BPEL4WS. Where the Message Bus Architecture implements guaranteed delivery through proprietary queuing mechanisms such as IBM MQSeries and others, the Web service architecture will use upcoming standards such as HTTP-R or Web services reliable messaging.
The Web services standards are currently incomplete and do not fully overlap the proprietary product offerings; however, the promise is clearly that in the near future Web services will offer an open standards alternative. Web services are by nature point-to-point connections. Used naively, this will create a technically state-of-the-art implementation of the Default Architecture, with applications tightly bound to each other through many uncontrolled interfaces. The Orchestrated Web service architecture, consequently, introduces a broker to which all Web service calls are made, and which is responsible for forwarding those requests to the applications providing the service.
This centralized orchestration is what allows the Web service approach to remain decoupled. Similarly, by implementing asynchronous request/reply logic – which is to say the requestor does not block waiting for the reply – and by supplementing the standard Web service call over HTTP with guaranteed delivery, the broker is able to create an environment which is similar to that of the Message Bus Architecture. The Web service architecture is practical today, supplemented with various proprietary technologies. It represents an improvement over the Message Bus Architecture by being based on open standards and consequently reducing vendor lock-in.
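The forwarding and non-blocking behavior described above can be sketched with Python's asyncio. The Orchestrator class, the CreditCheck service and the message fields are hypothetical. The sketch uses in-process coroutines rather than real Web service calls over HTTP, and it makes no attempt at guaranteed delivery.

```python
import asyncio


class Orchestrator:
    """Hypothetical broker that forwards each request to the registered provider."""

    def __init__(self):
        self._providers = {}

    def register(self, service_name, provider):
        self._providers[service_name] = provider

    async def call(self, service_name, request):
        provider = self._providers[service_name]
        return await provider(request)


async def credit_check(request):
    await asyncio.sleep(0.01)  # stands in for network latency
    return {"customer_id": request["customer_id"], "approved": True}


async def requestor(orchestrator):
    # Send the request, then carry on with other work instead of blocking.
    pending = asyncio.create_task(
        orchestrator.call("CreditCheck", {"customer_id": 7}))
    other_work = "order form rendered"
    reply = await pending  # collect the reply once it is actually needed
    return other_work, reply


orchestrator = Orchestrator()
orchestrator.register("CreditCheck", credit_check)
result = asyncio.run(requestor(orchestrator))
print(result)
```

The requestor names only the service, never the provider, so the provider behind CreditCheck can be replaced by registering a different coroutine with the orchestrator.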
The Just-in-time Integration Architecture
One of the interesting capabilities which the Web service technologies introduced is the concept of runtime discovery. The UDDI Web service specification allows an application to find a service at runtime, to bind to it, and invoke it. The client application searches for the service based on service categorization and conformance to an interface specification – in this case a WSDL document.
This capability allows us to conceive of an architecture in which applications and services create Web service interfaces and place their WSDL descriptions in the enterprise UDDI repository. When an application wishes to invoke a service it looks it up in the repository and invokes it. The key benefit of this approach is that the inter-application binding is entirely dynamic and consequently decoupled; we can replace the service provider at any time simply by changing its entry in the UDDI repository. With this approach there is no broker, and consequently there are no centrally provided management and control functions. However, in a decentralized, internet-based situation this may be an appropriate architectural choice.
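Runtime lookup and provider replacement can be illustrated with an in-memory registry. This is a stand-in for the idea only and not the UDDI API: the ServiceRegistry class, the category and interface names and the two tax calculators are all hypothetical, and the providers are plain functions rather than Web services described by WSDL.

```python
class ServiceRegistry:
    """In-memory stand-in for a service repository, keyed by category and interface."""

    def __init__(self):
        self._entries = {}

    def publish(self, category, interface, endpoint):
        self._entries[(category, interface)] = endpoint

    def find(self, category, interface):
        return self._entries[(category, interface)]


# Two hypothetical providers that conform to the same interface (amounts in cents).
def tax_provider_a(amount_cents):
    return amount_cents * 10 // 100


def tax_provider_b(amount_cents):
    return amount_cents * 12 // 100


registry = ServiceRegistry()
registry.publish("finance", "TaxCalculator", tax_provider_a)


def invoice_total(amount_cents):
    # The binding is resolved on every call, not fixed at build or deploy time.
    calculate_tax = registry.find("finance", "TaxCalculator")
    return amount_cents + calculate_tax(amount_cents)


before = invoice_total(100)
registry.publish("finance", "TaxCalculator", tax_provider_b)  # swap the provider
after = invoice_total(100)
print(before, after)
```

The invoking code is unchanged between the two calls; only the registry entry differs. Equally, nothing in this arrangement monitors or controls the call, which is the missing management function noted above.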
The choice of Enterprise Application architecture is critical to creating a successful IT infrastructure which is responsive to the business needs and which reinforces the qualities which are of value to the organization. All of the schools of architecture which are described here can be valid choices, just as building a Victorian style house is as legitimate a decision as building in a Tudor style. It is the responsibility of the architect, however, to ensure that the chosen architecture is appropriate for its environment. Although these architectural schools are evolving, and new ones are being created, most enterprises are clearly in a position to benefit from the adoption of a defined enterprise application architecture.