Semantic technologies can help IT organizations improve productivity by creating simpler, more elegant enterprise architectures. Thanks to its unique approach to organizing, identifying and reusing elements, a semantic model can reduce the size of a schema by a factor of 100. Reducing the complexity of a shared schema can greatly improve a federated information system’s productivity, flexibility and performance.
Elegance is the art of reducing complexity in shared schemas, which can greatly improve the productivity, flexibility and performance of a federated information system.
For the past three weeks, I have been blogging about how to use semantic technology to reduce schema complexity and consequently lower IT total cost of ownership. To get a comprehensive view of these ideas, please download my whitepaper: Elegance: A New Approach to Enterprise Modeling.
Other Semantic Arts whitepapers that might interest you:
The Executive’s Essential Guide To Semantic Technology
What you need to know about the upcoming semantic technology revolution and Web 3.0
Semantic Technology Comes Of Age In A Brave New World
How to leverage your legacy IT investments to build competitive advantage
The Bottom Line – The Promise Of Semantic Technology
How semantic technology can optimize your company’s IT total cost of ownership
The world of traditional information technology is black or white. If something isn’t the same, then it is different. Every new distinction requires the creation of a new table. This creates a problem because once a new table is created, the concept is considered new and distinct from every other concept, causing redundancy and confusion. In the semantic world, shades of grey are tolerated. Once you formally define a concept, the semantic model creates a ‘web of similarity’ that enables the inference engine to associate the new concept with other classes that are similar. Queries are able to search not only for explicit information defined in the black and white world, but also for the ‘grey’ information found in the inferred subclasses. This enables users to ask much simpler questions and get much more complete information in return.
Inference is one of the ways that semantic technology simplifies information systems. Many of the manual assertions that must be made in the traditional systems can happen automatically in a semantic model. The more things you can infer, the fewer you need to assert, which greatly simplifies the system.
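To make the idea of "infer more, assert less" concrete, here is a minimal sketch in plain Python. It is not a real inference engine, and the class names and records are hypothetical. Each record is asserted with only its most specific type, and the query works out the broader classes on its own.

```python
# Hypothetical mini-ontology: each class lists its direct superclasses.
SUBCLASS_OF = {
    "PatientVisit": {"Event"},
    "EmergencyVisit": {"PatientVisit"},
    "Surgery": {"Event"},
}

# Only the most specific type of each record is asserted (hypothetical data).
ASSERTED_TYPE = {
    "visit-1": "PatientVisit",
    "visit-2": "EmergencyVisit",
    "op-1": "Surgery",
}


def superclasses(cls):
    """Return cls plus every class it is directly or indirectly a subclass of."""
    found = {cls}
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def instances_of(cls):
    """Find records whose asserted type is cls or any subclass of cls."""
    return sorted(
        record
        for record, asserted in ASSERTED_TYPE.items()
        if cls in superclasses(asserted)
    )


print(instances_of("PatientVisit"))  # ['visit-1', 'visit-2']
print(instances_of("Event"))         # ['op-1', 'visit-1', 'visit-2']
```

Nobody asserted that `visit-2` is an event or a patient visit. Both facts follow from the class definitions, so a simple question about events still returns it.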
Another way semantic technology reduces complexity is by reusing predefined concepts through composition. This might sound a lot like the inheritance concept that is widely used in object-oriented programming, but it is not. Let’s say that we want to define a trip to the doctor as an event that happens in a clinic. We would call that concept a ‘patient visit.’ As with object-oriented inheritance, the patient visit will inherit some of the attributes associated with the event, such as the date. Unlike object-oriented inheritance, the clinic concept contributes to the definition of what a patient visit is, but the patient visit doesn’t inherit any of the attributes of a clinic. A visit isn’t a type of clinic, yet the clinic is part of the definition of ‘patient visit.’
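The patient visit definition can be sketched roughly as follows. This is an illustration in plain Python, assuming a simple set of subject-property-value facts. The property names (`locatedIn`, `occursOn`, `hasAddress`) and the records are hypothetical.

```python
# Hypothetical facts stored as (subject, property, value) triples.
FACTS = {
    ("clinic-7", "type", "Clinic"),
    ("clinic-7", "hasAddress", "12 Main St"),
    ("appt-1", "type", "Event"),
    ("appt-1", "occursOn", "2024-03-01"),
    ("appt-1", "locatedIn", "clinic-7"),
    ("party-1", "type", "Event"),
    ("party-1", "occursOn", "2024-03-02"),
}


def values(subject, prop):
    """Return every value recorded for the given subject and property."""
    return {v for s, p, v in FACTS if s == subject and p == prop}


def is_patient_visit(subject):
    """A patient visit is defined here as an Event located in some Clinic."""
    is_event = "Event" in values(subject, "type")
    in_clinic = any(
        "Clinic" in values(place, "type")
        for place in values(subject, "locatedIn")
    )
    return is_event and in_clinic


print(is_patient_visit("appt-1"))      # True
print(is_patient_visit("party-1"))     # False
print(values("appt-1", "occursOn"))    # {'2024-03-01'}
print(values("appt-1", "hasAddress"))  # set()
```

The clinic takes part in the definition, yet the visit carries the event's date and none of the clinic's attributes, such as its address.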
Semantic concepts are like words that can be assembled in an infinite number of ways. Objects in object-oriented programming are more like phrases; they can definitely be reused but with many more limitations. Reusing classes and properties significantly reduces the number of attributes required to model a business process, so we have much more flexibility in how concepts can be reused.
Since the semantic model is free of structure, it is easier to map concepts to different structural representations enabling reuse of classes and properties.
Reuse is a profound way to reduce complexity. In a traditional system, attributes are not reused. Every time you create a new table and put attributes on it, you have created additional attributes. Even if you give them the same names as existing attributes, there is no guarantee that they mean the same thing on the new table, and the technology treats them as different.
Property reuse enables a drastic reduction in the number of properties needed to represent the complexity of a large domain. In semantics, properties are first-class objects, which means that they exist independent of any class or table. They can be reused and still retain their original meaning, so you need fewer of them. Additionally, because properties are first-class objects we can define relationships between properties.

For example, if we declare that the property “hasParent” is a sub-property of the property “hasAncestor,” then anywhere we assert that someone has a particular parent, a semantic reasoner infers that the parent is also an ancestor of that person.
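A minimal sketch of that sub-property rule, in plain Python rather than a real reasoner. The people and the `infer` helper are hypothetical, and only the hasParent and hasAncestor names come from the example above.

```python
# Each property maps to the property it is declared a sub-property of.
SUBPROPERTY_OF = {"hasParent": "hasAncestor"}

# Hypothetical asserted facts: (subject, property, object).
ASSERTED = {
    ("Ann", "hasParent", "Bob"),
    ("Bob", "hasParent", "Cara"),
}


def infer(triples):
    """Return the triples plus one inferred triple per super-property."""
    inferred = set(triples)
    for subject, prop, obj in triples:
        # Walk up the sub-property chain, adding a triple at each step.
        while prop in SUBPROPERTY_OF:
            prop = SUBPROPERTY_OF[prop]
            inferred.add((subject, prop, obj))
    return inferred


facts = infer(ASSERTED)
print(("Ann", "hasAncestor", "Bob") in facts)  # True
print(len(facts))                              # 4
```

Only the two parent facts were asserted, and the two ancestor facts were derived. The sketch applies the sub-property rule alone. Concluding that Cara is also Ann's ancestor would require declaring hasAncestor transitive, which is not modeled here.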
In semantic modeling the definition of the classes and properties is separate from the process of building applications or database structures. This separation frees up the modeler to focus on the meaning and inclusion criteria for each class and intentionally avoid having to make decisions about how to store, structure or organize the information within the system.
Semantic technology uses ontologies to describe the business in a way that both humans and machines can understand. Since the semantic schema is independent from actual computer systems, e.g., legacy or future applications and databases, it allows us to find the commonalities across business processes, which serves to greatly simplify the enterprise architecture.
Generally, people believe that the real world is complex and messy and that our information systems bring order out of chaos. However, the opposite is true. The real world tends to be much simpler than how we represent it in our database and application schemas. For example, take a stapler. Our purchasing system refers to it as an item with a price; our manufacturing system breaks it down into an itemized list of plastic and metal; and our administrative system keeps track of how many of them are on the shelf above the printer. However, in the real world it is just a thing that shoots a bit of metal through sheets of paper to fasten them together. Since semantic technology defines things separately from the applications or databases, you only have to define each thing once. In traditional IT systems, each thing is redefined within the context of what the information system is tracking, thereby creating multiple references to a single thing.
Sentara Healthcare is a $3 billion conglomerate that includes hospitals, clinics, physician networks, insurance companies, research centers, etc. It offers thousands of services to over two million patients. It is a very complex business. Always on the cutting edge of applying information systems technology to improving healthcare delivery, Sentara engaged Semantic Arts to build the first comprehensive model of a healthcare delivery system. After an exhaustive analysis of the thousands of interrelated processes required to run a complex healthcare business, the Sentara Healthcare Enterprise Ontology included only 1,276 classes and 397 properties. We are currently applying the ontology as a common denominator to align internal and external data without needing to make any significant increases or structural changes to the definitional schema.
These are examples of how large, complex organizations with hundreds of thousands of elements in their collective schemas can create integrated models of reasonable coverage and fidelity with around 1,000 – 2,000 concepts. By taking a semantic approach to building these simple, elegant and powerful schemas, we have found a way to reduce complexity.
At Procter & Gamble, over 10,000 people work in R&D in hundreds of different disciplines. P&G had no traditional information systems capturing the critical information in this brain trust. They engaged Semantic Arts to create an ontology that would organize this vast body of unstructured information into a definitional model that could be used to build future databases and applications. This exercise resulted in the creation of an ontology that had only 400 classes and 200 properties.
We then decided to see how the model would change as we drilled down to model the specific elements of actual lines of business. We applied the definitional model to two lines of business, batteries and toothbrushes. We defined these businesses to the level of detail required to build a new database enabling cross-disciplinary search. Because there were many more details to represent, this effort increased the number of classes by 50% while adding only four new properties to the ontology.
Considering that a property is the equivalent of a new column in a database table, this exercise proved the elegance of the new schema. The increased simplicity will enable P&G researchers to conduct cross-functional searches without an intimate knowledge of the database structure, making them more efficient, effective and autonomous.
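The P&G figures above work out as follows. This is a quick back-of-the-envelope check, assuming the 50% increase is measured against the original 400 classes.

```python
# Figures quoted in the text for the initial P&G ontology.
base_classes = 400
base_properties = 200

# Assumption: the 50% growth is relative to the original class count.
classes_after = base_classes + base_classes * 50 // 100
properties_after = base_properties + 4

# Property growth as a percentage of the original property count.
property_growth_pct = 100 * 4 / base_properties

print(classes_after)        # 600
print(properties_after)     # 204
print(property_growth_pct)  # 2.0
```

Under that assumption, classes grew by half while properties grew by about 2%.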
Reducing the complexity of a shared schema can greatly improve a federated information system’s productivity, flexibility and performance. In most large enterprises there are between 100,000 and 1,000,000 elements in the collective applications and message schemas, yet, in working with our clients, we have learned that only 1,000-2,000 concepts make a critical difference in defining the essential processes that manage a business.
Think about it. To reduce 100,000 concepts down to 1,000 reduces your complexity by a factor of 100. To reduce a million concepts down to a thousand reduces your complexity by a factor of 1,000!
So what does that mean? With only 1,000 concepts, software designers can understand the entire model. They can see the entire schema on one page. They can share the model with managers, who for the first time can actually see how the organization works, which enables them to govern the change process. It empowers executives to ask for changes that help them run the business better. It inspires change.
Reducing complexity a hundredfold is a bold claim. It might even make you skeptical. Well, we can prove it. Tune in again for the next two days for some real-world examples of where we have been able to reduce the enterprise models of large, complex organizations with hundreds of thousands of elements down to 1,000 – 2,000 concepts.
Some people make money the old-fashioned way… they inherit it. Legacy systems are the primary cause of schema complexity. Thanks to traditional design approaches, limitations imposed by tool sets, and bad habits, we create and perpetuate systems that are ‘too big to fail.’
Let’s look at some common examples:
- One bad habit in the database management world is table proliferation. Typically, when someone encounters a new type of data, they build a new table. Because they want to accurately reflect what is unique about the data, the new tables have lots of new columns with new names. Now, applications that want to access the information in the new table need a new application interface, which in turn requires more new processes to support it.
- Another common source of complexity is the use of packaged software. Practical business users don’t want to ‘reinvent the wheel,’ so they buy off-the-shelf software applications or ‘software as a service.’ Each package or service has its own conceptual, logical and physical model. To be used, the software needs to be customized, which requires new systems integration application interfaces and creates more complexity.
- Sometimes complexity is created by a lack of clarity about the business concept or process that the information is supposed to represent. Different people use different terms to describe things that are very similar. For example, in a recent financial application we studied, we found 24 different data types to describe how the organization handles money that is promised: Budgeted, Estimated, Approximate, Allocated, Allotted, Predicted, SetAside, Projected, Earned, Reserved, Assumed, Committed, HeldBack, Granted, Expected, Unreconciled, Proforma, Assessed, Planned, Allowed, Appropriation, Funded, Granted, and Designated.
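As a rough illustration of the last point, the sketch below treats a handful of the terms listed above as kinds of one shared concept, so that a single question covers records labeled in different ways. The concept name `PromisedMoney`, the records and the amounts are all hypothetical. The source does not say how these terms were actually reconciled.

```python
# Hypothetical shared concept; the text does not name one.
SHARED_CONCEPT = "PromisedMoney"

# A few of the terms from the list above, each treated as a kind of the
# shared concept (an assumption made for illustration).
KIND_OF = {
    term: SHARED_CONCEPT
    for term in ("Budgeted", "Allocated", "Reserved", "Committed", "Planned")
}

# Hypothetical records as separate systems might label them.
RECORDS = [
    {"id": 1, "status": "Budgeted", "amount": 500},
    {"id": 2, "status": "Reserved", "amount": 200},
    {"id": 3, "status": "Paid", "amount": 900},
]


def total_for(concept):
    """Sum the amounts of all records whose status is a kind of concept."""
    return sum(
        record["amount"]
        for record in RECORDS
        if KIND_OF.get(record["status"]) == concept
    )


print(total_for(SHARED_CONCEPT))  # 700
```

The query names one concept and still picks up both the budgeted and the reserved amounts, while the unrelated status is left out.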
The impact of these common problems is that, in most large organizations, federated systems, e.g., systems of systems, have become unmanageably complex.
Information systems are made up of nine basic elements: data, content, users, schemas, code, applications, user interfaces, application interfaces and processes. Let’s look at the impact of growing each of these elements on the complexity, and resulting costs, of the information system.

If you increase a database from 1,000 records to a million records, you see an improved economy of scale. With a minimal amount of incremental effort, you could create a much more comprehensive information repository. Similarly, if the number of users increased a hundredfold, the system would be more productive and would deliver a higher ROI.

On the other hand, if you increase your database schema from 1,000 to a million columns or elements, the cost of managing the increased complexity would skyrocket. Other common causes of increased complexity are all the interfaces we build between applications, between applications and users, and between applications and databases. Shortsighted systems integration patches add millions of lines of code, greatly increasing the complexity of the systems and creating significant diseconomies of scale as the system grows.
Other elements, such as content and processes, have a linear relationship between complexity and scale. Adding more elements costs you more money, but these increases don’t have a remarkably positive or negative impact on the ROI of the technology investment.