Six Enterprise Knowledge Graph Anti-Patterns

The truth is that, despite the outsider’s perception that the world of technology just keeps getting better, faster and less expensive, over 70% of enterprise digital transformations fail to achieve their objectives (McKinsey, 2019). Most commentary on the subject points the finger at executive priorities, strategic goal-setting and organizational management. But the seldom-discussed root cause of this failure is complexity, and more specifically integration debt. Integration debt is the ongoing cost of integrating data from a large number of enterprise applications, each of which has its own local, incompatible data model.

Leading organizations are starting to address this proliferation of local data models by adopting a more “data-centric” architecture. A data-centric architecture models a singular representation of the organization’s core data in a knowledge graph. The knowledge graph simplifies and harmonizes data from enterprise applications. Organizations can see a 1000-fold decrease in complexity across the enterprise data landscape.

As organizations make this move, many stumble, falter and fail. This leads to what Gartner calls the trough of disillusionment. But it doesn’t have to. With the right skills and approach, the technology is ready to deliver. With careful execution and the right Sherpa, you can make it up the mountain. The ascent is worth it. Knowledge graphs are changing the game for those who succeed. We are sharing six common anti-patterns for enterprise knowledge graphs to keep you heading in the right direction.

Anti-pattern #1 — Agreeing with the Status Quo

“This is how enterprise architecture is done”

The most progress-deadening perspective you can hold is that the future is going to be just like the present. The most challenging aspect of moving to an enterprise knowledge graph is not technical; it is the mindset that embraces the status quo. The way to break through to those who are entrenched is to educate them and show them a better way, preferably with their own data.

People who make decisions in your organization must view this technology for the radical departure that it is from relational databases, data warehouses or data lakes. If you don’t secure this distinction, you will be in danger of creating yet another silo of data, in this case a semantic silo.

You must personally step boldly enough into this mindset and educate yourself to the point that you can see a coherent vision of this next generation architecture. Your vision has to be nearly white-hot, because in the beginning no one will understand you and they will disbelieve that such a revolutionary approach is possible. Convince yourself it is possible.

It is not just the boldness of the vision that matters, but its breadth as well. The vision is a transformation for the entire organization, and you must keep this in mind. If you can find a few advocates across functional boundaries, you may be able to demonstrate what data harmonization is capable of: astounding flexibility, intricate insights and a broad purview of your enterprise that is holistic yet simple.

Anti-pattern #2 — Fad Surfing

“This is cool, let’s buy it and see what it does”

The tech world moves quickly from one favored technology to the next. Fad surfing refers to plucking technologies from the latest trend and trying them out. We have seen NoSQL, Hadoop, Kafka, Data Lakes, Data Warehouses, AI and Snowflake all fly by and get picked up by organizations struggling to get a handle on their data. The problem with fad surfing is that organizations get used to trying and quickly discarding technologies. This might work if you were trying out an application or a shirt, but with enterprise architecture it’s a little like trying on a new foundation for your house to see how it works.

Data-centric thinking is not, strictly speaking, a technology. It is an approach that uses a knowledge graph, semantic modeling and model-driven everything. The move needs to square with corporate strategic planning and have people dedicated to making a longer-term investment in the design, execution and integration of the core ontology. Impulse buying is not a recipe for success. The maturation phase is longer than can be sustained by someone who tries a technology out and then shows it off to gain support. With this mindset you will be forced to move on quickly, perhaps feeling that you have exercised due diligence, but rejecting the approach because either your organization or the technology was “not ready”.

Anti-pattern #3 — Too Small

“We can get that from the old system”

Have you ever completed a successful proof of concept only to have it shelved because there was no enthusiasm for a larger implementation? It happens a lot. The same thing will happen with your enterprise knowledge graph proof of concept if the scope is too small. If you show someone a graph solution that embodies a single domain, they will feel they could have gotten the answer using the status-quo relational system.

The way to avoid this frustration is to select two or three domains that are currently difficult to reconcile. This shows that even if the queries could have been answered before, they would have been time-consuming, difficult one-offs that required a great deal of specialized knowledge to complete. It also demonstrates the knowledge graph’s serviceability: its ability to easily answer questions that span difficult-to-reconcile data sources with different levels of abstraction, or domains that do not share a common nomenclature. In addition, it demonstrates the simplification and logical grounding of semantic concepts. By simplifying the entire structure, you make it possible for non-experts to understand and query the data: a working example of data democratization.
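To make the idea concrete, here is a minimal sketch of a question that spans two domains once both are mapped onto one shared vocabulary. Everything in it is an assumption made for illustration: the source systems, the field names (cust_id, client_ref), the predicates (raisedBy, severity) and the tiny in-memory triple set standing in for a real graph database.

```python
# Hypothetical rows from two source systems that name the same thing differently:
# the CRM says "cust", the support desk says "client".
crm_rows = [
    {"cust_id": "C-1", "cust_name": "Acme Corp"},
    {"cust_id": "C-2", "cust_name": "Globex"},
    {"cust_id": "C-3", "cust_name": "Initech"},
]
support_rows = [
    {"ticket": "T-100", "client_ref": "C-1", "severity": "high"},
    {"ticket": "T-101", "client_ref": "C-1", "severity": "low"},
    {"ticket": "T-102", "client_ref": "C-2", "severity": "high"},
    {"ticket": "T-103", "client_ref": "C-3", "severity": "low"},
]


def to_triples(crm, support):
    """Map both sources onto one shared vocabulary (all names are made up)."""
    triples = set()
    for row in crm:
        org = f"org:{row['cust_id']}"
        triples.add((org, "type", "Organization"))
        triples.add((org, "name", row["cust_name"]))
    for row in support:
        ticket = f"ticket:{row['ticket']}"
        triples.add((ticket, "type", "Ticket"))
        # "client_ref" and "cust_id" resolve to the same organization node.
        triples.add((ticket, "raisedBy", f"org:{row['client_ref']}"))
        triples.add((ticket, "severity", row["severity"]))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def orgs_with_high_severity_tickets(triples):
    """Cross-domain question: which organizations raised a high-severity ticket?"""
    names = set()
    for ticket, _, _ in match(triples, p="severity", o="high"):
        for _, _, org in match(triples, s=ticket, p="raisedBy"):
            for _, _, name in match(triples, s=org, p="name"):
                names.add(name)
    return sorted(names)


graph = to_triples(crm_rows, support_rows)
print(orgs_with_high_severity_tickets(graph))  # ['Acme Corp', 'Globex']
```

The reconciliation between "cust" and "client" happens once, in the mapping, rather than inside every query. A production proof of concept would use a real triple store and a query language such as SPARQL; the sketch only shows the shape of the argument.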

Anti-pattern #4 — Too Big

“Ok, now what?”

It is possible to develop an enterprise knowledge graph that encompasses nearly the entire organization within nine months to a year if you have a team of ontologists ready to go. The problem is that when presented with an all-encompassing enterprise knowledge graph, few technical leaders know what to do with it, or even whether they want to deploy this model at scale. The prospect of making a change of this magnitude drains the energy and enthusiasm of the people who contemplate it. You may have to swallow an elephant, but perhaps the waiter could bring just the ear to start with.

The paradox is that you must include enough of the enterprise perspective that you do not create an overall design that locks you into design (ontological) commitments that are structurally incompatible with other areas of your business. The key is to survey the rest of the enterprise at a sufficient but not exhaustive level, so that you can future-proof the design to a large degree and keep rework minimal. One good thing is that with a graph data structure even rework is less destructive than it is with relational models, which have rigid schemas and deep structural commitments.

The use of a minimalist upper-level ontology designed for this purpose can help by establishing reusable patterns that can be applied in one domain today and somewhere else a year from now with consistent results. We are biased, but Gist, the ontology from Semantic Arts, has over 100 enterprise implementations under its belt and is both simple and comprehensive in scope. It provides a set of covering concepts, or semantic primitives, that can be combined to describe your organization and its sub-domains with minimal extension. So think big, looking broadly at the enterprise, but start small with a couple of sub-domains.

Anti-pattern #5 — Data Governance

“We’ve got this covered”

Data governance initiatives are a growth industry, with good reason. Organizations know they face an overwhelming challenge to make sense of all of their data. They are also obligated to secure personally identifiable information (PII), comply with domestic and international data regulations and provide audit support. This is all before enhancing the ability to find and reuse data for the purposes of analysis and operational improvements.

So-called master data management initiatives designate certain data in the organization as “golden records”, then establish a practice around managing and proliferating the most up-to-date version of those golden records. This is seen as a solution in today’s enterprise environment. What it really is, though, is an overgrowth of technologies and practices built to correct for the shortcomings of the application-centric world we live in. We are becoming expert at solving the wrong problem and wasting effort paying the wages of integration debt. The solution is data-centric. In a data-centric world there would not be literally thousands of instances of the same social security number (a real example).
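A rough sketch of the difference, under invented assumptions: three hypothetical applications (payroll, benefits, badging), a placeholder identifier that is not a real social security number, and a small set of triples standing in for the shared graph. In the application-centric layout each application stores its own copy of the value; in the data-centric layout the value is stated once and the applications hold only a reference to the person.

```python
PLACEHOLDER_SSN = "000-00-0000"  # deliberately invalid placeholder value

# Application-centric: every application keeps its own copy of the value.
app_centric = {
    "payroll": {"emp-7": {"ssn": PLACEHOLDER_SSN}},
    "benefits": {"emp-7": {"ssn": PLACEHOLDER_SSN}},
    "badging": {"emp-7": {"ssn": PLACEHOLDER_SSN}},
}


def count_copies(apps, value):
    """Count how many application records carry their own copy of the value."""
    return sum(
        1
        for records in apps.values()
        for record in records.values()
        if record.get("ssn") == value
    )


# Data-centric: the fact is stated once in the shared graph, and each
# application refers to the person node instead of copying the value.
graph = {("person:7", "ssn", PLACEHOLDER_SSN)}
app_refs = {
    "payroll": ["person:7"],
    "benefits": ["person:7"],
    "badging": ["person:7"],
}


def count_statements(triples, value):
    """Count how many times the value is actually stated in the graph."""
    return sum(1 for _, p, o in triples if p == "ssn" and o == value)


def ssn_for(triples, person):
    """Resolve the value through the reference; returns None if absent."""
    for s, p, o in triples:
        if s == person and p == "ssn":
            return o
    return None


print(count_copies(app_centric, PLACEHOLDER_SSN))  # 3
print(count_statements(graph, PLACEHOLDER_SSN))    # 1
```

With one statement of the fact there is nothing to synchronize, which is the work a golden-record practice exists to do. Access control and the securing of PII are out of scope for this sketch.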

The anti-pattern here is to believe that because you have a master data management initiative you are solving the problem of organizing your data and making it maximally findable and usable. You are not. In some ways it makes the problem worse, by entrenching systems of record and adding layers of automation to ship updated data around at greater cost and effort.

Anti-pattern #6 — Data Hoarding

“It’s too valuable to share”

Some organizations are better than others at sharing data among internal groups. There are good reasons why data should be kept secure. We have all heard of data breaches that have cost companies a great deal of money and prestige. Data hoarding, though, is the practice of going beyond protecting data to preventing its use by others with legitimate needs. This is a pathological behavior, typically born of the perception that the data is valuable and that sharing it with others would reduce my or my team’s value to the organization. As things stand, people must come to the data hoarders and ask permission to get even summary-level data, and the hoarders like it that way.

Two lessons here. First, if you or someone you know is a hoarder, try to get help. Second, if data hoarders cannot be reasoned with, and some cannot, you should identify them early in the process and not count on them to participate. Instead, develop the system elsewhere until it has enough demonstrable value and political support to perhaps persuade them to come along. The process can be a positive one. Sometimes data hoarders can be made to see the value of harmonized data, and how combining others’ data with their own can provide value to them. Seeing this, they may come around to the value of living in harmony in the graph.

Originally posted at Medium.com