Introduction to FIBO Quick Start

We have just launched our “FIBO Quick Start” offering.  If you are in the financial industry you have likely heard about the Financial Industry Business Ontology (FIBO), which has been championed by the EDM Council, a consortium of virtually the entire who’s who of the financial industry. We’ve been helping with FIBO almost since its inception, and more recently Michael Uschold has been co-leading the mortgage and loan ontology development effort.  Along the way we’ve done several major projects for financial clients, and have reduced what we know to a safe and quick approach to adopting semantics in the financial sector. We have the capacity to take on one more client in the financial space, so if you’re interested, by all means contact us.

FIBO Quick Start: Developing Business Value Rapidly with Semantics

The Financial Industry Business Ontology is nearing completion. As of June 2016, nine major financial institutions have joined the early adopter program. It is reasonable to expect that in the future all Financial Industry participants will have aligned some of their systems with FIBO. Most have focused their initial projects on incorporating the FIBO vocabulary. This is a good first step and can jump start a lot of compliance work.

But the huge winners, in our opinion, will be the few institutions that see the potential and go all-in with this approach. For sixteen years, we have been working with large enterprises who are interested in adopting semantic technology. Initially, our work focused on architecture and design as firms experimented with ways to incorporate these new approaches. More recently, we have been implementing what we call the “data-centric approach” to building semantically-centered systems in an agile fashion.

Click here to read more. 

Data-Centric and Model Driven

Model Driven Development

Model Driven seems to be enjoying a bit of an upsurge lately.  Gartner has recently been hyping (is it fair to accuse the inventors of the hype curve of hyping something? or is it redundant?) what they call “low code/ no code” environments.

Perhaps they are picking up on and reporting a trend, or perhaps they are creating one.

Model Driven Development has been around for a long time. To backfill what it is and where it came from, I’m going to recount my experience with Model Driven, as I think it provides a first-person narrative for most of what was happening in the field at the time.

I first encountered what would later be called Model Driven in the early 80’s when CAD (Computer Aided Design—of buildings and manufactured parts) was making software developers jealous.  Why didn’t we have workbenches where we could generate systems from designs?  Early experiments coalesced into CASE (Computer Aided Software Engineering).  I was running a custom ERP development project in the early 80’s (on an ICL Mainframe!) and we ended up building our own CASE platform.  The interesting thing about that platform was that we built the designs on recently acquired 8-bit microcomputers, which we then pushed to a compatible framework on the mainframe.  We were able to iterate our designs on the PCs, work out the logistical issues, and get a working prototype UI to review with the users before we committed to the build.

The framework built a scaffold of code based on the prototype and indicated where the custom code needed to go.  This forever changed my perspective on how systems could and should be built.

What we built was also being built at the same time by commercial vendors (we did this project in Papua New Guinea and were pretty out of the loop as to what was happening in mainstream circles).  When we came up for air, we discovered what we had built was being called “I-CASE” (Integrated Computer Aided Software Engineering), which referred to the integration of design with development (seemed like that was the idea all along).  I assume Gartner would call this approach “low code,” as there still was application code to be written for the non-boiler-plate functionality.

Next stop on my journey through model driven was another ERP custom build.  By the late 80’s a few new trends had emerged.  One was that CAD was being invaded by parametric modeling.  Parametric modeling recognizes that many designs of physical products do not need to be redesigned by a human every time a small change is made to the input factors.  A motor mount could be designed in such a way that a change to the weight, position, and torque would drive a new design optimized for those new factors. The design of the trusses for a basketball court could be automatically redesigned if the span, weight, or snow load changed, and the design of big box retail outlets could be derived from, among other things: wind shear, maximum rainfall, and seismic potential.

The other trend was AI (remember AI?  Oh yeah, of course you remember AI, which you forgot about from the early 90’s until Watson and Google’s renaissance of AI).

Being privy to these two trends, we decided to build a parametric model of applications and have the code generation be driven by AI.  Our goal was to be able to design a use case on a post-it note. We didn’t quite achieve our goal.  Most of our designs were up to a page long.  But this was a big improvement over what was on offer at the time.  We managed to generate 97% of the code in this very sophisticated ERP system.  While it was not a very big company, I have yet to see more complex requirements in any system since (lot based inventory, multi-modal outbound logistics, a full ISO 9000 compliant laboratory information management system, in-line QA, complex real time product disposition based on physical and chemical characteristics of each lot).

In the mid 90’s we were working on systems for ambulatory health care.  We were building semantic models for our domain.  Instead of parametric modeling, we defined all application behavior in a scripting language called Tcl. One day we drew on a whiteboard where all the Tcl scripts fit in the architecture (they defined the UI, the constraint logic, the schema, etc.). It occurred to us that with the right architecture, the Tcl code, and therefore the behavior of the application, could be reduced to data.  The architecture would interpret the data and create the equivalent of application behavior.

We received what I believe to be the original patents on fully model driven application development (patent number 6,324,682).  We eventually built an architecture that would interpret the data models and build user interfaces, constraints, transactions, and even schemas.  We built several healthcare applications in this architecture and were rolling out many more when our need for capital and the collapse of the .com bubble ended this company.

I offer this up as a personal history of the “low code / no code” movement.  It is not only real, as far as we are concerned, but its value is underrepresented in the hype.

Data-Centric Architecture

More recently we have become attracted to the opportunity that lies in helping companies become data-centric.  This data-centric focus has mostly come from our work with semantics and enterprise ontology development.

What we discovered is that when an enterprise embraces the elegant core model that drives their business, all their problems become tractable.  Integration becomes a matter of conforming to the core.  New system development becomes building to a much, much simpler core model.

Most of these benefits come without embracing model driven.  There is amazing economy in reducing the size of your enterprise data model by two orders of magnitude.

Click here to read more on TDAN.com

The Evolution of the Data Centric Revolution Part Two

In the previous installment (The Data Centric Revolution: The Evolution of the Data Centric Revolution Part One), we looked at some of the early trends in application development that foreshadowed the data centric revolution, including punched cards, magnetic tape, indexed files, databases, ERP, Data Warehouses and Operational Data Stores.

In this installment, we pick up the narrative with some of the more recent developments that are paving the way for a data centric future.

Master Data Management

Somewhere along the line, someone noticed (perhaps they harkened back to the reel-to-reel days) that there are two kinds of data that are, by now, mixed together in database applications: transactional data and master data.  Master data is data about entities, such as Customers, Vendors, Equipment, Fixed Assets, or Products.  This master data is often replicated widely. For instance, every order entry system has to have yet another Customer table because of integrity constraints, if nothing else.

If you could just get all the master data in one place, you’d have made some headway.  In practice, it rarely happened. Why? In the first place, it’s pretty hard.  Most of the MDM packages are still using older, brittle technology, which makes it difficult to keep up with the many and various end-points to be connected.  Secondly, it only partially solved the problem, as each system still had to maintain a copy of the data, if for nothing else, for their data integrity constraints.  Finally, it only gave a partial solution to the use cases that justified it. For example, the 360° view of the customer was a classic justification, but people didn’t want a 360° view of the master data; they wanted to see the transaction data.  Our observation is that most companies that had the intention to implement several MDMs gave up after about 1½ years when they found out they weren’t getting the payout they expected.

Canonical Message Model

Service Oriented Architecture (SOA) was created to address the dis-economy in the system integration space.  Instead of point-to-point interfacing, you could send transactional updates onto a bus (the Enterprise Service Bus), and allow rules on the bus to distribute the updates to where they are needed.

The plumbing of SOA works great.  It’s mostly about managing messages and queues and making sure messages don’t get lost, even if part of the architecture goes down. But most companies stalled out on their SOA implementations because they had not fully addressed their data issues.  Most companies took the APIs that each of their applications “published” and then put them on the bus as messages.  This essentially required all the other end-points to understand each other.  This was point-to-point interfacing over a bus.  To be sure, it is an improvement, but not as much as was expected.

Enter the Canonical Message Model.  This is a little-known approach that generally works well where we’ve seen it applied.  The basic concept is to create an elegant [1] model of the data that is to be shared.  The trick is in the elegance.  If you can build a simple model that captures the distinctions that need to be communicated, there are tools that will help you build shared messages that are derived from the simple model.  Having a truly shared message is what gets one out of the point-to-point trap. Each application “talks” through messages to the shared model (which is only instantiated “in motion,” so the versioning problem that plagued the ODS is much easier to solve), which in turn “talks” to the receiving application.
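
To make the idea concrete, here is a minimal, hypothetical sketch, expressed in Turtle for consistency with the other examples on this page (the msg: and core: names are invented for illustration, not taken from any real canonical model): a single shared message derived from the canonical model, which every application produces and consumes instead of exposing its own API structures.

# hypothetical namespaces: msg: for derived messages, core: for the shared model
msg:OrderCreated-00123
	a msg:OrderCreatedMessage ;
	msg:aboutOrder core:Order-987 ;
	msg:orderedBy core:Customer-42 ;
	msg:orderDate "2016-06-01"^^xsd:date ;
	msg:totalAmount "129.95"^^xsd:decimal .

Because each end-point maps once to this shared shape rather than to every other end-point’s API, adding an nth system requires one new mapping instead of n-1.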

Click here to continue reading on TDAN.com

Semantic Modeling: Getting to the Core

Most large organizations have a lot of data and very little useful information. The reason is that every time they encounter a problem, they build (or more often buy) another computer application system. Each application has its own completely arbitrary data model designed for the task at hand, at that time, using whatever simplification seemed appropriate in that instance.

The net result, depending on the size of the organization, is hundreds or thousands of applications—occasionally, tens of thousands—each with its own data model. Each data model has hundreds to thousands of tables, occasionally tens of thousands (the average SAP install has 95,000 tables), and each table has dozens of columns. The upshot is trying to run your company using upwards of millions of distinct data types. For all practical purposes, this is impossible.

Most companies spend most of their (very high) IT budget on maintaining these systems (as they are very complex) or attempting to integrate them (and doing a very partial job of it).

This seems pretty bleak and makes it hard to see a way out. What will drop the scales from your eyes is when you see a model that covers all the concepts you use to run your business that has just a few hundred concepts—a few hundred concepts—with a web of relationships between them. Typically, this core is then augmented by thousands of “taxonomic” distinctions; however, these thousands of distinctions can be organized and put into their place for much better management and understanding.

Once you have this core model (or ontology, as we call it, just to be fancy), everything becomes simpler: integration, because you map the complex systems to the simple core and not to each other, and application development, because you build on a smaller footprint. And it now becomes possible to incorporate types of data previously thought un-integrate-able, such as unstructured, semi-structured, and/or social media data.
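
As a rough illustration of what “mapping to the core” can look like, here is a minimal Turtle sketch (the app1:, app2:, and core: namespaces and names are hypothetical): each application-specific structure is declared once as a specialization of a core concept, rather than being mapped pairwise to every other application.

# hypothetical namespaces: app1:/app2: are two legacy application models,
# core: is the enterprise core ontology
app1:CUST_MASTER   rdfs:subClassOf    core:Customer .
app2:ClientRecord  rdfs:subClassOf    core:Customer .
app1:CUST_NM       rdfs:subPropertyOf core:name .
app2:client_name   rdfs:subPropertyOf core:name .

With a few hundred core concepts, a query or a new application can be written against core:Customer and core:name without knowing, or caring, how many legacy structures roll up to them.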

Semantic Arts has built these types of core data models for over a dozen very large firms, in almost as many industries, and helped to leverage them for their future information systems.  We now can do this in a very predictable and short period of time.  We’d be happy to discuss the possibilities with you.

Feel free to send us a note at [email protected].

Written by Dave McComb

The Evolution of the Data-Centric Revolution Part One

We have been portraying the move to a Data-Centric paradigm as a “Revolution” because of the major mental and cultural shifts that are prerequisites to making this shift. In another sense, the shift is the result of a long, gradual process; one which would have to be characterized as “evolutionary.”

This column is going to review some of the key missing links in the evolutionary history of the movement.

(For more on the Data Centric Revolution, see The Data Centric Revolution. In the likely event that you’re not already data-centric, see The Seven Warning Signs of Appliosclerosis.)

Applications as Decks of Cards

In the 50’s and 60’s, many computer applications made very little distinction between data and programs. A program was often punched out on thin cardboard “computer cards.” The data was punched out on the same kind of cards. The two decks of cards were put in the hopper together, and voila, output came out the other end. Payroll was a classic example of applications in this era. There was a card for each employee with their Social Security Number, rate of pay, current regular hours, overtime hours, and a few other essential bits of data. The program referred to data by the “column” numbers on the card where the data was found. Often people didn’t think of the data as separate from the program, as the two were intimately connected.

Click here to view on TDAN.com

The Data-Centric Revolution: The Warning Signs

Of all the dangers that befall those on the journey to data centrism, by far the greatest is Appliosclerosis. Appliosclerosis, or as lay people know it, hardening of the silos, can strike anyone at any time, but some are more prone to it than others. By the time Appliosclerosis has metastasized it may be too late: isolated and entrenched data models may already be firmly established in various vital departments, and extreme rationalization therapy may be the only option, perhaps followed by an intense taxo regimen.

In this brief piece we will lay out the symptoms that are most associated with the condition, and steps you can take to avoid early onset.

  • Warning Sign 1: Fear of New Wheels – one of the most consistent early behavioral predictors of Appliosclerosis is Wheelophobia. This usually begins with executives making statements such as “Let’s not reinvent the wheel here.” This is an innocuous sounding bromide, after all, who wants yet another wheel? But this cliché is a Trojan Horse, and each of the Greek soldiers that come out of its belly carries the gift of yet another incompatible data model. Before you know it, the intention to avoid new wheels leaves the afflicted with a panoply of arbitrarily different and disconnected data models.
  • Warning Sign 2: The Not (Not Invented Here Syndrome) – Curiously, one of the most potent antibodies against Wheelophobia is the “Not Invented Here Syndrome” (NIHS). Those afflicted with NIHS (and it is generally believed to be hereditary) have a predisposition to custom build information systems whenever given a chance. While this does have many negative side effects, the positive side effect is that it curtails Appliosclerosis through two mechanisms. The first is starving the nascent Applio tumors of the resources they need to develop. The second is that NIHS is a very slow growing condition. Most organizations die with NIHS, not from it. The slow growth prompts some organizations to suppress the NIHS antibody with the bio-reactive NIHS complement (!NIHS).

Click here to read more on TDAN.com

What’s exciting about SHACL: RDF Data Shapes

An exciting new standard is under development at the W3C to add some much needed functionality to OWL. The main goals are to provide a concise, uniform syntax (presently called SHACL for Shapes Constraint Language) for both describing and constraining the contents of an RDF graph.  This dual purpose is what makes this such an exciting and useful technology.

RDF Data Shapes

What is an RDF Data Shape?

An RDF shape is a formal syntax for describing how data is, how data should be, or how data must be.

For example:

ex:ProductShape 
	a sh:Shape ;
	sh:scopeClass ex:Product ;
	sh:property [
		sh:predicate rdfs:label ;
		sh:dataType xsd:string;
		sh:minCount 1;
		sh:maxCount 1;
	];
	sh:property [
		sh:predicate ex:soldBy;
		sh:valueShape ex:SalesOrganizationShape ;
		sh:minCount 1;
	].

ex:SalesOrganizationShape
	a sh:Shape ;
	sh:scopeClass ex:SalesOrganization ;
	sh:property [
		sh:predicate rdfs:label ;
		sh:dataType xsd:string;
		sh:minCount 1;
		sh:maxCount 1;
	].

This can be interpreted as a description of what is (“Products have one label and are sold by at least one sales organization”), as a constraint (“Products must have exactly one label and must be sold by at least one sales organization”), or as a description of how data should be even if nonconforming data is still accepted by the system.  In the next sections I’d like to comment on a number of use cases for data shapes.

RDF Shapes as constraints

The primary use case for RDF data shapes is to constrain data coming into a system.  This is a non-trivial achievement for graph-based systems, and I think that the SHACL specification is a much better solution for achieving this than most.  Each of the SHACL atoms can, in principle, be expressed as an ASK query to evaluate the soundness of a repository.
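
For example, here is a sketch of an ASK query (reusing the ex: prefix from the shapes above) that checks whether any ex:Product violates the “exactly one rdfs:label” part of ex:ProductShape. The exact query a SHACL engine would generate will differ; the point is just that the constraint reduces to a yes/no question over the graph.

# true if at least one ex:Product violates the label constraint
ASK WHERE {
	{
		?product a ex:Product .
		FILTER NOT EXISTS { ?product rdfs:label ?label }
	}
	UNION
	{
		?product a ex:Product ;
			rdfs:label ?label1, ?label2 .
		FILTER (?label1 != ?label2)
	}
}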

RDF Shapes as a tool for describing existing data

OWL ontologies are good for describing the terms and how they can be used, but they lack a mechanism for describing what kinds of things have actually been said with those terms.  Data shapes fill this need nicely, and they can make systems integration work significantly easier than relying on diagrams or other informal tools.

Often in the course of building applications, the model is extended in ways that may be perfectly valid but otherwise undocumented.  Describing the data in RDF shapes provides a way to “pave the cow paths”, so to speak.

A benefit of this usage is that you get the advantages of being schema-less (since you may want to incorporate data even if it doesn’t conform) while still maintaining a model of how data can conform.

Another use case for this is when you are providing data to others.  In this case, you can provide a concise description of what data exists and how to put it together, which leads us to…

RDF Shapes as an outline for SELECT queries

A nice side-effect of RDF shapes that we’ve found is that once you’ve defined an object in terms of a shape, you’ve also essentially outlined how to query for it.

Given the example provided earlier, it’s easy to come up with:

SELECT ?product ?productLabel ?orgLabel WHERE {
	?product 
		a ex:Product ;
		rdfs:label ?productLabel ; 
		ex:soldBy ?salesOrg .
	?salesOrg
		a ex:SalesOrganization ;
		rdfs:label ?orgLabel .
}

None of this is made explicit by the OWL ontology—we need either something informal (e.g., diagrams and prose) or something formal (e.g., the RDF shapes) to tell us how these objects relate in ways beyond disjointness, domain/range, etc.

RDF Shapes as a mapping tool

I’ve found RDF shapes to be tremendously valuable as a tool for specifying how very different data sources map together.  For several months now we’ve been performing data conversion using R2RML.  While R2RML expresses how to map the relational DB to an RDF graph, it’s still extremely useful to have something like an RDF data shapes document to outline what data needs to be mapped.

I think there’s a lot of possibility for making these two specifications more symbiotic. For example, I could imagine combining the two (since it is all just RDF, after all) to specify in one pass what shape the data will take and how to map it from a relational database.
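
To illustrate the kind of combination I have in mind, here is a hypothetical Turtle sketch that puts an R2RML TriplesMap next to the shape its output is expected to satisfy. The table and column names are invented, and the ex:conformsToShape property is made up for illustration; it is not part of either specification.

ex:ProductMapping
	a rr:TriplesMap ;
	rr:logicalTable [ rr:tableName "PRODUCT" ] ;
	rr:subjectMap [
		rr:template "http://example.com/product/{PRODUCT_ID}" ;
		rr:class ex:Product
	] ;
	rr:predicateObjectMap [
		rr:predicate rdfs:label ;
		rr:objectMap [ rr:column "PRODUCT_NAME" ]
	] ;
	# hypothetical link from the mapping to the shape the output should satisfy
	ex:conformsToShape ex:ProductShape .

Reading the two side by side makes it easy to check that every property required by the shape has a column feeding it.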

The future – RDF Shapes as UI specification

Our medium-term goal for RDF shapes is to generate a basic UI from a shapes specification. While this obviously wouldn’t work in 100% of use cases, there are a lot of instances where a barebones form UI would be fine, at least at first.  There are actually some interesting advantages to this; for instance, validation can be declared right in the model.

For further reading, see the W3C’s SHACL Use Cases and Requirements paper.  It touches on these use cases and many others.  One very interesting use case suggested in this paper is as a tool for data interoperability for loose-knit communities of practice (say, specific academic disciplines or industries lacking data consortia).  Rather than completely go without models, these communities can adopt guidelines in the form of RDF shapes documents.  I can see this being extremely useful for researchers working in disciplines lacking a comprehensive formal model (e.g., the social sciences); one researcher could simply share a set of RDF shapes with others to achieve a baseline level of data interoperability.

Governance in a Data-Centric Environment

How a Data-Centric Environment Becomes Harder to Govern

A traditional data landscape has the advantage of being extremely silo-ed.  By taking your entire data landscape and dividing it into thousands of databases, there is the potential that each database is small enough to be manageable.

As it turns out this is more potential than actuality.  Many of the individual application data models that we look at are individually more complex than the entire enterprise model should be.  However, that doesn’t help anyone trying to govern.  It is what it is.

What is helpful about all this silo-ization is that each silo has a smaller community of interest.  When you cut through all the procedures, maturity models and the like, governance is a social problem.  Social problems, such as “agreement,” get harder the more people you get involved.

From this standpoint, the status quo has a huge advantage, and a Data-Centric firm has a big challenge: there are far more people whose agreement one needs to solicit and obtain.

The other problem that Data-Centric brings to the table is the ease of change.  Data Governance likes things that change slower than the process can manage.  Often this is a toss-up.  Most systems are hard to change and most data governance processes are slow.  They are pretty much made for each other.

I remember when we built our first model driven application environment (unfortunately we chose health care for our first vertical).  We showed how you could change the UI, API, Schema, Constraints, etc.  in real time.  This freaked our sponsors out.  They couldn’t imagine how they would manage [govern] this kind of environment.  In retrospect, they were right.  They would not have been able to manage it.

This doesn’t mean the approach isn’t valid—it means we need to spend a lot more time on the approach to governance. We have two huge things working against us: we are taking the scope from tribal silos to the entire firm and we are increasing the tempo of change.

How a Data-Centric Environment Becomes Easier to Govern

A traditional data landscape has the disadvantage of being extremely silo-ed.  You get some local governance by being silo-ed, but you have almost no hope of enterprise governance.  This is why it’s high-fives all around for local governance, while little progress is made on firm-wide governance.

One thing that data-centric provides that makes the data governance issues tractable is an incredible reduction in complexity.  Because governance is a human activity, getting down to human scales of complexity is a huge advantage.

Furthermore, to enjoy the benefits of data-centric you have to be prepared to share.  A traditional environment encourages copying of enterprise data to restructure it and adapt it to your own local needs.  Pretty much all enterprises have data on their employees.  Lots of data actually.  A large percentage of applications also have data on employees.  Some merely have “users” (most of whom are employees) and their entitlements, but many have considerably more.  Inventory systems have cycle counters, procurement systems have purchasing agents, incident systems have reporters, you get the pattern.

Each system is dealing with another copy (maybe manually re-entered, maybe from a feed) of the core employee data.  Each system has structured the local representation differently and of course named all the fields differently.  Some of this is human nature, or maybe data modeler nature, the desire to put one’s own stamp on things, but some of it is inevitable.  When you buy a package, all the fields have names.  Few, if any, of them are the names you would have chosen, or the names in your enterprise model, if you have one.

With the most mature form of data-centric, you would have one set of enterprise employee data.  You can extend it, but the un-extended parts are used just as they are.  For most developers, this idea sounds either too good to be true or too bad to be true.  Most developers are comfortable with a world they control.  This is a world of tables within their database.  They can manage referential integrity within that world.  They can predict performance within that world.  They don’t like to think about a world where they have to accept someone else’s names and structures, and agree with other groups’ decision making.

But once you overcome developer inertia on this topic and you are actually re-using data as it is, you have opened up a channel of communication that naturally leads to shared governance. Imagine a dozen departments consuming the exact same set of employee data.  Not local derivations of the HR golden record, or the LDAP files, but an actual shared data set.  They are incented to work together on the common data.  The natural thing to happen, and we have seen this in mature organizations, is that the focus shifts to the smallest, realest, most common data elements.  This social movement, and this focus on what is key and what is real, actually makes it easier to have common governance.  You aren’t trying to foist one application’s view of the world on the rest of the firm; you are trying to get the firm to understand and communicate what it cares about and what it shares.

And this creates a natural basis for governance despite the fact that the scope became considerably larger.

Click here to read more on TDAN.com

The Data-Centric Revolution

This is the first of a regular series of columns from Dave McComb. Dave’s column, The Data-Centric Revolution, will appear every quarter. Please join TDAN.com in welcoming Dave to these pages and stop by often to see what he has to say.

We are in the early stages of what we believe will be a very long and gradual transition of corporate and government information systems. As the transition gets underway, many multi-billion dollar industries will be radically disrupted. Unlike many other disruptions, the revenues currently flowing to information systems companies will not merely be reallocated to newer, more nimble players. Much of the revenue in this sector will simply evaporate as we collectively discover how large a portion of the current amount spent on IT is unnecessary.

The benefits will mostly accrue to the consumers of information systems, and those benefits will be proportional to the speed and completeness with which they embrace the change.

The Data-Centric Revolution in a Nutshell

In the data-centric enterprise, data will be a permanent shared asset and applications will come and go. When your re-ordering system no longer satisfies your changing requirements, you will bring in a new one, and let the old one go. There will be no data conversion. All analytics that worked before will continue to work. User interfaces, names of fields, and code values will be similar enough that very little training will be required.

Click here to read more on TDAN.com

Click here to read Chapter 2 of Dave’s book, “The Data-Centric Revolution”

Human Scale Software Architecture

In the physical built world there is the concept of “human scale” architecture, in other words, architecture that has been designed explicitly with the needs and constraints of humans in mind: humans that are typically between a few feet and 7 ft. tall and will only climb a few sets of stairs at a time, etc.

What’s been discovered in the physical construction of human scale architecture is that it is possible to build buildings that are more livable and more desirable to live in, which are more maintainable, can be evolved and turned to different uses over time, and need not be torn down far short of their potential useful life. We bring this concept to the world of software and software architecture because we feel that some of the great tragedies of the last 10 or 15 years have been the attempts to build and implement systems that are far beyond human scale.

Non-human scale software systems

There have been many reported instances of “runaway” projects; mega projects and projects that collapse under their own weight. The much-quoted Standish Group reports that projects over $10 million in total cost have close to a 0% chance of finishing successfully, with success defined as delivering most of the promised functions within some reasonable percentage of the original budget.

James Gosling, father of Java, recently reported that most Java projects have difficulty scaling beyond one million lines of code. Our own observations of such mega projects as the Taligent Project, the San Francisco project, and various others, find that tens of thousands or in some cases hundreds of thousands of classes in a class library are not only unwieldy for any human to comprehend and manage but are dysfunctional in and of themselves.

Where does the “scale” kick in?

What is it about systems that exceeds the reach of humans? Unlike buildings where the scale is proportional to the size of our physical frames, information systems have no such boundary or constraint. What we have are cognitive limits. George Miller famously pointed out in the mid-fifties that the human mind could only retain in its immediate short-term memory seven, plus or minus two, objects. That is a very limited range of cognitive ability to hold in one’s short-term memory. We have discovered that the short-term memory can be greatly aided by visual aids and the like (see our paper, “The Magic Number 200+/- 50”), but even then there are some definite limits in the realm of visual acuity and field of vision.

Leveling the playing field

What data modelers found a long time ago, although in practice had a difficult time disciplining themselves to implement, was that complex systems needed to be “leveled,” i.e., partitioned into levels of detail such that at each level a human could comprehend the whole. We need this for our enterprise systems now. The complexity of existing systems is vast, and in many cases there is no leveling mechanism.

The Enterprise Data Model: Not Human Scale

Take, for instance, the corporate data model. Many corporations constructed a corporate data model in the 1980s or 1990s. Very often they started with a conceptual data model, which was then transformed into a logical data model and eventually found its way to a physical data model: an actual implemented set of tables, columns, and relationships in databases. And while there may have been some leveling or abstraction in the conceptual and logical models, there is virtually none in the physical implementation. There is merely a partitioning, which has usually occurred either by the accidental selection of projects or by the accidental selection of packages to acquire and implement.

As a result, we very often have the very same concept implemented in different applications with different names, or sometimes a similar concept with different names. In any case, what is implemented or purchased very often is a large flat model consisting of thousands and usually tens of thousands of attributes. Any programmer, and many users, must understand what all or many of these attributes are, how they are used, and how they are related to each other in order to be able to safely use the system or make modifications to it. Understanding thousands or tens of thousands of attributes is at the edge of human cognitive ability, and generally is only done by a handful of people who devote themselves to it full time.

Three approaches to taming the complexity

Divide and Conquer

One of the simplest ways of reducing complexity is to break the problem down. This only works if, after you’ve made the division, you no longer need to understand the rest of the parts in detail. Merely dividing an ERP system into modules generally does not reduce the scope of the complexity that needs to be understood.

Useful Abstraction

By abstracting we can gain two benefits. First, there are fewer things to know and deal with, and second, we can concentrate on behavior and rules that apply to the abstraction. Rather than deal separately with twenty types of licenses and permits (as one of our clients was doing), it is possible to treat all of them as special cases of a single abstraction. For this to be useful two more things are needed: there must be a way to distinguish the variations without having to deal with the differences all the time, and it must be possible to deal with the abstraction without invoking all the detail.
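
A minimal sketch of what such an abstraction might look like in an ontology, using invented ex: names rather than any client’s actual model: the many license and permit types become subclasses of a single concept, and the rules that apply to all of them are stated once, at the abstraction.

ex:License        rdfs:subClassOf ex:Authorization .
ex:Permit         rdfs:subClassOf ex:Authorization .
ex:LiquorLicense  rdfs:subClassOf ex:License .
ex:BuildingPermit rdfs:subClassOf ex:Permit .

# a rule stated once for the abstraction applies to every variation:
# every authorization is issued by some issuing authority
ex:Authorization rdfs:subClassOf [
	a owl:Restriction ;
	owl:onProperty ex:issuedBy ;
	owl:someValuesFrom ex:IssuingAuthority
] .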

Just in Time Knowledge

Instead of learning everything about a data model, with proper tools we can defer our learning about part of the model until we need to. This requires an active metadata repository that can explain the parts of the model we don’t yet know in terms that we do know.
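
As a sketch of what a “just in time” lookup against such a repository might look like, assuming (hypothetically) that the model is itself stored as RDF and annotated with rdfs:comment, a developer who encounters an unfamiliar class such as the ex:SalesOrganization from the SHACL example earlier on this page could ask for its properties and their explanations on demand:

# explain ex:SalesOrganization in terms we already know
SELECT ?property ?explanation WHERE {
	?property rdfs:domain ex:SalesOrganization ;
		rdfs:comment ?explanation .
}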

Written by Dave McComb