Data-Centric vs. Application-Centric

Data-Centric vs. Software Wasteland

Dave McComb’s new book “Software Wasteland: How the Application-Centric Mindset is Hobbling our Enterprises” has just been released.

In it, I make the case that the opposite of Data-Centric is Application-Centric, and that our preoccupation with Application-Centric approaches over the last several decades has caused the cost and complexity of our information systems to be at least 10 times what they should be. In most cases we’ve examined, it is 100 times what it should be.

This article is a summary of how diametrically opposed these two world views are, and how the application-centric mindset is draining our corporate coffers.

An information system is essentially data and behavior.

On the surface, you wouldn’t think it would make much difference which one you started with, since you need both and they feed off each other.  But it turns out it does make a difference.  A very substantial difference.

What does it do?

The application-centric approach starts with “what does this system need to do?” Often this is framed in terms of business process and/or workflow.  In the days before automation, information systems were workflow systems.  Humans executed tasks or procedures.  Most tasks had prerequisite data input and generated data output.  The classic “input / process / output” mantra described how work was organized.

Information in the pre-computer era was centered around “forms.”  Forms were a way to gather some prerequisite data, which could then be processed.  Sometimes the processing was calculation.  The form might be hours spent and pay rate, and the calculation might be determining gross pay.
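
The payroll example can be written out as a minimal sketch of “input / process / output.”  The class name, field names, and figures below are illustrative assumptions, not taken from any real payroll system.

```python
from dataclasses import dataclass


@dataclass
class TimesheetForm:
    """Hypothetical form: the prerequisite data for one calculation."""
    hours_worked: float
    pay_rate: float  # currency units per hour


def gross_pay(form: TimesheetForm) -> float:
    # The "process" step: form in, calculated value out.
    return form.hours_worked * form.pay_rate


form = TimesheetForm(hours_worked=40.0, pay_rate=25.0)
print(gross_pay(form))  # 1000.0
```

Note that the form exists only to serve this one calculation, which is the pattern the rest of this article traces forward.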

These forms also often were filed, and the process might be to retrieve the corresponding form, in the corresponding (paper) file folder and augment it as needed.

While this sounds like ancient history, it persists.  If you’ve been to the doctor recently, you might have noticed that despite decades of “Electronic Medical Records,” the intake is weirdly like it always has been: paper form based.

This idea that information systems are the automation of manual workflow tasks continues.  In the Financial Services industry, it is called RPA (Robotic Process Automation) despite the fact that there are no robots.  What is being automated are the myriad tasks that have evolved to keep a Financial Services firm going.

When we automate a task in this way, we buy into a couple of interesting ideas, without necessarily noticing that we have done so.  The first is that automating the task is the main thing.  The second is that the task defines how it would like to see the input and how it will organize the output.  This is why there are so many forms in companies and especially in the government.

The process automation essentially exports the problem of getting the input assembled and organized into the form the process wants.  In far too many cases this falls on the user of the system, who has to input the data yet again, despite having told the firm this information dozens of times before.

In the cases where the automation does not rely on a human to recreate the input, something almost as bad is occurring: developers are doing “systems integration” to get the data from wherever it is to the input structures and then aligning the names, codes and categories to satisfy the input requirements.
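
A rough sketch of what that integration work tends to look like follows.  Every field name, code value, and mapping here is hypothetical; the point is only that someone has to write and maintain a translation like this for each pairing of source and process.

```python
# Hypothetical record as stored by some upstream system.
source_record = {"cust_no": "12345", "gender_cd": "F", "dob": "1980-02-29"}

# Hypothetical hand-maintained mappings for one target process.
FIELD_MAP = {"cust_no": "patient_id", "gender_cd": "sex", "dob": "birth_date"}
CODE_MAP = {"sex": {"F": "female", "M": "male"}}


def to_process_input(record: dict) -> dict:
    """Rename fields and translate codes into what the target process expects."""
    result = {}
    for source_key, value in record.items():
        target_key = FIELD_MAP[source_key]
        # Translate coded values where the target uses different categories;
        # values with no code table pass through unchanged.
        value = CODE_MAP.get(target_key, {}).get(value, value)
        result[target_key] = value
    return result


print(to_process_input(source_record))
# {'patient_id': '12345', 'sex': 'female', 'birth_date': '1980-02-29'}
```

Multiply this by thousands of processes and the cost of aligning names, codes and categories becomes clear.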

Most large firms have thousands of these processes.  They have implemented thousands of application systems, each of which automates anywhere between a handful and dozens of these processes.  The “modern” equivalent of the form is the document data structure.  A document data structure is not a document in the same way that Microsoft Word creates a document. Instead, a document data structure is a particular way to organize semi-structured data.  The most popular now is JSON (JavaScript Object Notation).

A typical JSON document looks like:

{"Patient": {"id": "12345", "meds": ["2345", "3344", "9876"]}}

JSON relies on two primary structures: lists and dictionaries (the JSON specification calls them arrays and objects).  Lists are shown inside square brackets (the list following the word “meds” in the above example).  Dictionaries are key / value pairs and are inside the curly brackets.  In the above, “id” is a key and “12345” is the value, “meds” is a key and the list is the value, and “Patient” is a key and the complex structure (a dictionary that contains both simple values and lists) is the value.  These can be arbitrarily nested.
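
To make the nesting concrete, here is a short sketch that parses the same patient document, written with the straight double quotes that JSON parsers require, using Python’s standard json module.  The variable names are illustrative.

```python
import json

text = '{"Patient": {"id": "12345", "meds": ["2345", "3344", "9876"]}}'
doc = json.loads(text)

patient = doc["Patient"]                # value of the "Patient" key
print(type(patient).__name__)           # dict
print(type(patient["meds"]).__name__)   # list
print(patient["id"])                    # 12345
print(patient["meds"][0])               # 2345
```

Each level is reached by key or by position, and nothing outside the document says what “id” or “meds” means.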

Squint very closely and you will see the document data structure is our current embodiment of the form.

The important parallels are:

  • The process created the data structure to be convenient for what the process needed to do.
  • There is no mechanism here for coordinating or normalizing these keys and values.
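
As a hedged illustration of the second point, suppose two processes each produce a document about the same patient.  Both documents and all of their keys are invented for this sketch.

```python
# Two hypothetical documents describing the same patient, each shaped by
# the process that produced it.
pharmacy_doc = {"Patient": {"id": "12345", "meds": ["2345", "3344", "9876"]}}
billing_doc = {"patient_no": 12345, "prescriptions": [{"drug_code": "2345"}]}

# No top-level key is shared, so nothing signals that these overlap.
shared_keys = set(pharmacy_doc) & set(billing_doc)
print(shared_keys)  # set()

# Even the identifier disagrees: a string in one, an integer in the other.
same_id = pharmacy_doc["Patient"]["id"] == billing_doc["patient_no"]
print(same_id)  # False
```

Both documents are perfectly valid on their own; the mismatch only surfaces when someone tries to combine them.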

Process-centric is very focused on what something does.  It is all about function.

