The scope of an application is one of those topics that seems to be quite important and yet frustratingly slippery.
There are several very good reasons for its seeming importance. One, the scope of an application eventually determines its size and, therefore, the cost to develop and/or implement it. It’s also important in that it defines what a given team is responsible for and what they have chosen to leave undone or be left for others to complete. But it’s slippery because our definitions of scope sound good; we have descriptive phrases about what is in and what is out of scope.
But as a project progresses, we find initially a trickle and eventually a tidal wave of concerns that seem to be right on the border between what is inside and outside scope. Also, one of the most puzzling aspects of an application’s scope is the way a very similar scope description for one organization translates into a drastically different size of implementation project when applied to another organization. In this paper, we explore what scope really is, the relationship between it and the effort to implement a project, and how an analogy to fractal geometry will have some bearing and perhaps be an interesting metaphor for thinking about application scope issues.
What is Fractal Geometry?
Fractal geometries are based on recursive functions that describe a geometrical shape. For instance, in the following illustrations we see a simple geometric shape, a triangle. In the second figure, we see a shape that’s been defined by a function which says, “in the middle of each side of the triangle put another triangle; and in the middle of the side of that triangle put another, and yet another, etc.” We’ve all seen the beautiful, multicolored, spiral shapes that have been created with the Mandelbrot sets. These are all variations of the fractal geometry idea. There are several things of interest about fractal geometry besides their aesthetic attractiveness. The main thing to note in this example, and in most others like this, is that while the contained area of the shape grows a little bit with each iteration of the function or with each increase in resolution of the shape, the perimeter continues to grow at a constant rate. In this case, with each iteration, the perimeter gets 33% longer. You can see that on any one side which was three units long, we could remove the middle unit and replace it with two equal length units. So we now have four thirds of the perimeter on that one side; this is repeated on all sides, and is repeated at every level. If we were to continue to run the recursion, the detail would get finer and finer until the resolution would be such that we would no longer be able to see all the details of the edges. And yet, as we magnify the image we would see that in fact the detail was there. Fractal geometricians sometimes say that Britain has an infinitely long coastline. The explanation is that if you took a low resolution map of Britain and measured the coastline you would get a particular figure. But as you increase the resolution of your map, you find many crags and edges and inlets that the previous resolution had omitted. If you were to add them up, the total coastline would now be greater. And if you went to an even higher resolution, it would be greater again and supposedly ad infinitum. So as we see from this, the perimeter of a fractal shape is a matter of the resolution that we use to observe and measure it. Hold that thought.
When we talk about the size of application, we are usually concerned with some aspect of that application that could change in such a way that our effort to develop or implement it would change. For instance, a common size metric for applications has been number of lines of code. A system with one million lines of code is considered in many ways ten times as large as one with 100,000 lines of code. That is a gross simplification but let’s start there. It would take ten times as long to write one million lines of code as it would to write 100,000 lines of code.
Similarly, it would take ten times as long, on average, to modify part of a system that has one million lines of code as opposed to one with 100,000 lines of code. But lines of code was a reasonable metric only when most of our systems were written by hand, in procedural code, in languages that were at least comparable in their expressiveness.
Nowadays, a given system will be built of handwritten code plus generated code plus graphically designed components, style sheets, inherited code, libraries, etc. The general concepts of size and complexity still exist, but we are more likely to find the complexity in some other areas. One of the main areas where we find complexity in information systems is the size and variety of the user interfaces. For instance, a system with 100 different user interfaces can be thought of as one-tenth as complex as one with 1000 different user interfaces.
This relationship held with the lines of code metric in the old days because each screen was written by hand; the more screens, the more lines of code. Nowadays, although the connection between the screens and the lines of code may not be as valid because much of the code is generated, there is still ten times as much complexity in designing the user interfaces, laying them out, discussing them with users, etc. Other metrics that tend to indicate larger sized applications are APIs (application programming interfaces).
An API is a function call that the system has made public and can be used by other systems. An application API could be almost anything. It could be get_ quote (order_number) or it could be post_ ledger (tran). The size of applications that support APIs tends to be proportional to the number of APIs they have published. This is partly because each API must have code behind it to implement the behavior described, and partly because the act of creating the signature and the interface requires definition, design, negotiating with other system users, and maintaining and upgrading these interfaces.
Finally, one other metric that tends to move proportionally with the size of application systems is the size of their schema. The schema is the definition of the data that is being maintained by the application. Classically, it is the tables and columns in the database. However, with object oriented or XML style systems, the schema will have a slightly different definition.
An application for a database with 10,000 items in its schema (which might be 1,000 tables and 9,000 columns in total), all other things being equal, will very often be ten times as complex as an application with 1,000 items in its schema.. This is because each of those individual attributes is there for some reason and typically has to be moved from the database to a screen, or an API, or an algorithm, and has to be moved back when it’s been changed, etc.
Surface Area v Volume
Now, imagine an application as a shape. It might be an amorphous amoeba-like shape or it might be a very ordered and beautiful snowflake-like shape that could have been created from fractal geometry. The important thing to think about is the ratio of the surface area to the volume of the application. My contention is that most business applications have very little volume. What would constitute volume in a business application would be algorithms or rules about what is allowed and permissible. And while it may first seem that this is the preponderance of what you find when you look at a business application, the truth is far from that.
Typically, 1 or 2% of the code of a traditional business application is devoted to algorithms and/or true business rules. The vast majority is dealing with the surface area itself. To try another analogy, you may think of a business system as very much like a brain, where almost all the cognitive activity occurs on the surface and the interior is devoted to connecting regions of the surface. This goes a long way towards explaining why the surface of the brain is a convoluted and folded structure rather than a smooth organ like a liver.
Returning to our business application, let’s take a simple example of an inventory control system. Let’s say in this example that the algorithms of the application (the “volume”) have to do economic order quantity analysis, in other words, calculate how fast inventory is moving and determine, therefore, how much should be reordered and when. Now, imagine this is a very simple system. The surface area is relatively small compared to the volume and would consist of user interfaces to process receipts, to process issues, perhaps to process adjustments to the inventory, and a resultant inventory listing.
The schema, too, would be simple and, in this case, would have no API. Here is where it becomes interesting. This exact same system, implemented in a very small company with a tiny amount of inventory, would be fairly small and easy to develop and implement. Now take the same system and attempt to implement it in a large company.
The first thing you’ll find is that whatever simplistic arrangement we made for recording where the inventory was would now have to be supplemented with a more codified description, such as bin and location; for an even larger company that has automated robotic picking, we might need a very precise three-dimensional location.
To take it further, a company that stores its inventory in multiple geographic locations must keep track of not only the inventory in each of these locations but also the relative cost and time to move inventory between locations. This is a non-issue in our simplistic implementation.
So it would go with every aspect of the system. We would find in the large organization enough variation in the materials themselves that would warrant changes in the user interfaces and the schema to accommodate them, such as the differences between raw materials and finished goods, between serialized parts and non-serialized parts, and between items that do and do not have shelf life. We would find the sheer volume of the inventory would require us to invent functionality for cycle counting, and, as we got large enough, special functionality to deal with count errors, to hold picking while cycle count was occurring, etc.
Each of these additions increases the surface area, and, therefore, the size of the application. I’m not suggesting at all that this is wasteful. Indeed, each company will choose to extend the surface area of its application in exactly those areas where capital investment in computerizing their process will pay off. So what we have is the fractal shape becoming more complex in some areas and not in others, very dependent on the exact nature of the problem and the client site to which it is being implemented.
Once we see an application in this light, that is, as a fractal object with most of its size in or near its perimeter or its surface, we see why the “same” application implemented in organizations of vastly different size will result in applications of vastly different size.. We also see why a general description of the scope of an application may do a fair job of describing its volume but can miss the mark wildly when describing its surface area and its perimeter.
To date, I know of no technique that allows us to scope and estimate the surface area of an application until we’ve done sufficient design, nor to analyze return on investment to determine just how far we should go with each aspect of the design. My intention with this white paper is to make people aware of the issue and of the sensitivity of project estimates to fine details that can rapidly add up.