Are there hidden problems with default values in software?
Virtually all information systems have “default values.” We put them in our systems to make things easier for the end users as well as for the system itself. As we will investigate in this white paper, default values carry many hidden problems, some of which first surface in the edge cases. But as we began to reflect on them, we realized that these problems infest all of our systems.
Default values and data capture
Most of our use of default values is tied to the capture of data. If we’re capturing data from another electronic source, we may know categorically that everything from this source gets transaction code B-17. In that case, B-17 is a default value for transactions from that source. This rarely creates a problem. The area where default values create problems is exactly the area where they create benefits: default values are very often used as an aid to data entry. If we can pre-fill the value that users would most often have entered, then we have done them a favor and sped up data entry.
The most interesting distinction is among default values that a user actually saw, and therefore could have changed; those the user might have seen; and those the user did not see. As we transition from mainframe to web-based systems, and as we adopt more portal-oriented technology for our front ends, we will often not know whether end users saw and agreed with an individual piece of default data: the user interface may have excluded it, put it on a tab they didn’t visit, or put it in a scroll region they didn’t see. We have to keep these considerations in mind because of what it implies when a user accepts a default value. Let’s take a look at some types of defaults and where they might go amiss.
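One way to keep this distinction available for later analysis is to store, next to each captured value, whether it came from a default and how visible that default was. The sketch below is only an illustration: the names `Exposure`, `CapturedValue`, and `capture` are hypothetical, and it assumes the front end can report how the field was presented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class Exposure(Enum):
    """How visible a default was to the user (hypothetical categories)."""
    SEEN = "seen"              # shown on a screen the user actually visited
    MAYBE_SEEN = "maybe_seen"  # e.g. on an unvisited tab or in a scroll region
    NOT_SEEN = "not_seen"      # excluded from the user interface entirely


@dataclass(frozen=True)
class CapturedValue:
    value: Any
    was_default: bool                    # True if the stored value was pre-filled
    exposure: Optional[Exposure] = None  # only set when was_default is True


def capture(default: Any, entered: Any, exposure: Exposure) -> CapturedValue:
    """Return the value to store together with its provenance.

    `entered` is None when the user left the field untouched.
    """
    if entered is None:
        return CapturedValue(default, was_default=True, exposure=exposure)
    # The user supplied something, even if it happens to equal the default.
    return CapturedValue(entered, was_default=False)
```

With this in place, a value the user typed is distinguishable from one the system supplied, even when the two are identical.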
One of the most useful defaults that a system can provide is the current date. This may be used as the default for when a piece of information was posted to the system, in which case the default is, almost by definition, exactly right. We can also use the current date as the default for when an event occurred. This is a useful default if events are being captured nearly contemporaneously.
The next type of default is the most commonly occurring category. Maybe our portfolio management system categorizes information assets as either applications or infrastructure. Since we have a lot more applications than infrastructure, it may make sense to default that field to applications. On average, it will save the users a fair amount of time not to have to fill that in.
The third major type of default value is the kind that comes from implicit clones. An implicit clone is a new record based on either existing instances or values that are deemed to be valid or representative. So in QuickBooks, every time we enter another credit card charge, it assumes the charge will have the same credit card company and the same date as the last one.
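The three kinds of default described above can be sketched as small functions. This is a minimal illustration under assumed names and record shapes (plain dictionaries, a list of earlier category values); it does not describe how QuickBooks or any other product actually computes its defaults.

```python
from collections import Counter
from datetime import date


def default_current_date(today=None):
    """Type 1: the current date (a fixed date can be passed in for testing)."""
    return today if today is not None else date.today()


def default_most_common(previous_values):
    """Type 2: the most frequent category so far, or None with no history."""
    if not previous_values:
        return None
    return Counter(previous_values).most_common(1)[0][0]


def default_implicit_clone(last_record, cloned_fields):
    """Type 3: copy the selected fields from the most recent record."""
    if last_record is None:
        return {}
    return {name: last_record[name]
            for name in cloned_fields if name in last_record}
```

For example, given a history of `["application", "application", "infrastructure"]`, `default_most_common` proposes `"application"`, and `default_implicit_clone` applied to the previous credit card charge with the fields `["card_company", "date"]` carries those two values forward while leaving the amount blank.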
Where the problem lies
The problem comes when we analyze the data. Did the user actually supply the data, or did they simply let the default go through? For instance, let’s say we asked about eye color in our database. And let’s say that statistically 50% of the population has brown eyes, and therefore we make that our default value. We also know that statistically 10% of the population has blue eyes. When we later survey the data in our database, we find that only 5% of our population has blue eyes and 70% has brown eyes. How much can we trust this data? Does our population really differ from the norm, or did the data entry person discover that it was easier to go with the default and that there didn’t seem to be much downside?
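To get a rough feel for the size of the effect in the eye color example, we can assume a deliberately simple model that is not part of the original scenario: some fraction f of records had the default accepted without checking, and the remaining records follow the true distribution. Under that assumption the observed default share is f + (1 - f) × (true share), and the observed share of any other category is (1 - f) × (true share). The function names below are hypothetical.

```python
def blind_acceptance_from_default(true_share, observed_share):
    """Solve observed = f + (1 - f) * true for f, using the default category."""
    return (observed_share - true_share) / (1.0 - true_share)


def blind_acceptance_from_other(true_share, observed_share):
    """Solve observed = (1 - f) * true for f, using a non-default category."""
    return 1.0 - observed_share / true_share


# Figures from the example: brown is the default (50% true, 70% observed);
# blue is a non-default category (10% true, 5% observed).
f_brown = blind_acceptance_from_default(0.50, 0.70)
f_blue = blind_acceptance_from_other(0.10, 0.05)

print(f"implied by brown eyes: {f_brown:.2f}")  # prints 0.40
print(f"implied by blue eyes:  {f_blue:.2f}")   # prints 0.50
```

The two estimates do not agree, which suggests the simple model does not fit these figures exactly. Either way, it would take something like 40% to 50% of records being accepted unexamined to produce the observed numbers, and without knowing which records those were we cannot correct for it.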
The use of default values is a double-edged sword. While defaults offer some convenience in data entry, as we build more and more interesting systems we may find that they confuse us as much as they help us if they increase our uncertainty about what the user really did.