Why are scanners so slow?

Last week over lunch, my 18-year-old son Eli asked me, “Why are scanners so slow? They don’t even have as much to do as a copy machine. The copy machine has to move paper and put ink on the page. The scanner only has to scan.” He was referring to flatbed desktop scanners; we have a couple at work and one at home. I’m not sure where this observation came from, but he was right.

My first reaction was to explain all the extra things the scanner is doing that a photocopier doesn’t do (allowing you to select the area you want scanned, de-skewing, scanning at different resolutions, optical character recognition, etc.). But as we talked I came to understand what’s wrong with these devices and why I don’t like them. As Eli said, they are unnecessarily slow. But they are also unnecessarily complex. Each of the ones we have has three or four buttons on the front. Then there is the user interface. You can tell (just from the panels, menus, forms and widgets) that a lot of work went into these UIs, but that doesn’t make them useful. I like to think I’m some sort of power user for most things related to PCs, but I won’t invest the time needed to master this software just to scan one or two documents per month. I’ve found some combination of buttons and sequence of choices on the multiple screens that gives me an acceptable outcome. Most of the time. God forbid someone uses it in between my occasional uses, because then invariably the settings are changed, and it takes me about as long to find the scanned document in my file system as it would to type it.

So Eli and I started designing the desktop scanner of the future. Scanner manufacturers: you may have this design free of charge (send me a prototype if you’d like). The scanner has one button: “Scan.” (We went through a few designs where you could pick resolution or color, but as you’ll see, those distinctions aren’t worth making.) This business of “warming up” is pretty lame; just have an on/off switch. When you turn it on, it should warm up.

So with the machine warm, you hit the “Scan” button and a screen pops up on the desktop with the image in full color at the highest resolution the machine is capable of. (I can hear the engineers saying, “It’s so wasteful to scan the whole flatbed if all you want is a photo or receipt,” or “It’s wasteful to scan at a higher resolution than you need.” Get over it. We’ll waste a few machine cycles to save some real time.) The desktop application is just the image with a simple menu: you can save the image anywhere you’d like in any of dozens of well-known formats. You can print it. And you can do some pretty standard image manipulation such as clipping, reducing resolution, or adjusting color. Then you could pipeline the image over to some other program (either included with the scanner or whatever you have installed; imagine “send to…” Photoshop or an OCR program or your email as an attachment). That’s it. This would be a vastly more useful product, and cheaper and easier to build and program.

This reminded me: I had the same reaction to the user interface for the digital dictation device I had. I know they spent a lot on the UI. It would have been more useful if, the moment the device was plugged into the USB port, it just appeared in the file explorer window.

You might be wondering why I’m writing this in our SOA blog. The first reason is a desire to make the world a better place. I would buy another scanner like this even though I already have three, because of the ease of use I’d get from it. But there is a broader message, one that echoes our credo at Semantic Arts: let’s start taking complexity out of our systems. Most services and most applications get worse as more “features” are added. Our prescription for software: figure out the essential raison d’être of each service, get that right, and leave out all the other crap. This will work for devices, too. Eli will thank you for it.

Location and navigation in computer systems

I’ve been working a lot lately with Semantic Web technologies. In particular I’ve been reflecting on the profound impact of basing everything on URIs. At one level it doesn’t look much different from primary keys or universal IDs or GUIDs, but at a number of levels it is quite different. I might come back to that difference, but for now I want to talk about how systems look different after you’ve been marinating in these concepts for a while and, in particular, how we rely too much on location in our systems.

Let’s start with your typical desktop operating system. Say you want to change a classpath variable so you can run a particular program (from anywhere on your computer!). For starters, the classpath variable is just a variable that tells the operating system where something is. (You still have to identify the program you want to run, but this makes its name slightly shorter.)

Imagine that your computer was (or had) a big triple store instead of the anachronistic hierarchical file folder system we’ve come to know but not really love. We would identify our program by its URI, refer to it by the URI, and not really care where it was as long as we could get to it. (This is making your desktop look a lot more like the web, rather than vice versa.) “But wait!” you say. “What if I want two different versions of the same program?” This is one of those great abuses of the file system that we’ve gotten so used to that it doesn’t even seem perverse. (We do the same thing with our documents and spreadsheets: put them over here and they mean something different, or are different versions, even though they have the same name.) If they really were two different versions, they would have two different URIs. If you want to keep track of which one is newer, etc., this would be far easier with some triples than by “knowing” which folder is more up to date.
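To make this concrete, here is a toy sketch in Python of a desktop modeled as a tiny triple store. All the URIs, predicates, and paths below are invented for illustration; the point is that “which version is newer” becomes a query over triples instead of folder lore.

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
# All URIs, predicates, and paths below are invented for illustration.
triples = {
    ("urn:app:photofix:v1", "hasVersion", "1.0"),
    ("urn:app:photofix:v2", "hasVersion", "2.0"),
    ("urn:app:photofix:v2", "supersedes", "urn:app:photofix:v1"),
    ("urn:app:photofix:v2", "storedAt", "/some/path/we/never/think/about"),
}

def newest(uri_prefix):
    """Return the version that no other version supersedes."""
    superseded = {o for (s, p, o) in triples if p == "supersedes"}
    candidates = {s for (s, p, o) in triples if s.startswith(uri_prefix)}
    return (candidates - superseded).pop()

print(newest("urn:app:photofix"))  # urn:app:photofix:v2
```

Two versions are simply two URIs, and where the bits actually live is just another triple nobody has to memorize.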
The more amusing part of this location-based identity thing is the other half of the classpath problem I started with: not what it is (a pointer to a location where you can find something else) but where it is (where is the classpath variable itself?).

“Well, that’s easy. You just click on the ‘Start’ icon, click on the Control Panel menu choice, select ‘System and Security,’ then choose ‘System,’ bringing up a new window (basic information about your computer). Over in the left-hand menu, there it is: ‘Advanced system settings.’ Another window pops up: ‘System Properties.’ Just go to the ‘Advanced’ tab, push the ‘Environment Variables’ button (another window pops up), scroll down to find the variable you want to set, click the ‘Edit’ button, and there it is!” (Why is it that all the variables you want to change are behind at least two and occasionally four layers of “advanced”?)

In the first place, I have no idea where it is. All I know is a navigation procedure through a bunch of UIs that will lead me to it. And this is one of the simpler ones. I’ve been dealing with this kind of stuff a lot lately and have come to the conclusion that UI designers are frustrated game developers. The simplest thing involves a journey through several rooms (forms), down all kinds of hallways (buttons and menus), picking up all kinds of magic feathers (click this check box in order to get the button enabled). Game developers (sorry, I mean application developers — I’m getting them more and more mixed up the more I use applications) compete on trying to make the navigation “more intuitive.” The enduring popularity of “classic” interfaces is a testament to how well this isn’t working.

Maybe we need this multi-layered, multi-window interaction to manage our way through the vast complexity. But maybe search and meaningful names would help. Maybe the UIs could be reduced to a tutorial for the first time through and then be done with. What if I had the URI of the classpath variable? I could just set it. No navigation required.

Of the several thousand settable variables (think about it: there are over a hundred just in my print driver), I regularly use maybe a dozen. How cool would it be to have a docked window with those dozen variables I change a lot, and the ability to click and change them. And a search box to find the ones I don’t use as often. Until that day, if you happen to see me wandering the halls of “Control Panel,” don’t forget to give me the sword that gives me three new lives. You never know when you might need them.
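Here is roughly what “just set it” looks like when a setting is addressable by name. The variable value and the “favorites” list are hypothetical, and, to be fair, os.environ only affects the current process; the point is the absence of navigation.

```python
import os

# Set the variable directly by name -- no trek through four layers of
# "advanced". (Only affects this process; the value is hypothetical.)
os.environ["CLASSPATH"] = "/opt/myapp/lib"

# The "docked window" of the dozen settings you actually touch could be as
# simple as a named collection you read and write directly.
favorites = {name: os.environ.get(name, "") for name in ("CLASSPATH", "PATH")}
print(favorites["CLASSPATH"])  # /opt/myapp/lib
```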

Bob DuCharme’s book: Learning SPARQL

I was hoping I wasn’t going to have to learn SPARQL 1.1 from the specs. Bob DuCharme’s book Learning SPARQL arrived just in time to save me from that fate. The book is well organized, progresses well, and has great examples. What I particularly like, and what you don’t get in the specs, are the little insights, suggestions, and gotchas that come from someone who has used this a lot. The only disappointment from my standpoint was his punting on the NOT EXISTS versus MINUS distinction: after a single example of each that returns equivalent results, he sends readers back to the spec if they are interested in the subtle difference. I am interested in the subtle difference, but I had hoped to get it explained a bit better than in the terse spec. All in all, excellent work; I would recommend it to anyone picking up SPARQL.

DIY Software Applications

CIO magazine had an article this month, “Why CIOs Still Like Do-It-Yourself Software Development,” and while the article wasn’t terribly compelling, I do think it makes what is about to become inevitable, acceptable. We’ve spent the last couple of decades convincing ourselves that application development is hard. Yes, there is plenty of evidence, and more than enough projects gone bad, to make it easy to come to that conclusion. And yes, good application software developers are hard to come by and expensive. But, in my opinion, most of the pieces are in place for a renaissance in the business. I think within the next few years we’ll see the equivalent of VisiCalc for transactional systems, except that it will be robust, scalable, and secure. Stay tuned.

The future of software: Ditch the Stack

Most software projects start with an architecture. And most architectures are “stacks” as in “this is what our stack looks like.” This is where middleware, tools, languages and the like get decided. Two interesting things happen here.

The first is the “platform wars.” Vendors of middleware and tools are very interested in which stack gets adopted, since that is what generates their revenue. As a result, they and their “ecology” (the other vendors who depend on the platform) promote the virtues of their stack over the other available options. By the way, there aren’t just one or two stacks; there are families of stacks. Java and .NET, for instance, have each engendered stacks, but there are variations due to databases or functional components such as ESRI’s ArcGIS or Primavera’s project management suite. The effect of these wars is very exclusionary: once you select J2EE, you won’t be hiring a lot of .NET programmers.

The second interesting thing is “stack dependencies.” The stack looks so neat. You pretty much accept the whole thing. One of the things that caused it to win over the other stacks was “integrated functionality.” However, it’s this integration that causes lock-in and all kinds of other problems. Maybe you’ve selected .NET and ADO.NET as part of your stack. Now you probably have MS SQL, and initially it’s not a hard dependency, but you start to rely on the integration and pretty soon you’re locked in. This tight integration within a stack actually gets in the way of integration across stacks (between your applications).

Now, I’m not suggesting you build applications without middleware, tools, or languages (that’s too much even for me to believe). What I am suggesting is that in as many places as possible, your stack should say “Don’t Care,” as in “I don’t care what you do in this part of the stack.” If you follow through with this, you’ll get flexibility at a pretty low premium.

Foxconn’s getting 1 million robots

I just saw an article reporting that Foxconn (China’s largest private employer and manufacturer of, among other things, the iPhone) has unveiled a plan to install 1 million robots in its assembly plants. (They currently have about 1 million employees.) What’s wrong with this picture? What’s wrong is that America isn’t implementing a factory with 1 million robots. Does anyone think Chinese robots will work for less than American robots? That they’ll work harder? But we’ve lost out. I don’t see how we’ll catch up at this point.

Refactoring the Law

Just read an interesting article: “Refactoring the Law: Reformulating Legal Ontologies,” by Garret Wilson. Wilson, presumably a nerd turned lawyer or vice versa, makes the case that the understanding and practice of law have been evolving in that style of punctuated equilibrium that those of us in software development call “refactoring.” That is, the law, too, creates categories and accretes case law onto those categories until the number of special cases and exceptions overwhelms things, at which point the pile gets turned over again and we have a new way of thinking about the law.

He leads off with an example of how, in the 1850s, courts had developed case law around liability and what was considered an “inherently dangerous” item. By the turn of the 1900s, the courts had categorized as “dangerous in themselves” a loaded gun, mislabeled poison, and defective hair wash, and as “not dangerous” a defective carriage and a defective boiler. In MacPherson v. Buick the court finally had to decide whether a horseless carriage was more like a locomotive (dangerous) or a carriage (not dangerous), and the whole categorization scheme got reshuffled.

He goes on, in a wide-ranging set of analogies tracing back to the Greek and Roman models of law, to draw parallels with procedural, object-oriented, and agile software development methodologies. A fun read, and educational.

Report from the Ontology Summit

Nearly 100 people in the international ontology community met this past April 18–19 at the sixth annual Ontology Summit to discuss “making the case for ontology.” In recent years the number of deployed ontologies has increased dramatically, yet the technology is still very niche and poorly understood outside the community. The goal of this year’s summit was to assist technology evangelists in communicating the message by providing concrete application examples, success/value metrics, and advocacy strategies. The key output is a communiqué and a corresponding talk. The main messages are: 1) ontology is about clarifying meaning and supporting inference; 2) the key value propositions are shared understanding, reduced complexity, flexibility, and interoperability; and 3) ontology is ready for prime time. Go forth and ontologize. Mike Uschold

Adding Women to a Group Makes the Group Smarter

There was an article in this month’s Harvard Business Review, “What Makes a Team Smarter? More Women.” The methodology of the study: the researchers measured the IQs of individuals, assigned them (sometimes randomly, sometimes not so randomly) to groups, and then had each group attempt to complete a task meant to challenge the collective wisdom of the group. In most cases, as they added more women to a group, the group’s collective intelligence went up.

(Figure: correlation of women to group performance)

Zen mind

Part 1 & Part 2

We just conducted a weeklong training session on OWL/DL and ontology engineering. Several of the participants will be attending the Semantic Technology Conference, and felt they would get a lot more out of the conference because of the training. On drilling down a bit further, we found that the main benefit in this regard was breaking down their preconceived ideas of what semantics is. They were several days into the training before they were deprogrammed enough to completely follow what was going on.

In this blog post, and perhaps the next couple, I want to summarize some of these preconceptions, along with some ideas that will at least make you aware of them and may help you get more out of the conference, or any other studying you may be doing in the area. We call this “Zen Mind,” from the Zen masters’ belief that to really learn you have to get as many of your preconceived ideas out of your head long enough to establish some new patterns. I believe the Zen masters called it “beginner’s mind” (perhaps they thought “Zen Mind” was too promotional).

In that spirit, let us offer up some preconceived ideas and the “koans” that seem to best address them.

Preconceived idea #1: Properties belong to Classes

People from a relational background make the partially correct analogy between relational attributes and semantic datatype properties, and between foreign key relationships in relational systems and object properties in semantics. However, this analogy will bite you. Repeatedly, as our students demonstrated.

They had a tough time remembering that the same property can be associated with many different classes. They were so used to each property being unique that when they did associate the same property with more than one class, they gave it different names (locatedIn became locatedInState, locatedInCountry, etc.).

We decided two koans were most useful in this case:

• Classes are really “sets.” (This helps get past the idea that classes are some sort of template, as they are in relational and object-oriented technologies, and seems to help overcome the temptation to believe that the property belongs to the class.)

• Properties own classes. (When you define a restriction class in OWL/DL, what you have really done is use a pre-existing property to create the set of instances that have “someValuesFrom” that property. It is the property that gives rise to the class, and it is therefore more useful to think of the properties owning the classes – at least compared to the classes owning the properties.)

So, if you find yourself relapsing into relational thinking, just repeat the two koans until the symptoms disappear.

Multiple Inheritance v. Multiple Classification

Koan: MI is almost always Intersection

Koan: MI makes sets smaller; it does not make capabilities larger

Preconceived Idea #2: Multiple Inheritance

If you come from an object-oriented background, in particular one that supports multiple inheritance, you might see an apparent similarity between multiple inheritance in OO and having a class be subsumed by two others. However, if you try this out, you’ll realize you’re not getting what you’re expecting, because the semantics are different. In OO there are really two things going on at the same time: subtyping and inheritance. The inheritance piece gives you properties from both of your parents. If one parent had the “foo()” method and the other parent had the “bar()” method, the child now has both. The child has all of the attributes and all of the behaviors of both parents; it is essentially the union of the behaviors of the two parents. Semantics is not dealing with behavior; it’s dealing with typing, membership, and classification.

So, take a couple of koans and call me in the morning:

Subclassing from two classes makes you the intersection, not the union, of the two – If a class A is a subclass of class B and class C, all members of A must be members of both parents. This is the intersection of the two parents, not the union. (Strictly, A is a subclass of the intersection, but we’ll take that up in another post.)

Multiply classify an instance – The power of semantics lies in the ability to classify an instance in multiple ways. This gets at what most OO people want from MI, and it’s far more flexible.