White Paper: How long should your URIs be?

This applies to URIs that a system needs to generate when it finds it needs to mint a new resource.

I’ve been thinking a lot about automated URI assignment lately. In particular, the scheme we’ve been using (relying on the database to maintain a “next available number” and incrementing it) is fraught with potential problems. However, I really don’t like the guid style either, with its large, unwieldy, and mostly unnecessarily long strings.

I did some back-of-the-envelope thinking and came up with the following recommendations. After the fact I decided to search the web to see what I could find. I found some excellent material, but not this in particular, nor anything that seemed to rule it out. Of note, Phil Archer has some excellent guidelines here: http://philarcher.org/diary/2013/uripersistence/. His scope is much broader than what I’m doing here, but it is very good. He even has “avoid auto increment” as one of his top 10 recommendations.

The points in this paper don’t apply to hand-crafted URIs (as you would typically have for your classes and properties, and even some of your hand-curated special instances). They apply to URIs that a system needs to generate when it needs to mint a new resource. A quick survey of the approaches and who uses them:

  • Hand curate all—dbpedia essentially has the author create a URI when they create a new topic.
  • Modest-sized number—dbpedia page IDs and page revision IDs look like next available number types.
  • Type+longish number—yago has URIs like yago:Horseman110185793 (class plus up to a billion numbers; not sure if there is a next available number behind this, but it kind of looks like there is).
  • Guids—cyc identifies everything with a long string like Mx4rvkS9GZwpEbGdrcN5Y29ycA.
  • Guids—Microsoft uses 128-bit guids for identifying system components, such as {21EC2020-3AEA-4069-A2DD-08002B30309D}. The random version uses 6 bits to indicate that it is the random variant, leaving a namespace of about 10^36, thought to be large enough that the probability of generating the same number is negligible.

Being a pragmatist, I wanted to figure out whether there is an optimal size and way to generate URIs.
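As a rough illustration of the trade-off, here is a minimal Python sketch that estimates the birthday-bound collision risk for randomly minted identifiers of a given length; the 36-character alphabet, the helper names, and the example base URI are all hypothetical, not taken from the paper:

    import math
    import secrets
    import string

    ALPHABET = string.ascii_lowercase + string.digits  # 36 URI-safe characters (hypothetical choice)

    def collision_probability(num_ids: int, id_length: int, alphabet_size: int = len(ALPHABET)) -> float:
        """Approximate birthday-bound chance that at least two of num_ids
        randomly minted identifiers of id_length characters collide."""
        space = alphabet_size ** id_length
        return 1.0 - math.exp(-num_ids * (num_ids - 1) / (2.0 * space))

    def mint_id(id_length: int = 8) -> str:
        """Mint one random identifier; a caller would prepend its base URI."""
        return "".join(secrets.choice(ALPHABET) for _ in range(id_length))

    # e.g. 100,000 resources with 8-character base-36 identifiers: roughly a 0.2% chance of any collision
    print(collision_probability(100_000, 8))
    print("http://example.com/id/" + mint_id())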

Download the White-paper

White Paper: Ontologies and Applications

Outlining three and a half ways that applications can have their schemas derived from enterprise ontologies.

Many people (ok, a few people) have asked us: “what is the relationship between an ontology and an application?” We usually say, “That’s an excellent question” (this is partly because it is, and partly because these ‘people’ are invariably our clients). Having avoided answering this for all this time we finally feel motivated to actually answer the question. It seems that there are three (ok three and a half) ways that ontologies are or can be related to applications. They are:

  • Inspiration
  • Transformation
  • Extension

But, I fail to digress… Let’s go back to the ‘tic tac toe’ board. We call the following a ‘tic tac toe’ board, because it looks like one:

[Figure: the ‘tic tac toe’ board, from Ontologies and Applications]

What it is attempting to convey is that there are levels of abstraction and differences in perspective that we should consider when we are modeling. An application is in the lower middle cell.

Data models are in the middle square. Ontologies could be anywhere. An ontology is a formal way of representing a model. And so we could have an ontology that describes an application, an ontology of a logical model, even ontologies of data or meta meta data.

In our opinion the most interesting ontologies are in the middle top: these are ontologies that represent concepts independent of their implementation. This is where we find upper ontologies as well as enterprise ontologies.

Now some companies have built enterprise-wide conceptual models. The IRS has one, with 30,000 attributes. But all the ones we’ve seen are not actually in the top center cell; they are logical models of quite wide scope. Ambitious and interesting, but not really conceptual models, and typically far more complex than is useful. What we’ve found (and written about in other articles (ref the Elegance article)) is that a conceptual model can cover the same ground as a logical model with a small percentage of the total number of concepts. Not only are there fewer concepts in total, there are fewer concepts that need to be accepted and agreed to.

Want to Read More? Download the White-paper.

Written by Dave McComb

White Paper: Six Axes of Decoupling

Loose coupling has been a Holy Grail for systems developers for generations.

The virtues of loose coupling have been widely lauded, yet there has been little description about what is needed to achieve loose coupling. In this paper we describe our observations from projects we’ve been involved with.

Coupling

Two systems or two parts of a single system are considered coupled if a change to one of the systems unnecessarily affects the other system. So for instance, if we upgrade the version of our database and it requires that we upgrade the operating system for every client attached to that database, then we would say those two systems or those two parts of the system are tightly coupled. Coupling is widely understood to be undesirable because of the spread of the side effects. As systems get larger and more complex, anything that causes a change in one part to affect a larger and larger footprint in the entire system is going to be expensive and destabilizing.

Loose Coupling/Decoupling

So, the converse of this is to design systems that are either “loosely coupled” or “decoupled.” Loosely coupled systems do not arise by accident. They are intentionally designed such that change can be introduced around predefined flex points. For instance, one common strategy is to define an application programming interface (API) which external users of a module or class can use. This simple technique allows the interior of the class or module or method to change without necessarily exporting a change in behavior to the users.

Loose coupling has been a Holy Grail for systems developers for generations.

The Role of the Intermediate

In virtually every system that we’ve investigated that has achieved any degree of decoupling, we’ve found an “intermediate form.” It is this intermediate form that allows the two systems or subsystems not to be directly connected to each other. As shown in Figure (1), they are connected through an intermediary. In the example described above with an API, the signature of the interface is the intermediate.
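A minimal sketch of the idea, with hypothetical names: the abstract PaymentGateway class below plays the role of the intermediate, so callers are coupled only to its signature rather than to any concrete implementation.

    from abc import ABC, abstractmethod

    class PaymentGateway(ABC):
        """The intermediate: callers depend only on this signature."""
        @abstractmethod
        def charge(self, account_id: str, amount_cents: int) -> bool: ...

    class LegacyBillingSystem(PaymentGateway):
        def charge(self, account_id: str, amount_cents: int) -> bool:
            # The internals can change, or the whole class can be swapped out,
            # without affecting anything written against PaymentGateway.
            print(f"charging {amount_cents} cents to {account_id} via legacy billing")
            return True

    def checkout(gateway: PaymentGateway, account_id: str) -> None:
        # Coupled only to the intermediate, not to any concrete implementation.
        gateway.charge(account_id, 4999)

    checkout(LegacyBillingSystem(), "acct-42")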

Click here to Download the White-paper

White Paper: Veracity

Encarta defines veracity as “the truth, accuracy or precision of something” and that seems like a pretty good place to start.

Our systems don’t model uncertainty very well, and yet that is exactly what we deal with on a day-to-day basis. This paper examines one aspect of modeling certainty, namely veracity, and begins a dialog on how to represent it.

Veracity

Encarta defines veracity as “the truth, accuracy or precision of something” and that seems like a pretty good place to start. In our case we will primarily be dealing with whether a symbolic representation of something in the real world faithfully represents the item in the real world. Primarily we are dealing with these main artifacts of systems:

  • Measurements – is the measurement recorded in the system an accurate reflection of what it was meant to measure in the real
    world?
  • Events – do the events recorded in the system accurately record what really happened?
  • Relationships – do the relationships as represented in the system accurately reflect the state of affairs in the world?
  • Categorization – are the categories that we have assigned things to useful and defensible?
  • Cause – do our implied notions of causality really bear out in the world? (This also includes predictions and hypotheses.)

Only the first has ever received systematic attention. Fuzzy numbers are a way of representing uncertainty in measurements, as are “interval math” and the uncertainty calculations used in chemistry (2.034 ± 0.005, for instance).
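A minimal sketch of the interval idea, with hypothetical names: each measurement is carried as a low/high pair, and arithmetic propagates the uncertainty instead of discarding it.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Interval:
        """A measurement carried as a [low, high] range instead of a point value."""
        low: float
        high: float

        def __add__(self, other: "Interval") -> "Interval":
            return Interval(self.low + other.low, self.high + other.high)

        def __mul__(self, other: "Interval") -> "Interval":
            corners = [a * b for a in (self.low, self.high) for b in (other.low, other.high)]
            return Interval(min(corners), max(corners))

    length = Interval(2.029, 2.039)   # 2.034 +/- 0.005
    width = Interval(0.995, 1.005)    # 1.000 +/- 0.005
    print(length * width)             # the area, with the uncertainty propagated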

But in business systems, all of these are recorded as if we are certain of them, and then as events unfold, we eventually may decide not only that we are not certain, but that we are certain of an opposite conclusion. We record an event as if it occurred, and until we have proof that it didn’t, we believe that it did.
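One simple way to begin representing this is to attach an explicit veracity status to each recorded assertion rather than treating it as settled fact. The sketch below is purely illustrative; the class and status names are hypothetical.

    from dataclasses import dataclass
    from enum import Enum

    class Veracity(Enum):
        ASSERTED = "asserted"      # recorded as if it happened
        CONFIRMED = "confirmed"    # independent evidence supports it
        RETRACTED = "retracted"    # we now believe the opposite

    @dataclass
    class RecordedEvent:
        description: str
        veracity: Veracity = Veracity.ASSERTED

    shipment = RecordedEvent("Shipment 1234 delivered on 2004-03-01")
    # Later, evidence arrives that the delivery never happened:
    shipment.veracity = Veracity.RETRACTED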

Download the White-paper

White Paper: International Conference on Service Oriented Computing

In this write up I’ll try to capture the tone of the conference, what seemed to be important and what some of the more interesting presentations were.

This was the first ever Conference on Service Oriented Computing. In some ways it was reminiscent of the first Object Oriented conference (OOPSLA in 1986): highly biased toward academic and research topics, while at the same time shining a light on the issues that are likely to face the industry over the next decade. In this write up I’ll try to capture the tone of the conference, what seemed to be important, and what some of the more interesting presentations were.

Why Trento?

Apparently, a year and a half ago several researchers in Service Oriented Computing began planning an Italian conference on Service Oriented Computing, and it kind of spun up into an international conference. Trento was an interesting, but logistically difficult, choice. Trento is in the Dolomite region of the Italian Alps, and is difficult even for Europeans to get to. It is a charming university town, founded in Roman times, with a rich history through the Middle Ages. The town is a beautiful blend of old and new and very pedestrian friendly; large cobblestone courtyards can be found every few blocks, usually adjoining a renaissance building or two. We took a side trip one hour further up the Alps to Bolzano, and saw Ötzi, “the ice man.”

This conference had some of the best after-hours arrangements of any I’ve attended: one night we got a guided tour of “the castle,” followed by a brief speech from the vice mayor and wine and dinner-sized hors d’oeuvres. The final night was a tour of Ferrari Spumante, the leading producer of Italian Champagne, with a five or six course sit-down dinner.

Attendees & Presenters

There were about 140 attendees, at least a third of whom were also presenters. All but eight were from academia, and we were among the six who were from North America. Next year’s venue will be New York City in mid-November, which should change the nature and size of the audience considerably.

Keynotes were by Peter Diry, who is in charge of a large European government research fund that is sinking billions into research topics in advanced technology. There was a great deal of interest in this, as I suspect many of the attendees’ bread was buttered either directly or indirectly by these funds. Bertrand Meyer was the pre-dinner keynote speaker the night of the formal dinner. He gave a very provocative talk on the constructs that are needed to manage distributed concurrency (we’ve managed to avoid most of this in our designs, but you could certainly see how with some designs this could be a main issue). Frank Heyman from IBM gave the final keynote, which was primarily about how all this fits into grid computing and open standards.

The 37 major presenters, and the 10 who had informal talks at a wine and cheese event, were chosen from 140 submissions. Apparently many of these people are leading lights in the research area of this discipline, although I had never heard of any of them. In addition, there were two half-day tutorials on the first day. Presentations were in English, although often highly accented English.

General Topics

It was a bit curious that the conference was “Service Oriented Computing” and not “Service Oriented Architecture” as we hear it; the name marked some subtle and interesting distinctions. This was far more about Web services than EAI or Message Oriented Middleware. The attendees were far more interested in Internet-scale problems than enterprise issues.

Some of the main themes that recurred throughout the conference were: service discovery and composition, security, P2P and grid issues, and Quality of Service issues. Everyone has pretty much accepted WSDL and BPEL4WS (which everyone just calls “bee pell”) as the de facto technologies that will be used. There was some discussion and reference to the Semantic Web technologies (RDF, DAML-S and OWL). They seemed to be pretty consistent on the difference between Orchestration and Choreography (more later).

There was a lot of talk about dynamic composition, but when you probed a bit, not as much agreement as to how far it was likely to go or when the dynamic-ness was likely to occur.

Things clarified for me

There were several things that weren’t necessarily presented in a single talk, but that became clearer to me in combination and in context. Many people may have already tripped to these observations, but for the sake of those who haven’t:

Virtualization

In much the same way that SAN and NAS virtualized storage (that is, removed the user’s specific knowledge of where the data was being stored), SOC is meant to virtualize functionality. This is really the grid angle of Service Oriented Computing. There were a few people there who noted that, unlike application servers or web servers, it will not be as easy to virtualize “stateful” services.

Service Discovery

Most of the discussion about service discovery was about design-time discovery, although there were some who felt that using the UDDI registry in an interactive mode constituted run-time discovery. There were many approaches described to aid the discovery process.

Capabilities

There was pretty widespread agreement that WSDL’s matching of signatures was not enough. Getting beyond that was called several different things, and there were several different approaches to it. One of the terms used was “capabilities”: in other words, how can we structure a spec that describes the capability of the service? This means finding a way to describe how the state of the caller and the called objects were changed, as well as noting side effects (intentional and otherwise).

Binding

Frank Heyman from IBM made the point that WSDL is really about describing the binding between “port types” (what the service is constructed to deal with) and specific ports (what it gets attached to). While the default binding is of course SOAP, he had several examples and could show that the binding was no more complex for JMS, J2EE, or even CICS Comm Region binding.

Orchestration and Choreography

The tutorial clarified, and subsequent presentations seemed to agree, that Orchestration is what you do in machine time. It is BPEL. It is a unit of composition. It is message routing, primarily synchronous. While the tools that are good for Orchestration could be used for Choreography, that’s not using each tool to its strength.

Choreography involves coordination, usually over time.  So when you have multiple organizations involved, you often have Choreography issues.  Same with having other people in the loop.  Most of what we currently think of as work flow will be subsumed into this choreography category.

Specific Talks of Note

Capabilities: Describing What Web Services Can Do – Phillipa Oaks, et al., Queensland University

This paper gets at the need to model what a service does if we are to have any hope of “discovering” services either at design time or run time. They had a meta model that expanded the signature-based description to include rules such as pre- and post-conditions, as well as effects on items not in the signature. It also allowed for location, time, and manner-of-delivery constraints.

Service Based Distributed Querying on the Grid, Alpdemir, et al., University of Manchester

I didn’t see this presentation, but after reading the paper wish I had.  They outline the issues involved with setting up distributed queries, and outline using the OGSA (Open Grid Service Architecture) and OGSI (Open Grid Services Infrastructure).  They got into how to set up an architecture for managing distributed queries, and then into issues such as setting up and optimizing query plans in a distributed environment.

Single Sign on for Service Based Computing, Kurt Geihs, et al., Berlin University of Technology

The presentation was given by Robert Kalchlosch (one of the et al.’s). One of the best values for me was a good overview of Microsoft Passport and the Liberty Alliance, especially in regard to what cooperating services need to do to work with these standards. This paper took the position that it may be more economical to leave services as they are and wrap them with a service/broker that handles the security, and especially the single sign-on aspect.

Semantic Structure Matching for Assessing Web-Service Similarity, Yiqiao Wang, et al., University of Alberta

This covered issues and problems in using semantics (RDF) in service discovery. They noted that a simple semantic match was not of much use, but that by using WordNet similarity coupled with structural similarity they were able to get high-value matching in discovery.

“Everything Personal, not Just Business”: Improving User Experience through Rule-Based Service Customization, Richard Hull, et al., Bell Labs

Richard Hull wrote one of the seminal works in Semantic Modeling, so I was hoping to meet him. Unfortunately he didn’t make it and sent a tape of his presentation instead. The context was: if people had devices that revealed their geographic location, what sort of rules would they like to set up about who they would make this information available to? One of the things that was of interest to us was their evaluation, and then dismissal, of general-purpose constraint-solving rule engines (like ILOG) for performance reasons. They had some statistics and some very impressive performance numbers on their rule evaluation.

Conclusion

The first ever Conference on Service Oriented Computing was a good one; it provided a great deal of food for thought and ideas about where this industry is headed in the medium term.

Written by Dave McComb

White Paper: How Service Oriented Architecture is Changing the Balance of Power Between Information Systems Line and Staff

As service oriented architecture (SOA) begins to become widely adopted through organizations, there will be major dislocations in the balance of power and control within IS organizations.

As service oriented architecture (SOA) begins to become widely adopted through organizations, there will be major dislocations in the balance of power and control within IS organizations. In this paper when we refer to information systems (IS) line functions, we mean those functions that are primarily aligned with the line of business systems, especially development and maintenance. When we refer to the IS staff functions, we’re referring to functions that maintain control over the shared aspects of the IS structure, such as database administration, technology implementation, networks, etc.

What is Service Oriented Architecture?

Service oriented architecture is primarily a different way to arrange the major components in an information system. There are many technologies necessary to implement an SOA, and we will touch on them briefly here; but the important distinction for most enterprises will be that the exemplar implementations of SOA will involve major changes in boundaries between systems and in how systems communicate.

In the past, when companies wished to integrate their applications, they either attempted to put multiple applications on a single database or wrote individual interfacing programs to connect one application to another.  The SOA approach says that all communication between applications will be done through a shared message bus and it will be done in messages that are not application-specific.  This definition is a bit extreme for some people, especially those who are just beginning their foray into SOA, but this is the end result for the companies who wish to enjoy the benefit that this new approach promises.

A message is an XML document or transaction that has been defined at the enterprise level and represents a unit of business functionality that can be exchanged between systems. For instance, a purchase order could be expressed as an XML document and sent between the system that originated it, such as a purchasing system, and a system that was interested in it, perhaps an inventory system.

The message bus is implemented in a set of technologies that ensure that the producers and consumers of these messages are not talking directly to each other.  The message bus mediates the communication in much the same way as the bus within a personal computer mediates communication between the various subcomponents.
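A minimal in-memory sketch of the idea (the class and message names are hypothetical; a real implementation would sit on middleware): producers publish enterprise-level messages to the bus, and consumers subscribe to them, so neither side references the other.

    from collections import defaultdict
    from typing import Callable, Dict, List

    class MessageBus:
        """Mediates all communication: producers and consumers never reference each other."""
        def __init__(self) -> None:
            self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

        def subscribe(self, message_type: str, handler: Callable[[dict], None]) -> None:
            self._subscribers[message_type].append(handler)

        def publish(self, message_type: str, payload: dict) -> None:
            for handler in self._subscribers[message_type]:
                handler(payload)

    bus = MessageBus()

    # The inventory system subscribes to the enterprise-level message, not to the purchasing system.
    bus.subscribe("PurchaseOrder", lambda po: print("inventory reserving", po["sku"]))

    # The purchasing system publishes without knowing who is listening.
    bus.publish("PurchaseOrder", {"sku": "WIDGET-7", "quantity": 12})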

The net result of these changes is that functionality can be implemented once, put on the message bus, and subsequently used by other applications. For instance, logic that was once replicated in every application (such as production of outbound correspondence, collection on receivables, workflow routing, management of security and entitlements), as well as functionality that has not existed because of a lack of a place to put it (such as enterprise-wide cross-referencing of customers and vendors), can now be implemented only once. However, in order to achieve the benefits from this type of arrangement, we are going to have to make some very fundamental changes to the way responsibilities are coordinated in the building and maintaining of systems.

Web Services and SOA

Many people have confused SOA with Web services.  This is understandable as both deal with communications between applications and services over a network using XML messages.  The difference is that Web services is a technology choice; it is a protocol for the API (application programming interface).  A service oriented architecture is not a technology but an overall way of dividing up the responsibilities between applications and having them communicate.  So, while it is possible to implement an SOA using Web services technology, this is not the only option.  Many people have used message oriented middleware, enterprise application integration technologies, and message brokers to achieve the same end.  More importantly, merely implementing Web services in a default mode will not result in a service oriented architecture.  It will result in a number of point-to-point connections between applications merely using the newest technology.

Now let’s look at the organizational dynamics that are involved in building and maintaining applications within an enterprise.

The Current Balance of Power

In most IS organizations, what has evolved over the last decade or so is a balance of power between the line organizations and the staff organizations that looks something like the following.

In the beginning, the line organizations had all the budget, all the power, and all the control. They pretty much still do. The reason they have the budget and the power is that it’s the line organization that has been employed to solve specific business problems. Each business problem brings with it a return on investment analysis which specifies what functionality is needed to solve a particular business problem. Typically, each business owner or sponsor has not been very interested in or motivated to spend any more money than needed in order to solve anyone else’s problem.

However, somewhere along the line some of the central IS staff noticed that solving similar problems over and over again, arbitrarily differently, was dis-economic to the enterprise as a whole.  Through a long series of cajoling and negotiating, they have managed to wrest some control of some of the infrastructure components of the applications from the line personnel.  Typically, the conversations went something like, “I can’t believe this project went out and bought their own database management system, paid a whole bunch of money when we already have one which would’ve worked just fine!”  And through the process, the staff groups eventually wrested at least some degree of control over such things as choice of operating systems, database management systems, middleware and, in some cases, programming languages.  They also very often had a great deal of influence or at least coordination on data models, data naming standards, and the like.  So what has evolved is a sort of happy peace where the central groups can dictate the technical environment and some of the data considerations, while the application groups are free to do pretty much as they will with the scope of their application, functionality, and interfaces to other applications.

The decentralization of these decisions leads to dis-economic behavior for much the same reason; however, it is not quite as obvious, because the corporation is not shelling out for another license for another database management system that isn’t necessary.

The New World Order

In the New World, the very things that the line function had most control of, namely the scope, functionality, and interfaces of its applications, will move into the province of the staff organization.  In order to get the economic benefit of the service oriented architecture, the main thing that has to be determined centrally for the enterprise as a whole is: what is the scope of each application and service, and what interfaces is it required to provide to others?

In most organizations, this will not go down easily. There’s a great deal of inertia and control built up over many years with the current arrangement. Senior IS management is going to have to realize that this change needs to take place and may well have to intervene at some fairly low levels. As Clayton Christensen stated in his recent book The Innovator’s Solution, the strategic direction that an enterprise or department takes doesn’t matter nearly as much as whether it can get agreement from the day-to-day decision makers who are allocating resources and setting short-term goals. For most organizations, this will require a two-pronged attack. On one hand, senior IS management, and especially the staff function management, will have to partner more closely with the business units that are sponsoring the individual projects. Part of this partnering and working together will be to educate the sponsors on the economic benefits that will accrue to the applications that adhere to the architectural guidelines. While at first this sounds like a difficult thing to convince them of, the economic benefits in most cases are quite compelling. Not only are there benefits to be had on the individual or initial project, but the real benefit for the business owner is that it can be demonstrated that this approach leads to much greater flexibility, which is ultimately what the business owner wants. This is really a governance issue, but we need to be careful not to confuse the essence of governance with the bureaucracy that it often entails.

The second prong of the two-pronged approach is to put a great deal of thought into how project managers and team leads are rewarded for “doing the right thing.” In most organizations, regardless of what is said, most rewards go to the project managers who deliver the promised functionality on time and on budget. It is up to IS management to add to these worthwhile goals equivalent goals aimed at contributing to and complying with the newer, flexible architecture, such that a project that goes off and does its own thing will be seen as a renegade, and that, regardless of hitting its short-term budgets, its project managers will not be given accolades but instead will be asked to try harder next time. Each culture, of course, has to find its own way in terms of its reward structure, but this is the essential issue to be dealt with.

Finally, and by a funny coincidence, the issues that were paramount to the central group, such as choice of operating system, database, programming language, and the like, are now very secondary considerations.  It’s quite conceivable that a given project or service will find that acquiring an appliance running on a completely different operating system and database management system can be far more cost-effective, even when you consider the overhead costs of managing the additional technologies.  This difference comes from two sources.  First, in many cases, the provider of the service will also provide all the administrative support for the service and its infrastructure, effectively negating any additional cost involved in managing the extra infrastructure.  Second, the service oriented architecture implementation technologies shield the rest of the enterprise from being aware of what technology, language, operating system, and DBMS are being used, so the decision does not have the secondary side effects that it does in pre-SOA architectures.

Conclusion

To wrap up, the move to service oriented architecture is not going to be a simple transition or one that can be accomplished by merely acquiring products and implementing a new architecture.  It is going to be accompanied by an inversion in the traditional control relationship between line and staff IS functions.

In the past, the business units and the application teams they funded determined the scope and functionality of projects, while the central IS groups determined technology and, to some extent, common data standards. In the service oriented future these responsibilities will move in the opposite direction. The scope and functionality of projects will be an enterprise-wide decision, whilst individual application teams will have more flexibility in the technologies they can economically use and the data designs they can employ.

The primary benefits of the architecture will only accrue to those who commit to a course of action where the boundaries, functionality, and interface points of their systems will no longer be delegated to the individual projects implementing them, but will be determined at a corporate level ahead of time, with only the implementation delegated to the line organization. This migration will be resisted by many of the incumbents, and the IS management that wishes to enjoy the benefits will need to prepare themselves for the investment in cultural and organizational change that will be necessary to bring it about.

White Paper: Shedding Light on the “Shared Services” Conversation

Although there are at least seven levels of granularity to “shared services,” little time has been spent to categorize these.

My observation is that although there are at least seven levels of granularity to “shared services,” little time has been spent categorizing them. Please refer to the illustration below. The degree of sharing runs the gamut from the most sharing at the top to the least at the bottom. Mostly the higher levels of sharing imply the levels below, but that’s only most of the time, not all the time.

The colors could come in handy later to help visualize sharing by function and by agency in a large matrix. An example might help–let’s say we were trying to sort out shared services in the area of the motor pool. Let’s go through each level:

The motor pool example doesn’t quite do justice to the distinction between the application front end and the application back end, which we think may end up being the significant difference.

A larger and more traditional application may showcase that difference better. Let’s take payroll. When most people talk about HR as a shared service they are talking about sharing the application (there hasn’t been much discussion about the possibility of rebadging HR employees or relocating them). So, assuming we’re just talking about the HR application, there is still an extra degree of sharing to discuss: front end or back end. Traditionally, when you implement a package like SAP, most everyone affected has to learn the new application. It has new screens, new terminology, new workflow, new exceptions, and new conventions. It requires new interfaces to existing systems in the field. This is why packaged implementations cost so much. The software isn’t very expensive. The literal installation and configuration doesn’t take all that much effort. It is the number and degree to which people, processes, and other systems are impacted that runs the price tag up. For most of the agencies we have been involved with, HRMS was a wrenching conversion. Many have still not recovered to their previous level of productivity. But at least one agency that we know of had a pretty easy go of it. This is because they had built an app they called HR Café. HR Café was the interface that everyone in the agency knew and used. HR Café implemented many of their local idiosyncrasies. Almost no one had direct access to the old payroll system. So when HRMS came up, the agency just changed HR Café’s interface so that it now interacted with HRMS, and there was very little collateral damage. The back end of HRMS was shared, and not the front end. In this case, the good result was sort of an inadvertent consequence of some other good decisions that were taken. But we think this approach can be generalized with a tremendous amount of economic benefit.
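A minimal sketch of the HR Café pattern, with hypothetical names: the local front end depends only on a stable back-end interface, so cutting over from the old payroll system to the shared HRMS touches a single adapter and leaves the screens everyone knows untouched.

    from abc import ABC, abstractmethod

    class PayrollBackend(ABC):
        """What the local front end expects from any payroll back end."""
        @abstractmethod
        def get_pay_stub(self, employee_id: str) -> dict: ...

    class LegacyPayroll(PayrollBackend):
        def get_pay_stub(self, employee_id: str) -> dict:
            return {"employee": employee_id, "source": "legacy payroll"}

    class HRMSAdapter(PayrollBackend):
        def get_pay_stub(self, employee_id: str) -> dict:
            # Translates the shared HRMS back end into the local conventions.
            return {"employee": employee_id, "source": "shared HRMS"}

    class HRCafe:
        """The screens and terminology everyone in the agency already knows."""
        def __init__(self, backend: PayrollBackend) -> None:
            self.backend = backend

        def show_pay_stub(self, employee_id: str) -> None:
            print(self.backend.get_pay_stub(employee_id))

    # Cutting over to the shared back end changes one line; the front end is untouched.
    HRCafe(LegacyPayroll()).show_pay_stub("E-1001")
    HRCafe(HRMSAdapter()).show_pay_stub("E-1001")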

Download the White-paper

Linked Data Platform

The Linked Data Platform has achieved W3C Recommendation status (which is pretty much acceptance as a standard). There are some good hints in the LDP Primer and LDP Best Practices.

This is the executive two paragraph treatment, to get you at least conversant with the topic.

Basically, LDP says that if you treat everything like a container, and use the ldp:contains relationship to point to the things in the container, then the platform can treat everything consistently. This gives us a RESTful interface onto an RDF database. You can read from it and write to it, as long as there is a way to map your ontology to Containers and ldp:contains relationships.

Say you have a bunch of inventory-related data. You could declare that there is an Inventory container, and the connection between the Inventory container and the Warehouses might be based on a hasStockkeepingLocation relationship. Each Warehouse in turn could be cast as a Container, and the contains relationship could point to the CatalogItems.
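As a rough sketch of what that interaction could look like over HTTP (the server URL, container layout, and data are hypothetical; ldp:contains, Turtle payloads, and the Slug header come from the LDP spec):

    import requests

    BASE = "http://example.com/ldp"   # hypothetical LDP server
    TURTLE = {"Accept": "text/turtle"}

    # Read the Inventory container; the response lists members via ldp:contains.
    inventory = requests.get(BASE + "/inventory/", headers=TURTLE)
    print(inventory.text)   # e.g. <.../inventory/> ldp:contains <.../inventory/warehouse-1> .

    # Create a new member by POSTing Turtle to a container; the server mints the URI.
    new_item = """@prefix ex: <http://example.com/ns#> .
    <> a ex:CatalogItem ; ex:sku "WIDGET-7" ."""
    created = requests.post(
        BASE + "/inventory/warehouse-1/",
        data=new_item,
        headers={"Content-Type": "text/turtle", "Slug": "widget-7"},
    )
    print(created.headers.get("Location"))   # URI of the newly created resource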

A promising way of getting a RESTful interface on a triple store.

What Will It Take to Build the Semantic Technology Industry?

I get asked this question a lot, and I’d like to get your help in answering it. As co-chairman of the Semantic Technology Conference, I see lots of customer organizations experimenting with and adopting semantic technologies – especially ontology-driven development projects and semantic search tools – and seemingly as many start-ups and new products emerging to address their requirements. It’s an exciting time to be in this space and I’m glad to have a part to play. But back to the question of “what will it take?” I don’t think anyone has all the answers, though it seems there’s a growing consensus about how semantics will eventually take hold:

1. A Little Semantics Goes a Long Way

I think it was Jim Hendler who first used the expression, and I find myself in stark agreement. Much of the criticism of the semantic web vision focuses on the folly of trying to boil the ocean, yet many of the successful early adopters are getting nice results by taking small incremental steps. There’s a good exchange at Dave Beckett’s blog on this point.

2. Realistic Expectations

I guess this relates to my first point, but I remain concerned about the hype and expectations that are being set around the semantic web, and now the term Web 3.0. I, as much as anyone, would love to see the semantics field explode with growth, but this market is going to be driven by customers, not vendors, and the corporate clients I see are taking a cautious approach. I think they’ll catch on eventually, but let’s not try to push them too far, too fast.

3. We Don’t Need a Killer App

Personally I think we need to look at semantic capabilities as an increasing component of the web and computing infrastructure, as opposed to trying to identify a killer app that’s going to kickstart a buying frenzy. If a killer app emerges then that’s great, but don’t hold your breath. There’s plenty of value to be gained in the meantime. More than anything, we need to demonstrate speedy, cheap ways to get started with semantics. This will be far more useful in the long run.

4. We Need to Get Business Mindshare

It’s so obvious that I’m almost embarrassed to say it, but the main point is that we need to improve how we’re currently demonstrating the business value of semantic technology. I see a few key ways we can improve, starting with a greater willingness to talk about the projects already taking place. Secondly, I think we can leverage existing technology trends – especially SOA and mashups – to show how semantic technology can add value to these efforts. Third, and I might risk offending some people with this, but in the short term we should be emphasizing cost savings and reduced time to deployment over and above the extra intelligence and functionality that semantics can provide. Especially for corporate customers. Semantic SOA can save hugely over conventional approaches in data integration and interface projects, and this is where most businesses are really feeling the pain right now. This is a short and probably incomplete list of ideas. There’s more at the Semantic Technology Conference.