Understanding the Graph Center of Excellence

The Knowledge Management community has gotten good at extracting and curating  knowledge. 

There is a confluence of activity – including generative AI models, digital twins and shared ledger capabilities – that is having a profound impact on enterprises. Recent research by analysts at Gartner places contextualized information and graph technologies at the center of their impact radar for emerging technologies. This recognition of the importance of these critical enablers to define, contextualize and constrain data for consistency and trust is all part of the maturity process for today’s enterprise. It is also beginning to shine a light on the emergence of the Graph Center of Excellence (CoE) as an important contributor to achieving strategic objectives.

For companies that are ready to make the leap from being application-centric to data-centric – and for companies that have successfully deployed single-purpose graphs in business silos – the CoE can become the foundation for ensuring data quality and reusability. Instead of transforming data for each new viewpoint or application, the data is stored once in a machine-readable format that retains its original context, connections and meaning and can be used for any purpose.

And now that you have demonstrated value from your initial (lighthouse) project, the  pathway to progress primarily centers on the investment in people. The goal at this stage of development is to build a scalable and resilient semantic graph as a data hub for all business-driven use cases. This is where building a Graph CoE becomes a critical asset because the journey to efficiency and enhanced capability must be guided.  

Along with the establishment of a Graph CoE, enterprises should focus on the creation of  a “use case tree” or “business capability model” to identify where the data in the graph  can be extended. This is designed to identify business priorities and must be aligned with the data from initial use cases. The objective is to create a reusable architectural framework and a roadmap to deliver incremental value and capitalize on the benefits of content reusability. Breakthrough progress comes from having dedicated resources for the design, construction and support of the foundational knowledge graph.  

The Graph CoE would most logically be an extension of the Office of Data Management and the domain of the Chief Data Officer. It is a strategic initiative that focuses on the adoption of semantic standards and the deployment of knowledge graphs across the enterprise. The goal is to establish best practices, implement governance and provide expertise in the development and use of the knowledge graph. Think of it as both the hub of graph activities within your organization and the mechanism to influence organizational culture.

Some of the key elements of the Graph CoE include: 

• Information Literacy: A Graph CoE is the best approach to ensure organizational understanding of the root causes and liabilities resulting from technology fragmentation and misalignment of data across repositories. It is the organizational advocate for new approaches to data management. The message for all senior executive stakeholders is to both understand the causes of the data dilemma and recognize that properly managed data is an achievable objective.  Information literacy and cognition about the data pathway forward is worthy of being elevated as a ‘top-of-the-house’ priority.  

• Organizational Strategy: One of the fundamental tasks of the Graph CoE is to define the overall strategy for leveraging knowledge graphs within the organization. This includes defining the underlying drivers (e.g., cost containment, process automation, flexible query, regulatory compliance, governance simplification) and prioritizing use cases (e.g., data integration, digitalization, enterprise search, lineage traceability, cybersecurity, access control). The opportunities exist when you gain trust across stakeholders that there is a path to ensure that data is true to original intent, defined at a granular level and in a format that is traceable, testable and flexible to use.

• Data Governance: The Graph CoE is responsible for establishing data policies and standards to ensure that the semantic layer is built using sound engineering principles that emphasize simplicity and reusability. When resolvable identity is combined with precise meaning, quality validation and data lineage, governance shifts away from manual reconciliation. With a knowledge graph at the foundation, organizations can create a connected inventory of what data exists, how it is classified, where it resides, who is responsible, how it is used and how it moves across systems (see the sketch following this list). This changes the governance operating model – by simplifying and automating it.

• Knowledge Graph Development: The Graph CoE should lead the development of each of the knowledge graph components. This includes working with subject matter experts to prioritize business objectives and build use case relationships.  Building data and knowledge models, data onboarding, ontology development,  source-to-target mapping, identity and meaning resolution and testing are all areas of activity to address. One of the critical components is the user experience and data extraction capabilities. Tools should be easy to use and help teams do  their job faster and better. Remember, people have an emotional connection to the way they work. Win them over with visualization. Invest in the user interface.  Let them gain hands-on experience using the graph. The goal should be to create value without really caring what is being used at the backend. 

• Cross-Functional Collaboration: The pathway to success starts with the clear and visible articulation of support by executive management. It is both essential and meaningful because it drives organizational priorities. The lynchpin, however,  involves cooperation and interaction among teams from related departments to deploy and leverage the graph capabilities most effectively. Domain experts from  technology are required to provide the building blocks for developing  applications and services that leverage the graph. Business users identify and prioritize use cases to ensure the graph addresses their evolving requirements.  Governance policies need to be aligned with insights from data stewards and compliance officers. Managing the collaboration is essential for orchestrating the  successful shift from applications-centric to data-centric across the enterprise.  
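As an illustration of the connected governance inventory described under Data Governance above, a few triples can capture what a dataset is, how it is classified, where it resides, who is responsible and where it flows. This is a minimal sketch in RDF (Turtle syntax); the namespace and property names are hypothetical placeholders, not an actual corporate or gist vocabulary.

  @prefix ex:  <https://example.com/gov/> .
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

  # One dataset, described once: what it is, how it is classified,
  # where it resides, who is responsible, and where it flows.
  ex:CustomerMaster  a  ex:Dataset ;
      ex:classifiedAs   ex:PersonallyIdentifiableInformation ;
      ex:residesIn      ex:CrmSystem ;
      ex:hasSteward     ex:JaneDoe ;
      ex:feedsInto      ex:BillingSystem ;
      ex:lastValidated  "2024-01-15"^^xsd:date .

Queries over an inventory like this are what allow the governance operating model to be simplified and automated rather than manually reconciled.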

After successfully navigating the initial stages of your project, the onward pathway to progress should focus on the development of the team of involved stakeholders. The first hurdle is to identify and expand the circle of data owners who know the location and health of the data. Much of this is about organizational dynamics and understanding who the players are, who is trusted, who is feared, who elicits cooperation and who is out to kill the activity.

This coincides with the development of an action plan and the assembly of the team of skilled practitioners needed to ensure success. Enterprises will need an experienced architect who understands the workings of semantic technologies and knowledge graphs to lead the team. The CoE will need ontologists to engineer content and manage the mapping of data. Knowledge graph engineers are needed to coordinate the meaning of data, knowledge and content models. This will also require a project manager to be an advocate for the team and the development process.  

A final note: organizations working on their AI readiness must understand that it requires being ready from the perspective of people, technology and data. The AI-ready data component means incorporating context with the data. Gartner points this out by noting that it necessitates a shift from the traditional ETL mindset to a new ECL (extract, contextualize and load) orientation. This ensures meaningful data connections. Gartner advises enterprises to leverage semantic metadata as the core for facilitating data connections.

The Graph CoE is an important step in transforming your lighthouse project or silo deployment into a true enterprise platform. A well-structured CoE should be viewed as a driver of innovation and agility within the enterprise that facilitates better data integration, improves operational efficiency, contextualizes AI and enhances the user experience. It is the catalyst for building organizational capabilities for long-term strategic advantage and one of the key steps in the digital transformation journey. 

The Enterprise Ontology 

At the time of this writing, almost no enterprises in North America have a formal enterprise ontology. Yet we believe that within a few years this will become one of the foundational pieces of most information system work within major enterprises. In this paper, we will explain just what an enterprise ontology is and, more importantly, what you can expect to use it for and what you should look for to distinguish a good ontology from a merely adequate one.

What is an ontology?  

An ontology is a “specification of a conceptualization.” This definition is a mouthful, but bear with me; it’s actually pretty useful. In general terms, an ontology is an organization of a body of knowledge or, at least, an organization of a set of terms related to a body of knowledge. However, unlike a glossary or dictionary, which takes terms and provides definitions for them, an ontology works in the other direction. An ontology starts with a concept. We first have to find a concept that is important to the enterprise; having found the concept, we need to express it as precisely as possible and in a manner that can be interpreted and used by other computer systems. One difference between a dictionary or glossary and an ontology is that dictionary definitions are not really processable by computer systems. The other difference is that by starting with the concept and specifying it as rigorously as possible, we get definitive meaning that is largely independent of language or terminology. That is what the definition means by a “specification of a conceptualization.” In addition, of course, we then attach terms to these concepts, because in order for us humans to use the ontology we need to associate the terms that we commonly use.
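As a minimal, hypothetical sketch (the IRIs are invented for illustration), here is what this looks like in RDF/OWL using Turtle syntax: the concept is specified once, with a stable identifier, and the terms people commonly use are attached to it as labels rather than the concept being derived from the terms.

  @prefix ex:   <https://example.com/ontology/> .
  @prefix owl:  <http://www.w3.org/2002/07/owl#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

  # The concept comes first and gets a stable, language-independent identifier ...
  ex:Customer  a  owl:Class ;
      # ... and the terms people use are then attached to it.
      rdfs:label    "customer" , "client" , "account holder" ;
      rdfs:comment  "A party that has agreed to purchase goods or services." .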

Why is this useful to an enterprise?  

Enterprises process great amounts of information. Some of this information is structured in databases, some of it is unstructured in documents or semi-structured in content management systems. However, almost all of it is “local knowledge,” in that its meaning is agreed upon within a relatively small, local context. Usually, that context is an individual application, which may have been purchased or built in-house.

One of the most time- and money-consuming activities that enterprise  information professionals perform is to integrate information from  disparate applications. The reason this typically costs a lot of money  and takes a lot of time is not because the information is on different  platforms or in different formats – these are very easy to  accommodate. The expense is because of subtle, semantic differences  between the applications. In some cases, the differences are simple:  the same thing is given different names in different systems. However,  in many cases, the differences are much more subtle. The customer in  one system may have an 80 or 90% overlap with the definition of a  customer in another system, but it’s the 10 or 20% where the  definition is not the same that causes most of the confusion; and there  are many, many terms that are far harder to reconcile than  “customer.” 

So the intent of the enterprise ontology is to provide a “lingua franca”  to allow, initially, all the systems within an enterprise to talk to each  other and, eventually, for the enterprise to talk to its trading partners  and the rest of the world. 

Isn’t this just a corporate data dictionary or consortia of data  standards?  

The enterprise ontology does have many similarities in scope to both a corporate data dictionary and a consortia data standard. The similarity is primarily in the scope of the effort: both of those initiatives, as well as enterprise ontologies, aim to define the shared terms that an enterprise uses. The difference is in the approach and the tools. With both a corporate data dictionary and a consortia data standard, the interpretation and use of the definitions is strictly by humans, primarily system designers. Within an enterprise ontology, the expression of the ontology is such that tools are able to interpret and make inferences on the information when the system is running.
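The difference can be made concrete with a small, hypothetical example. A data dictionary might record the sentence “every purchase order has exactly one buyer” for a human designer to read; an ontology states the same rule as an axiom (here in OWL, Turtle syntax, with invented IRIs) that a reasoner can check and act on while the system is running.

  @prefix ex:   <https://example.com/ontology/> .
  @prefix owl:  <http://www.w3.org/2002/07/owl#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

  # "Every purchase order has exactly one buyer," stated so that a reasoner
  # can validate data against it and draw inferences at runtime.
  ex:PurchaseOrder  a  owl:Class ;
      rdfs:subClassOf  [ a  owl:Restriction ;
                         owl:onProperty   ex:hasBuyer ;
                         owl:cardinality  "1"^^xsd:nonNegativeInteger ] .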

How to build an enterprise ontology  

The task of building an enterprise ontology is relatively straightforward. You would be greatly aided by purchasing a good  ontology editor, although reasonable ontology editors are available for  free. The analytical work is similar to building a conceptual enterprise data model and involves many of the same skills: the ability to form good abstractions, to elicit information from users through interviews,  as well as to find informational clues through existing documentation and data. One of the interesting differences is that as the ontology is being built it can be used in connection with data profiling to see whether the information that is currently being stored in information systems does in fact comply with the rules that the ontology would suggest. 

What to look for in an enterprise ontology  

What distinguishes a good or great enterprise ontology from a merely  adequate one are several characteristics that will mostly be exercised  later in the lifecycle of the actual use of the ontology. Of course, they  are important to consider at the time you’re building the ontology. 

Expressiveness 

The ontology needs to be expressive enough to describe all the distinctions that an enterprise makes. Most enterprises of any size have tens of thousands to hundreds of thousands of distinctions that they use in their information systems. Not only is every piece of schema in their databases a distinction, but so are many of the codes they keep in code tables, as well as decisions that are called out either in code or in procedure manuals. The sum total of all these distinctions is the operating ontology of the enterprise. However, they are not formally expressed in one place. The structure, as well as the base concepts used, needs to be rich enough that when a new concept is uncovered it can be expressed in the ontology.

Elegance 

At the same time, we need to strive for an elegant representation. It would be simple, but perhaps simplistic, to take all the distinctions in all the current systems, put them in a simple repository and call them an ontology. This misses some of the great strengths of an ontology. We want to use our ontology not only to document and describe distinctions but also to find similarities. In these days of Sarbanes-Oxley regulations, it would be incredibly helpful to know which distinctions and which parts of which schemas deal with financial commitments and “material transactions.”

Inclusion and exclusion criteria 

Essentially, the ontology is describing distinctions amongst “types.” In many cases, what we would like to know is whether a given instance is of a particular type. Let’s say it’s a record in a product table, therefore it’s of type “product.” But in another system we may have inventory, and we would like to know whether this instance is also compatible with the type that we’ve defined as inventory. In order to do this, we need a way in the ontology to describe inclusion and exclusion criteria: what clues we or another system would use when evaluating a particular instance to determine whether it was, in fact, of a particular type. For instance, if inventory were defined as physical goods held for resale, one inclusion criterion might be weight, because weight is an indicator of a physical good. Clearly, there would be many more, as well as exclusion criteria. But this gives you an idea.
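One way such criteria can be expressed, sketched here purely as a hypothetical OWL example in Turtle syntax, is as a class definition whose conditions an instance must satisfy in order to be classified as inventory.

  @prefix ex:  <https://example.com/ontology/> .
  @prefix owl: <http://www.w3.org/2002/07/owl#> .
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

  # Inventory as "physical goods held for resale," with the presence of a
  # weight serving as one machine-checkable inclusion clue.
  ex:Inventory  owl:equivalentClass  [
      a  owl:Class ;
      owl:intersectionOf ( ex:PhysicalGood
                           [ a owl:Restriction ;
                             owl:onProperty  ex:heldForPurpose ;
                             owl:hasValue    ex:Resale ]
                           [ a owl:Restriction ;
                             owl:onProperty     ex:hasWeight ;
                             owl:minCardinality "1"^^xsd:nonNegativeInteger ] ) ] .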

Cross referencing capability 

Another criterion that is very important is the ability to keep track of where the distinction was found; that is, which system currently implements and uses this particular distinction. This is very important for producing any type of where-used information because as we change our distinctions it might have side effects on other systems. 

Inferencing 

Inferencing is the ability to find or infer additional information based on the information we have. For instance, if we know that an entity is a person we can infer that the person has a birthday, whether we know it or not, and we can also infer that the person is less than 150  years old. While this sounds simple at this level, the power in an ontology is when the inference chains become long and complex and we can use the inferencing engine itself to make many of these conclusions on-the-fly. 
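A hypothetical fragment, in Turtle, shows how the first of those inferences could be licensed: the class-level axiom carries the knowledge, so a reasoner can draw the conclusion without it ever being stored with the data.

  @prefix ex:   <https://example.com/ontology/> .
  @prefix owl:  <http://www.w3.org/2002/07/owl#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

  # Class-level axiom: every person has some birth date.
  ex:Person  rdfs:subClassOf  [ a  owl:Restriction ;
                                owl:onProperty     ex:hasBirthDate ;
                                owl:minCardinality "1"^^xsd:nonNegativeInteger ] .

  # Asserted fact ...
  ex:AdaLovelace  a  ex:Person .
  # ... from which a reasoner infers that ex:AdaLovelace has a birth date,
  # even though no date value is recorded anywhere in the data.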

Foreign-language support 

As we described earlier, the ontology is a specification of a conceptualization to which we attach terms. It doesn’t take much to add foreign-language terms as well. This adds a great deal of power for developers who wish to present the same information, and the same screens, in multiple languages, as we are really just manipulating the concepts and attaching the appropriate language at runtime.
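In RDF this amounts to language-tagged labels attached to the same concept; a minimal hypothetical sketch in Turtle:

  @prefix ex:   <https://example.com/ontology/> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

  # One concept, many terms: the application picks the label that matches
  # the user's language at runtime.
  ex:Invoice  rdfs:label  "invoice"@en ,
                          "facture"@fr ,
                          "Rechnung"@de ,
                          "factura"@es .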

Some of these characteristics are aided by the existence of tools or  infrastructures, but many of them are produced by the skill of the  ontologist.

Summary  

We believe that the enterprise ontology will become a cornerstone in many information systems in the future. It will become a primary part of the systems integration infrastructure as one application will be translated into the ontology and we will very rapidly know what the corresponding schema and terms are and what transformations are needed to get to another application. It will become part of the corporate search strategy as search moves beyond mere keywords into actually searching for meaning. It will become part of business intelligence and data warehousing systems as naïve users can be led to similar terms in the warehouse repository and aid their manual search and query construction. 

Many more tools and infrastructures will become available over the  next few years that will make use of the ontology, but the prudent  information manager will not wait. He or she will recognize that there  is a fair lead time to learn and implement something like this, and any  implementation will be better than none because this particular  technology promises to greatly leverage all the rest of the system  technologies.

How US Homeland Security Plans to Use Knowledge Graph

During this summer’s Data Centric Architecture Forum, Ryan Riccucci, Division Chief for  U.S. Border Patrol – Tucson (AZ) Sector, and his colleague Eugene Yockey gave a glimpse of what the data environment is like within the US Department of Homeland Security (DHS), as well as how transforming that data environment has been evolving. 

The DHS celebrated its 20-year anniversary recently. The Federal department’s data challenges are substantial, considering the need to collect, store, retrieve and manage information associated with 500,000 border crossings, 160,000 vehicles and $8 billion in imported goods processed daily by 65,000 personnel.

Riccucci is leading an ontology development effort within the Customs and Border Protection (CBP) agency, and the Department of Homeland Security more generally, to support scalable, enterprise-wide data integration and knowledge sharing. It is significant that a Division Chief has tackled the organization’s data integration challenge. Riccucci doesn’t let leading-edge, transformational technology and fundamental data architecture change intimidate him.

Riccucci described a typical use case for the transformed, integrated data sharing  environment that DHS and its predecessor organizations have envisioned for decades. 

The CBP has various sensor nets that monitor air traffic close to or crossing the borders  between Mexico and the US, and Canada and the US. One such challenge on the Mexican border is Fentanyl smuggling into the US via drones. Fentanyl can be 50 times as powerful as morphine. Fentanyl overdoses caused 110,000 deaths in the US in 2022. 

On the border with Canada, a major concern is gun smuggling via drone from the US to Canada. Though legal in the US, Glock pistols, for instance, are illegal and in high demand in Canada. 

The challenge in either case is to intercept the smugglers retrieving the drug or weapon drops while they are in the act. Drones may only be active for seven to 15 minutes at a time, so  the opportunity window to detect and respond effectively is a narrow one. 

Field agents ideally need to see enough real-time, mapped airspace information when a sensor is activated to move quickly and directly to the location. Specifics are important; verbally relayed information, by contrast, can often be less specific, causing confusion or misunderstanding.

The CBP’s successful proof of concept involved basic Resource Description Framework (RDF) triples – semantic capabilities built around just this kind of information:

Sensor → Act of sensing → drone (SUAS, SUAV, vehicle, etc.) 
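Rendered as RDF in Turtle syntax, that pattern might look like the sketch below; the namespace, identifiers and property names are illustrative assumptions rather than CBP’s actual model.

  @prefix ex:  <https://example.com/cbp/> .
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

  # A single act of sensing links the sensor to the detected drone,
  # with the time and location a field agent would need.
  ex:sensing-42  a  ex:ActOfSensing ;
      ex:performedBy  ex:sensor-17 ;
      ex:detected     ex:drone-903 ;
      ex:atLocation   ex:gridCell-B7 ;
      ex:occurredAt   "2023-06-01T02:14:00Z"^^xsd:dateTime .

  ex:drone-903  a  ex:SmallUnmannedAircraftSystem .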

In a recent test scenario, CBP collected 17,000 records that met specified time/space requirements for a qualified drone interdiction over a 30-day period. 

The overall impression that Riccucci and Yockey conveyed was that DHS has both the budget and the commitment to tackle this and many other use cases using a transformed data-centric architecture. By capturing information in an interoperable format, the DHS has been apprehending the bad guys with greater frequency and precision.

Client 360 – A Foundational Challenge

When Lehman Brothers collapsed in 2008, CROs, CFOs and chief compliance officers were stuck poring over annual reports and frantically searching within corporate documents to determine Lehman’s actual corporate structure – including who was bankrupt, who funded whom, who guaranteed what, and who would hold the obligations when everything was finally sorted out. It took an extraordinary 14 years after the collapse to untangle it all, owing to the complexity of unwinding a globally interconnected financial institution.

This legal entity identification problem during the 2008 financial crisis proved to be a systemic  weakness that hampered the ability of regulators to understand and respond to what was  happening in financial markets. Without a standard way of identifying financial institutions and  their relationships to each other – it was near impossible to track interconnectedness, monitor  risks, aggregate exposure, or coordinate regulatory responses. 

KYC/AML 

These challenges were not limited to systemic risk. Firms struggled to maintain consistent  customer identification across their various lines of business and trading operations in order to  meet their Know Your Customer (KYC) and Anti-Money Laundering (AML) obligations. The  complexity of corporate structures makes it almost impossible to track fund flows, identify  suspicious patterns or connect subsidiaries and affiliates across jurisdictions. 

Client 360 

The evolution from corporate entity identification to individual customer identification has been  a natural progression in financial services as well as for many other industry sectors. The rise of  the “customer 360” (better termed “client 360”) approach represents the goal of creating a  complete and unified view of each customer across all business touchpoints. With  fragmentation, however, an individual might have multiple accounts across different product  lines using a host of name variations that result in missed opportunities for cross-selling or  blindness in terms of relationship management.

“Customer” is a trigger word and has been one of the top data management challenges  for companies since the beginning. The plethora of internal battles led to a simple  conclusion – stop trying to harmonize the descriptors. There is no single view of  customer. Every stakeholder’s definition is valid, just not the same. Focus on meaning,  not words – make every person and organization the company touches an “entity” and  assign to every entity a “role” (often multiple roles). Simple and elegant.
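A hypothetical Turtle sketch of that entity-and-role pattern follows; the class and property names are invented for illustration (gist defines its own vocabulary for parties and roles).

  @prefix ex: <https://example.com/client360/> .

  # One entity the firm touches in several capacities, each captured as a role.
  ex:JaneSmith  a  ex:Person ;
      ex:playsRole  ex:JaneSmith-retailCustomerRole ,
                    ex:JaneSmith-loanGuarantorRole .

  ex:JaneSmith-retailCustomerRole  a  ex:CustomerRole ;
      ex:rolePlayedFor  ex:RetailBankingDivision .

  ex:JaneSmith-loanGuarantorRole  a  ex:GuarantorRole ;
      ex:rolePlayedFor  ex:CommercialLendingDivision .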

Root of the Problem 

All three of these challenges – legal entity identification, KYC/AML and client 360 – stem from the same  root problem: the inability to create consistent identity and meaning across various systems and  databases. Each system speaks its own language and uses proprietary identifiers that become  semantically incompatible data silos. Cross-border complications magnify the problem. These  silos often number hundreds or thousands across large firms. Conventional approaches  (deduplication, centralization, cross-referencing) have proven themselves to be unreliable.  

This entity resolution challenge makes integration across sources extremely difficult and  hampers understanding the relationship between clients, products, interactions, obligations and  transactions. As a result, teams remodel the same entities in different systems. That makes it hard to reconcile. And within these divergent models, programmers use different terms for the  same concept, use the same terms for different concepts or ignore important nuances altogether, making collaboration harder. These discrepancies and broken references are hard to  detect across repositories. And while foreign keys and joins exist, they are often inconsistently  modeled and poorly documented – requiring manual reconciliation by domain experts to find  and fix data quality issues. The lack of entity (and meaning) resolution is risky, costly and totally  unnecessary. 

Semantic Standards as the Foundation 

By addressing the challenges of entity and meaning resolution, organizations can aggregate all  client data into a single, unified view. The most efficient and effective way to accomplish this is to  put the data and the model at the center of the system. This is what we advocate as data-centric architecture – leveraging semantic standards and graph technology to ensure that applications  conform to the data, not the other way around. Semantic interoperability is the key. 

In a data-centric environment, we assign a unique identifier to every data concept. This enables  firms to link data wherever it resides to one master ID – eliminating the need to continually move  and map data across the enterprise. Rather than each system having its own definition of  “customer,” “legal entity,” or “beneficial owner,” semantic standards ensure a shared  understanding of requirements between business stakeholders and application developers.  
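For example, the records that two systems hold about the same legal entity can be tied to one master identifier instead of being copied and re-mapped; a minimal sketch with invented identifiers:

  @prefix ex:  <https://example.com/entity/> .
  @prefix owl: <http://www.w3.org/2002/07/owl#> .

  # One master identifier for the legal entity ...
  ex:AcmeHoldingsLLC  a  ex:LegalEntity .

  # ... and the records each source system keeps are declared to denote it,
  # so data can stay where it is and still resolve to a single entity.
  <https://crm.example.com/account/88231>         owl:sameAs  ex:AcmeHoldingsLLC .
  <https://tradebooking.example.com/party/AX009>  owl:sameAs  ex:AcmeHoldingsLLC .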

As a result, systems can automatically understand and translate between different formats because everyone uses the same definitions for business concepts. When a new entity is created, the systems understand its place in the corporate hierarchy without additional mapping. Data and application models can be catalogued and mapped to ensure that users can find where a business concept resides. Instead of pulling data from multiple systems, the data-centric approach maintains a semantic model of each customer and their relationships. New data is automatically integrated based on semantic understanding rather than manual ELT processes. This data-centric approach becomes the foundational infrastructure for achieving Client 360.

Client 360 Maturity Cycle 

Many in the financial services industry are already moving toward this vision, with leading  institutions implementing knowledge graph solutions built on semantic standards. The  migration from solving entity identification problems to enabling Client 360 represents more  than technological evolution – it’s a fundamental shift toward semantic-first data architecture (without the rip and replace of traditional methods). Below is a three-level maturity guideline to  help you implement a common language across your organization … 

1. Maturity Level 1: Demonstration of Capability – This involves working with your SMEs to verify business requirements and build the business capability model. The goal of this maturity level is to integrate at least two of your client-related datasets into a single model based on the client 360 ontology. The team will write and execute scripts to transform the data and test it for logic and reasoning validity. The result will be a core knowledge graph that enables key stakeholders to understand the query and analytical capabilities of the data-centric approach by looking at their own data.

2. Maturity Level 2: Expanded Capability – This level focuses on harvesting additional customer datasets related to client 360. The goal is to link use cases (e.g., KYC, risk exposure analysis, CCAR, Basel III, FRTB, BCBS 239, Rule 4210, lineage traceability, cost of service, customer classification, profitability analysis, issue management) based on your internal priorities. The result will be an expanded domain ontology required to implement entity resolution and ensure conformance of the data to internal data service agreements (DSAs) and service level agreements (SLAs).

3. Maturity Level 3 – Semantic Operations and GUI – Install a licensed, production-ready  triplestore. You will rewrite RDF transformation scripts for your internal environment and  set up data transformation workflows. This includes implementing change management  approval processes, automated quality testing and entitlement controls. This should  include training in expanding analytical and reporting capabilities as well as  implementation of graphical user interfaces.

A Final Word 

Starting your data-centric journey with legal entities and individuals is strategically sound as well  as wise data policy. These “entities” represent the core actors in almost every business process.  They are the primary subjects that most other data points relate to – they drive transactions, sign  contracts, participate in supply chains and have a wide variety of relationships with your  organization. We have learned that by adopting semantic standards for these foundational  elements, you create a stable baseline upon which all other data relationships can be built. 

By virtue of their centricity, restructuring your data environment as a connected infrastructure for organizations and people delivers immediate and tangible value across most departments in terms of relationship management, reduced data duplication and enhanced regulatory compliance. Adopting data-centric standards for clients and legal entities is the first step in unraveling the critical connections that translate into better risk assessments and opportunity identification. Client 360 represents the path of least resistance – and one that delivers maximum initial impact for your organization.

Zero Copy Integration and Radical Simplification

Dave McComb’s book Software Wasteland underscored a fundamental problem: enterprise software sometimes costs 1,000 times more than it ought to. The poster child for cost overruns highlighted in the book was Healthcare.gov, the public registration system for the US Affordable Care Act, enacted in 2010. By 2018, the US Federal government had spent $2.1 billion to build and implement the system. Most of that money was wasted. The government ended up adopting many of the design principles embodied in an equivalent system called HealthSherpa, which cost $1 million to build and implement.

In an era where the data-centric architecture Semantic Arts advocates should be the  norm, application-centric architecture still predominates. But data-centric architecture doesn’t just reduce the cost of applications. It also attacks the data duplication problem attributable to  poor software design. This article explores how expensive data duplication has become, and  how data-centric, zero-copy integration can put enterprises on a course to simplification. 

Data sprawl and storage volumes  

In 2021, Seagate became the first company to ship three zettabytes’ worth of hard disks. It took them 36 years to ship the first zettabyte, six years to ship the second, and only one additional year to ship the third.

The company’s first product, the ST-506, was released in 1980. The ST-506 hard disk, when formatted, stored five megabytes (1000² bytes per megabyte). By comparison, an IBM RAMAC 305, introduced in 1956, stored five to ten megabytes. The RAMAC 305 weighed 10 US tons (the equivalent of nine metric tonnes). By contrast, the Seagate ST-506, 24 years later, weighed five US pounds (or 2.27 kilograms).

A zettabyte is the equivalent of 7.3 trillion MP3 files or 30 billion 4K movies, according to  Seagate. When considering zettabytes: 

  • 1 zettabyte equals 1,000 exabytes. 
  • 1 exabyte equals 1,000 petabytes. 
  • 1 petabyte equals 1,000 terabytes. 

IDC predicts that the world will generate 178 zettabytes of data by 2025. At that pace, “The  Yottabyte Era” would succeed The Zettabyte Era by 2030, if not earlier. 

The cost of copying  

The question becomes, how much of the data generated will be “disposable” or unnecessary data? In other words, how much data do we actually need to generate, and how much do we really need to store? Aren’t we wasting energy and other resources by storing more than we need to?

Let’s put it this way: If we didn’t have to duplicate any data whatsoever, the world would only have to generate 11 percent of the data it currently does. In 2021 terms, we’d only need  to generate 8.7 zettabytes of data, compared with the 78 zettabytes we actually generated worldwide over the course of that year. 

Moreover, Statista estimates that the ratio of unique to replicated data stored worldwide will decline to 1:10 from 1:9 by 2024. In other words, the trend is  toward more duplication, rather than less. 

The cost of storing oodles of data is substantial. Computer hardware guru Nick  Evanson, quoted by Gerry McGovern in CMSwire, estimated in 2020 that storing two  yottabytes would cost $58 trillion. If the cost per byte stored stayed constant, 40 percent of the world’s economic output would be consumed in 2035 by just storing data. 

Clearly, we should be incentivizing what graph platform Cinchy calls “zero-copy  integration”–a way of radically reducing unnecessary data duplication. The one thing we don’t  have is “zero-cost” storage. But first, let’s finish the cost story. More on the solution side and zero-copy integration later. 

The cost of training and inferencing large language models  

Model development and usage expenses are just as concerning. The cost of training  machines to learn with the help of curated datasets is one thing, but the cost of inferencing–the  use of the resulting model to make predictions using live data–is another. 

“Machine learning is on track to consume all the energy being supplied, a model that is costly, inefficient, and unsustainable,” Brian Bailey pointed out in Semiconductor Engineering in 2022. AI model training expense has increased with the size of the datasets used, but more importantly, as the number of parameters increases by a factor of four, the amount of energy consumed in the process increases by 18,000 times. Some AI models included as many as 150 billion parameters in 2022. The more recent ChatGPT LLM training involved 180 billion parameters. Training can often be a continuous activity to keep models up to date.

But the applied-model aspect of inferencing can be enormously costly. Consider the AI functions in self-driving cars, for example. Major car makers sell millions of cars a year, and each one they sell is utilizing the same carmaker’s model in a unique way. Some 70 percent of the energy consumed in self-driving car applications could be due to inference, says Godwin Maben, a scientist at electronic design automation (EDA) provider Synopsys.

Data Quality by Design  

Transfer learning is a machine learning term that refers to how machines can be taught  to generalize better. It’s a form of knowledge transfer. Semantic knowledge graphs can be a  valuable means of knowledge transfer because they describe contexts and causality well with  the help of relationships.

Well-described knowledge graphs provide the context in contextual computing.  Contextual computing, according to the US Defense Advanced Research Projects Agency  (DARPA), is essential to artificial general intelligence. 

A substantial percentage of the training set data used in large language models is more or less duplicate data, precisely because poorly described context leads to a lack of generalization ability. That is one reason the only AI we have is narrow AI, and one reason large language models are so inefficient.

But what about the storage cost problem associated with data duplication? Knowledge graphs can help with that problem also, by serving as a means for logic sharing. As Dave has  pointed out, knowledge graphs facilitate model-driven development when applications are  written to use the description or relationship logic the graph describes. Ontologies provide the logical connections that allow reuse and thereby reduce the need for duplication. 

FAIR data and Zero-Copy Integration  

How do you get others who are concerned about data duplication on board with semantics and knowledge graphs? By encouraging data and coding discipline that’s guided by FAIR principles. As Dave pointed out in a December 2022 blog post, semantic graphs and FAIR principles go hand in hand: https://www.semanticarts.com/the-data-centric-revolution-detour-shortcut-to-fair/

Adhering to the FAIR principles, formulated by a group of scientists in 2016, promotes  reusability by “enhancing the ability of machines to automatically find and use the data, in  addition to supporting its reuse by individuals.” When it comes to data, FAIR stands for Findable, Accessible, Interoperable, and Reusable. FAIR data is easily found, easily shared,  easily reused quality data, in other words. 

FAIR data implies the data quality needed to do zero-copy integration. 

Bottom line: When companies move to contextual computing by using knowledge  graphs to create FAIR data and do model-driven development, it’s a win-win. More reusable  data and logic means less duplication, less energy, less labor waste, and lower cost. The term  “zero-copy integration” underscores those benefits.

Data-Centric Credentialling

In order to ensure that clients can get what they expect when they buy software or services that purport to be “data-centric” we are going to implement a credentialling program. The program will be available at three levels. 

Implementation Awards 

These are assessments and awards given to clients for projects or enterprises to recognize the milestones on their journey to becoming completely data-centric.  

It is a long journey. There is great benefit along the way, and these awards are meant to recognize progress on that journey.

Software Certification 

The second area is certifying that software meets the goals of the data-centric approach. There will be two major categories:

  • Middleware – databases, messaging systems, and non-application-specific tools that might be used in a data-centric implementation will be evaluated on their consistency with the approach.
  • Applications – as described in the book “Real Time Financial Accounting, the Data Centric Way,” we expect vertical industry applications to be far easier to make consistent with the data-centric approach. Horizontal applications will be evaluated based on their ease of being truly integrated with the rest of a data-centric enterprise. Adhering to open models and avoiding proprietary structures will also improve the rating in this area.

Professional Services  

There will be two levels of professional services credentialling, one based on what you know and the other on what you’ve done.  

The “what you know” credential will be based on studying and testing, akin to the certifications of the Project Management Institute or the Data Management DMBOK.

The “what you’ve done” recognizes that a great deal of the ability to deliver these types of projects is based on field experience. 

Amgen: Data Centric Architecture

Amgen is a large biotechnology company committed to unlocking the potential of biology for patients suffering from serious illnesses by discovering, developing, manufacturing and delivering innovative human therapeutics. Amgen CEO Bob Bradway focuses on innovation to set the cultural direction. According to Bradway: “Push the boundaries of biotechnology and knowledge to be part of the process of changing the practice of medicine.”

Amgen’s goal is to provide life-changing value to patients with expediency.  Democratized access to enterprise data speeds the process from drug discovery to drug delivery. One element Amgen’s strategic data leadership agreed upon is that a common language expedites product development by removing ambiguities that slow business processes.  

Data capture comes from a multitude of information systems, each using their own data model and unique vocabularies. Different systems use different terminology to refer to the same concept. An organization steeped in data silos no longer works. The challenge is to provide a common intuitive model for all systems and people to use. Once such a model is in place, it is no longer laborious and expensive for enterprise consumers to benefit from the data. A decision to establish a semantic layer for building an enterprise data fabric emerged.  

Amgen developed a vision of a Data-Centric Architecture (DCA) that transforms data from being system-specific to being universally available. Data is organized and unambiguously represented in data domains within a Semantic layer. 

Broadridge: Legacy Understanding

This firm processes some 70% of all the back-office activity on Wall Street. They have three major systems for different types of financial instruments and jurisdictions. The three systems are barely integrated. Bringing a client up on any of their systems is a multiyear endeavor. Getting combined reporting from these three systems is nearly impossible.

They have embarked on a major initiative to create a path toward integration, initially integrating the systems they have, ultimately delivering a fully integrated system. 

One of the barriers is the complexity of the existing systems. They are massively complex and built on completely different architectures. 

One of the systems was designed with an extremely table-driven design. While this made it  very flexible, it also created performance problems as well as understanding problems.  There are only two people in the world who understand all the intricacies of the system. 

We built an ontology of the functions that the system covered. We then took all the metadata in the tables that drive the processing and loaded them into a triple store. We  constructed a series of SPARQL queries that allowed relatively new personnel to pose and  get answers to complex questions regarding how the existing system works. This has become a key input into their project to understand and integrate their systems to create legacy understanding.
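As a hypothetical illustration of the approach (not the actual Broadridge model), a row from one of the driving tables might be lifted into the triple store along these lines, after which questions about how the system behaves become graph queries:

  @prefix ex: <https://example.com/legacy/> .

  # Metadata lifted from a configuration table that drives the processing.
  ex:rule-1041  a  ex:ProcessingRule ;
      ex:configuredInTable    ex:TBL_EVENT_RULES ;
      ex:appliesToInstrument  ex:CorporateBond ;
      ex:triggeredByEvent     ex:CouponPayment ;
      ex:implementsFunction   ex:IncomeAccrual .

  # A question such as "which rules fire on coupon payments?" then becomes
  # a straightforward SPARQL query over this metadata.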

Contact Us: 

Overcome integration debt with proven semantic solutions. 

Contact Semantic Arts, the experts in data-centric transformation, today! 

CONTACT US HERE 

Address: Semantic Arts, Inc. 

123 N College Avenue Suite 218 

Fort Collins, CO 80524 

Email: [email protected] 

Phone: (970) 490-2224

Teacher Retirement System: Enterprise Architecture

The Teacher Retirement System is one of the largest pension funds in the country, with 1 million active teachers and 250,000 retirees. They run the organization on a series of aging mainframe systems. In the late 1990s, they attempted a major upgrade to their technology, but finally had to abandon that path. Since then, they have mostly been front-ending their existing systems with newer proxy systems to deal with web-based clients and the like.

We worked with them to design a future enterprise architecture, featuring SOA and ontology-driven messages. They have begun work on some of the early projects in the plan.


gistBFO: An Open-Source, BFO Compatible Version of gist

Dylan Abney (a), Katherine Studzinski (a), Giacomo De Colle (b, c), Finn Wilson (b, c), Federico Donato (b, c), and John Beverley (b, c)

(a) Semantic Arts, Inc.
(b) University at Buffalo
(c) National Center for Ontological Research

ORCiD iDs: Dylan Abney https://orcid.org/0009-0005-4832-2900, Katherine Studzinski https://orcid.org/0009-0001-3933-0643, Giacomo De Colle https://orcid.org/0000-0002-3600-6506, Finn Wilson https://orcid.org/0009-0002-7282-0836, Federico Donato https://orcid.org/0009-0001-6600-240X, John Beverley https://orcid.org/0000-0002-1118-1738

Abstract. gist is an open-source, business-focused ontology actively developed by Semantic Arts. Its lightweight design and use of everyday terminology have made it a useful tool for kickstarting domain ontology development in a range of areas including finance, government, and pharmaceuticals. The Basic Formal Ontology (BFO) is an ISO/IEC standard upper ontology that has similarly found practical application across a variety of domains, especially biomedicine and defense. Given its demonstrated utility, BFO was recently adopted as a baseline standard in the U.S. Department of Defense and Intelligence Community.

 Because BFO sits at a higher level of abstraction than gist, we see an opportunity  to align gist with BFO and get the benefits of both: one can kickstart domain  ontology development with gist, all the while maintaining an alignment with the  BFO standard. This paper presents such an alignment, which consists primarily of subclass relations from gist classes to BFO classes and includes some subproperty axioms. The union of gist, BFO, and this alignment is what we call “gistBFO.” The upshot is that one can model instance data using gist and then instances of gist classes can be mapped to BFO. This not only achieves compliance with the BFO  standard; it also enables interoperability with other domains already modelled using  BFO. We describe a methodology for aligning gist and BFO, provide rationale for decisions we made about mappings, and detail a vision for future development. 

Keywords. Ontology, upper ontology, ontology alignment, gist, BFO 

1. Introduction 

In this paper, we present an alignment between two upper ontologies: gist and the Basic  Formal Ontology (BFO). While both are upper ontologies, gist and BFO exhibit rather different formal structures. An alignment between these ontologies allows users to get  the benefits of both. 

An ontology is a representational artifact which includes a hierarchy of classes of entities and logical relations between them [1, p.1]. Ontologies are increasingly being  used to integrate diverse sorts of data owing to their emphasis on representing implicit  semantics buried within and across data sets, in the form of classes and logical relations  among them [2]. Such formal representations facilitate semantic interoperability, where diverse data is connected by a common semantic layer. Ontologies have additionally  proven valuable for clarifying the meanings of terms [3] and supporting advanced  reasoning when combined with data, in the form of knowledge graphs [4]. 

The Basic Formal Ontology (BFO) is an upper-level ontology that is used by over  700 open-source ontologies [5]. It is designed to be very small, currently consisting only  of 36 classes, 40 object properties, and 602 axioms [6]. BFO satisfies the conditions for counting as a top-level ontology, described in ISO/IEC 21838-1:2021: it is “…created to represent the categories…shared across a maximally broad range of domains” [7].  ISO/IEC 21838-2:2021 establishes BFO as a top-level ontology standard [8]. The BFO  ecosystem adopts a hub-and-spokes strategy for ontology extensions, where classes in  BFO form a hub, and new subclasses of BFO classes are made as spokes branching out  from it. Interoperability between different ontologies can be preserved by linking up to  BFO as a common hub. All classes in BFO are subclasses of bfo:Entity2 [9], which  includes everything that has, does, or will exist. Within this scope, BFO draws a fundamental distinction with two classes: bfo:Continuant and bfo:Occurrent.  Roughly, a continuant is a thing that persists over some amount of time, whereas an  occurrent is something that happens over time [1]. A chef is an example of a continuant,  and an act of cooking is an example of an occurrent. 

gist is a business-focused upper-level ontology that has been developed over the last 15+ years and used in over 100 commercial implementations [10]. Ontology elements found in gist leverage everyday labels in the interest of facilitating stakeholder understanding to support rapid modeling. Much like BFO, gist contains a relatively small number of terms, relations, and formally specified axioms: it has 98 classes, 63 object properties, 50 datatype properties, and approximately 1400 axioms at the time of this writing. Approximately 20 classes are at the highest level of the gist class hierarchy. Subclasses are defined using a distinctionary pattern,3 which includes using a subclass axiom along with disjointness axioms and property restrictions to distinguish a class from its parents and siblings. gist favors property restrictions over domain and range axioms to maintain generality and avoid a proliferation of properties [12]. Commonly used top-level classes include gist:Commitment, gist:Event, gist:Organization, gist:PhysicalIdentifiableItem, and gist:Place.
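A hypothetical sketch of the distinctionary pattern in Turtle: a subclass axiom, a property restriction and a disjointness axiom that together distinguish a new class from its parent and siblings. The domain class names, the property and the prefix IRIs are shown only for illustration.

  @prefix ex:   <https://example.com/domain/> .
  @prefix gist: <https://w3id.org/semanticarts/ns/ontology/gist/> .
  @prefix owl:  <http://www.w3.org/2002/07/owl#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

  # A new domain class distinguished from its parent by a property restriction ...
  ex:PurchaseCommitment  rdfs:subClassOf  gist:Commitment ,
      [ a  owl:Restriction ;
        owl:onProperty      ex:isAbout ;          # hypothetical property
        owl:someValuesFrom  ex:PurchaseOrder ] ;
      # ... and from a sibling class by an explicit disjointness axiom.
      owl:disjointWith  ex:EmploymentCommitment .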

Ontology alignments in general are useful because they allow interoperability between ontologies and consequently help prevent what has been called the ontology silo  problem, which arises when ontologies covering the same domain are constructed independently from one another, using differing syntax and semantics [13]. Ontologists  typically leverage the Resource Description Framework (RDF) and vocabularies extended from it, to maintain flexibility when storing data into graphs, which goes some  way to address silo problems. If, however, data is represented in RDF using different ontologies, enriched with different semantics, then ontology silo problems emerge.  Alignment between potential ontology silos can address this problem by allowing the data to be interpreted by each aligned ontology. 

Needless to say, given the respective scopes of gist and BFO, as well as their overlapping users and domains, we have identified them as ontology silos worth aligning. For current users of gist, alignment provides a way to leverage BFO without requiring any additional implementation. For new users of gist, it provides a pragmatic base for building domain ontologies. This is of particular importance as BFO was recently adopted as a baseline standard in the U.S. Department of Defense and Intelligence Community [14]. For stakeholders in both the open and closed space, the alignment proposed here will allow users to model a domain in gist and align with other ontologies in the BFO ecosystem, satisfying the requirements of leveraging an ISO standard. In the other direction, users of BFO will be able to leverage domain representations in gist, gaining insights into novel modeling patterns, potential gaps in the ecosystem, and avenues for future modeling research.

2 We adopt the convention of displaying class names in bold, prepended with a namespace identifier indicating provenance.
3 The distinctionary pattern outlined in [11] is like the Aristotelian approach described in [1].

2. Methodology 

In this section we discuss the process we used to build what we call “gistBFO,” an ontology containing a semantic alignment between gist and BFO. We started by creating an RDF turtle file that would eventually contain all the mappings, and then manually worked out the connections between gist and BFO starting from the upper-level classes of both ontologies. We specified that our new ontology imports both the gist and BFO  ontologies, complete with their respective terms and axioms. To make use of gistBFO, it  can be imported into a domain ontology that currently uses gist. 

Figure 1. gistBFO import hierarchy4 

2.1. Design principles 

To describe our methodology, it is helpful to distinguish between alignments and mappings [15]. By alignment we mean a set of assertions (or “triples”) of the form <s, p,  o> that relate the terms of one ontology to another. gistBFO contains such an alignment.  The individual assertions contained within an alignment are mappings.  

4 This diagram is adapted from a similar diagram in [10].

gist:Specification subClassOf bfo:GenericallyDependentContinuant5 (bfo:GDC, hereafter) is an example of one mapping in gistBFO’s alignment [16].

By way of evaluation, we have designed gistBFO to exhibit a number of important properties: consistency, coherence, conservativity, specificity, and faithfulness [17]. An ontology containing an alignment is consistent just in case its mappings and the component ontologies do not logically entail a contradiction. For example, if a set of assertions entails both that Bird is equivalent to NonBird and that NonBird is equivalent to the complement of Bird, then it is inconsistent. Relatedly, such an ontology is coherent just in case all of its classes are satisfiable. In designing an ontology, a common mistake is creating an unsatisfiable class – a class that cannot have members on pain of a contradiction.6 Suppose a class A is defined as a subclass of both B and the complement of B. Anything asserted as a member of A would be inferred to be a member of B and its complement, resulting in a contradiction. Note that the definition of A itself does not result in a logical inconsistency; it is only when an instance is asserted to be a member of A that a contradiction is generated.

Consistency and coherence apply to gistBFO as a whole (i.e., the union of gist, BFO,  and the alignment between them). The next several apply more specifically to the alignment. 

An alignment is conservative just in case it does not add any logical entailments  within the aligned ontologies.7 Trivially, gistBFO allows more to be inferred than either gist or BFO alone, since it combines the two ontologies and adds mapping assertions between them. However, it should not imply anything new within gist or BFO, which would effectively change the meanings of terms within the ontologies. For example, gist  never claims that gist:Content subClassOf gist:Event. If gistBFO were to imply this,  it would not only be ontologically suspect, but it would extend gist in a non-conservative  manner, effectively changing the meaning of gist:Content. Similarly, BFO never claims  that BFO:GDC subClassOf BFO:Process (again, for good reason); so if gistBFO were  to imply this, this too would make it a non-conservative extension, changing the content  of BFO itself. It is desirable for the alignment to be a conservative extension of gist and  BFO so that it is not changing the meaning of terms within gist or BFO. By the same  token, if gistBFO were to remove axioms from gist or BFO, it would need to be handled carefully so that it too preserves the spirit of the two ontologies. (More on this in Section  4.1.1.) Additionally, if gistBFO does not remove any axioms from gist or BFO, there is  no need to maintain separate artifacts with modified axioms. 

An alignment is specific to the extent that terms from the ontologies are related to the most specific terms possible. For example, one possible alignment between gist and BFO would contain mappings from each top-level gist class to bfo:Entity. While this would constitute a bona fide alignment by our characterization above, it is not an interesting or useful one. If it achieves BFO-compliance, it does so only in a trivial sense. For this reason, we aimed to be specific with our alignment and mapped gist classes to the lowest BFO classes that were appropriate.

5 Strictly speaking, the IRI for generically dependent continuant in BFO is obo:BFO_0000031, but we use bfo:GenericallyDependentContinuant (and bfo:GDC for short). The actual subclass relation used in the alignment is rdfs:subClassOf, but the namespace prefix is dropped for brevity.
6 In OWL, unsatisfiable classes are subclasses of owl:Nothing, the class containing no members. It is analogous to the set-theoretic notion of “the empty set.”
7 See [17, p. 3] for a more formal explanation of conservativity in the context of an alignment.

An alignment is faithful to the extent that it respects the intended meanings of the terms in each ontology. Intent is not always obvious, but it can often be gleaned from  formal definitions, informal definitions/annotations, and external sources. 

We aim in this work for gistBFO to exhibit the above properties. Note also that two ontologies are said to be synonymous just in case anything expressed in one ontology can be expressed in terms of the other (and vice versa) [18]. We do not attempt to establish synonymy with this alignment. First, for present purposes, our strategy is to model in gist and then move to BFO, not the other way around. Second, the alignment in its current form consists primarily of subclass assertions from gist classes to BFO classes. With an accurate subclassing bridge, instances modeled in gist achieve an initial level of BFO-compliance, since they can be inferred into BFO classes. A richer mapping might take an instance modeled in gist and translate it entirely into BFO, preserving as much meaning as possible. For example, something modeled as a gist:Event with gist:startDateTime and gist:endDateTime might be modeled as a bfo:Process related to a bfo:TemporalRegion. We gesture at some of these richer mappings in the Conclusion section; our ultimate plan is to investigate them in future work. So, while we do not attempt to establish synonymy between gist and BFO at present, we do aim to preserve as much meaning as possible in the current alignment. In that respect, our work provides a firm foundation for a richer, more complex semantic alignment between gist and BFO.

Given our aim of creating a BFO-compliant version of gist, we have created a consistent, coherent, conservative, specific, and faithful ontology. Since both gist and  BFO are represented in the OWL DL profile, consistency and coherence were established using HermiT, a DL reasoner [19, 20]. By running the reasoner, we were able to establish that no logical inconsistencies or unsatisfiable classes were generated. While it is  undecidable in OWL 2 DL whether an alignment is a conservative extension, one can  evaluate the approximate deductive difference by looking more specifically at the  subsumption relations that hold between named classes in gist or BFO.8 We checked, for example, that no new entailments between gist classes were introduced. Specificity and  faithfulness are not as easily measured, but we detail relevant design choices in the  Discussion section as justification for believing our alignment exhibits these properties  as well. 

2.2. Identifying the mappings 

The properties detailed in Section 2.1 give a sense of our methodological aims for gistBFO. Now we turn to our methods for creating the mappings within the alignment. In our initial development of the alignment, we leveraged the BFO Classifier [22].  Included in the BFO Classifier was a decision diagram that allowed us to take instances of gist classes, answer simple questions, and arrive at a highly plausible candidate superclass in BFO. For example, consider a blueprint for a home. In gist, a blueprint  would fall under gist:Specification. To see where a blueprint might fall in BFO, we  answered the following questions: 

8 The set of changed subsumption entailments from combining ontologies with mappings has been called the approximate deductive difference [17, p.3; 21].

Q: Does this entity persist in time or unfold in time? A: It persists. So, a blueprint is a bfo:Continuant.

Q: Is this entity a property of another entity, or does it depend on at least one other entity? A: Yes, a blueprint depends on another entity (e.g., a sheet of paper) to be represented.

Q: May the entity be copied between a number of bearers? A: Yes, a blueprint can be copied across multiple sheets of paper. So, a blueprint is a bfo:GDC.

Given that blueprints are members of gist:Specification and bfo:GDC (at least according to our answers above), bfo:GDC was considered a plausible candidate superclass for gist:Specification. And indeed, as we think about all the possible instances of gist:Specification, they all seem like they would fall under bfo:GDC.

Our alignment was not produced solely with the BFO Classifier. Our teams include lead developers, stakeholders, and users of both gist and BFO. Classification was refined through consensus-driven meetings, where the meanings of ontology elements in the respective structures were discussed, debated, and clarified. Thus, while the BFO Classifier tool provided a very helpful starting point for discussions of alignment, careful effort went into identifying and verifying that the gist and BFO mappings were as accurate as possible.

Tables 1 and 2 contain a non-exhaustive list of important classes and definitions from gist and BFO that we refer to throughout the paper. 

BFO Class: Elucidation/Definition

Continuant: An entity that persists, endures, or continues to exist through time while maintaining its identity.

Independent Continuant: A continuant which is such that there is no x such that it specifically depends on x and no y such that it generically depends on y.

Specifically Dependent Continuant: A continuant which is such that (i) there is some independent continuant x that is not a spatial region, and which (ii) specifically depends on x.

Generically Dependent Continuant: An entity that exists in virtue of the fact that there is at least one of what may be multiple copies.

Material Entity: An independent continuant that at all times at which it exists has some portion of matter as continuant part.

Immaterial Entity: An independent continuant which is such that there is no time t when it has a material entity as continuant part.

Object: A material entity which manifests causal unity and is of a type instances of which are maximal relative to the sort of causal unity manifested.

Occurrent: An entity that unfolds itself in time or is the start or end of such an entity or is a temporal or spatiotemporal region.

Process: An occurrent that has some temporal proper part and for some time has a material entity as participant.

Table 1. Selected BFO classes and definitions [6]

gist Class: Elucidation/Definition

Event: Something that occurs over a period of time, often characterized as an activity being carried out by some person, organization, or software application or brought about by natural forces.

Organization: A generic organization that can be formal or informal, legal or non-legal. It can have members, or not.

Building: A relatively permanent man-made structure situated on a plot of land, having a roof and walls, commonly used for dwelling, entertaining, or working.

Unit of Measure: A standard amount used to measure or specify things.

Physical Identifiable Item: A discrete physical object which, if subdivided, will result in parts that are distinguishable in nature from the whole and in general also from the other parts.

Specification: One or more characteristics that specify what it means to be a particular type of thing, such as a material, product, service or event. A specification is sufficiently precise to allow evaluating conformance to the specification.

Intention: Goal, desire, aspiration. This is the “teleological” aspect of the system that indicates things are done with a purpose.

Temporal Relation: A relationship existing for a period of time.

Category: A concept or label used to categorize other instances without specifying any formal semantics. Things that can be thought of as types are often categories.

Collection: A grouping of things.

Is Categorized By: Points to a taxonomy item or other less formally defined class.

Is Member Of: Relates a member individual to the thing, such as a collection or organization, that it is a member of.

Table 2. Selected gist classes and definitions [23]

3. Results 

The gistBFO alignment contains 43 logical axioms. Of these, 35 are subclass assertions relating gist classes to more general classes in BFO. All gist classes have a superclass in BFO.9 The remaining eight axioms are subproperty assertions. We focused on mapping key properties in gist (e.g., gist:isCategorizedBy and gist:isMemberOf) to BFO properties. While mapping gist properties to more specific properties in BFO does not serve the use case of starting with gist and inferring into BFO, it nevertheless provides a richer connection between the ontologies, which we view as a worthy goal.

In addition to these 43 logical axioms, gistBFO also contains annotations expressing  the rationale behind some of the mapping choices. We created an annotation property  gist:bfoMappingNote for this purpose. 
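A minimal sketch of what such an annotation might look like; the note text, and the choice to annotate the gist class directly rather than the mapping axiom, are illustrative assumptions rather than the actual contents of gistBFO.

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix obo:  <http://purl.obolibrary.org/obo/> .
@prefix gist: <https://w3id.org/semanticarts/ns/ontology/gist/> .   # assumed gist namespace

gist:bfoMappingNote a owl:AnnotationProperty .

gist:Specification
    rdfs:subClassOf obo:BFO_0000031 ;   # bfo:GDC
    gist:bfoMappingNote "Specifications are copyable, information-like entities, so they are placed under generically dependent continuant."@en .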

At the highest level, almost all classes in gist fall under bfo:Continuant, since their instances are things that persist through time rather than unfold over time. Exceptions to this are instances falling under gist:Event and its subclasses, which (generally) fall under bfo:Occurrent.

Some of the gist subclasses of bfo:Continuant include gist:Collection, gist:PhysicallyIdentifiableItem, and gist:Content. Within BFO, continuants break down into bfo:IndependentContinuant (entities that bear properties), bfo:GDC (copyable patterns that are often about other entities), and bfo:SDC (properties borne by independent continuants). With respect to our alignment, subclasses of bfo:IndependentContinuant introduced by the mapping include gist:Building, gist:Component, and other material entities like gist:PhysicalSubstance.10 Subclasses of bfo:GDC include gist:Content, gist:Language, gist:Specification, gist:UnitOfMeasure, and

9 An exception is gist:Artifact, which, in addition to being difficult to place in BFO, is slated for removal from gist.
10 Best practice in BFO is to avoid mass terms [1], whereas gist:PhysicalSubstance is intentionally designed to represent them—e.g., a particular amount of sand. Regardless, this class of mass terms would map into a subclass of bfo:IndependentContinuant.

gist:Template—all things that can be copied across multiple bearers.11 One subclass of bfo:SDC is gist:TemporalRelation—a relational quality holding between multiple entities.
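
The class-level mappings just described can be summarized as plain subclass assertions. The sketch below uses the paper's readable bfo: shorthand (declared here with a placeholder prefix IRI) rather than the numeric obo: IRIs used in the released file (see footnote 5), and each gist class is shown against a BFO ancestor named in the text; the release may assert a more specific superclass.

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix bfo:  <https://example.org/bfo-readable/> .   # placeholder for the readable shorthand
@prefix gist: <https://w3id.org/semanticarts/ns/ontology/gist/> .   # assumed gist namespace

# Independent continuants
gist:Building          rdfs:subClassOf bfo:IndependentContinuant .
gist:Component         rdfs:subClassOf bfo:IndependentContinuant .
gist:PhysicalSubstance rdfs:subClassOf bfo:IndependentContinuant .

# Generically dependent continuants
gist:Content           rdfs:subClassOf bfo:GenericallyDependentContinuant .
gist:Language          rdfs:subClassOf bfo:GenericallyDependentContinuant .
gist:Specification     rdfs:subClassOf bfo:GenericallyDependentContinuant .
gist:UnitOfMeasure     rdfs:subClassOf bfo:GenericallyDependentContinuant .
gist:Template          rdfs:subClassOf bfo:GenericallyDependentContinuant .

# Specifically dependent continuants
gist:TemporalRelation  rdfs:subClassOf bfo:SpecificallyDependentContinuant .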

In most cases, the subclass assertions are simple in construction, relating a named  class in gist to a named class in BFO, for example, gist:Specification subClassOf  bfo:GDC. A more complex pattern involves the use of OWL property restrictions. For  example, gist:ControlledVocabulary was asserted to be a subclass of bfo:GDCs that  have some bfo:GDC as a continuant part. 

gist:ControlledVocabulary
    rdfs:subClassOf [
        a owl:Class ;
        owl:intersectionOf (
            # class = bfo:GDC
            obo:BFO_0000031
            [
                a owl:Restriction ;
                # property = bfo:hasContinuantPart
                owl:onProperty obo:BFO_0000178 ;
                # class = bfo:GDC
                owl:someValuesFrom obo:BFO_0000031
            ]
        )
    ] .

In other cases, we employed a union pattern—e.g., gist:Intention is a subclass of the  union of bfo:SDC and bfo:GDC. Had we chosen a single named superclass in BFO for  gist:Intention, it might have been bfo:Continuant. The union pattern, however, allows  our mapping to exhibit greater specificity, as discussed above.  

Figures 2 through 4 illustrate important subclass relationships between gist and BFO  classes: 

Figure 2. Continuants in gist 

11 Many of these can be understood as various sorts of ‘information’, which should be classified under  bfo:GDC. For example, units of measurement are standardized information which describe some magnitude  of quantity.

Figure 3. Independent and dependent continuants in gist 

Figure 4. gist:Event 

4. Discussion 

In this section we discuss in depth some specific mappings we made, focusing most  closely on some challenging cases. 

4.1.1. gist:Intention and gist:Specification 

One challenging case was gist:Intention and its subclass gist:Specification. The textual definition of gist:Intention suggests it is a mental state that is plausibly placed under bfo:SDC. That said, the textual definition of gist:Specification (think of a blueprint) suggests this class plausibly falls under bfo:GDC. Given that bfo:SDC and bfo:GDC are disjoint in BFO, this would result in a logical inconsistency. We thus appear to have encountered a genuine logical challenge to our mapping.

Exploring strategies for continuing our effort, we considered importing a “relaxed” version of BFO that drops the disjointness axiom between bfo:SDC and bfo:GDC. Arguably this option would respect the spirit of gist (by placing gist:Intention and gist:Specification in their true homes in BFO) while losing a bit of the spirit of BFO. While this may appear to be an unsatisfactory mapping strategy, we maintain that—if such relaxing of constraints is properly documented and tracked—there is considerable benefit in adopting it. Given two ontologies developed independently of one another, there are likely genuine semantic differences between them, differences that cannot be adequately addressed by simply adopting different labels. Clarifying, as much as possible, what those differences are can be incredibly valuable when representing data using each ontology structure. Put another way, if gist and BFO exhibited some one-to-one semantic mapping, so that everything in gist corresponds to something in BFO and vice versa, it would follow that the languages of gist and BFO were simply two different ways to talk about the same domain. We find this less interesting, to be candid, than formalizing the semantic overlap between these structures and noting precisely where they semantically differ. One way in which such differences might be recorded is by observing and documenting—as suggested in this option—where logical constraints such as disjointness might need to be relaxed in the alignment.

That stated, relaxing constraints should be the last, not the first, option pursued, since for the benefits highlighted above to manifest, it is incumbent on us to identify where exactly there is semantic alignment and to formalize this as clearly as possible. With that in mind, we pursue another option here, namely, to use a disjunctive definition for gist:Intention—asserting it to be a subclass of the union of bfo:GDC and bfo:SDC. While this disjunctive definition perhaps does not square perfectly with the text definition of gist:Intention, it does seem to be in the spirit of how gist:Intention is actually used—sometimes like a bfo:SDC (in the case of a gist:Function), sometimes like a bfo:GDC (in the case of a gist:Specification). This option does not require a modified version of BFO. It also aligns with our goal of exhibiting specificity in our mapping, since otherwise we would have been inclined to assert gist:Intention to be simply a subclass of bfo:Continuant.

gist:Intention
    rdfs:subClassOf [
        a owl:Class ;
        owl:unionOf (
            obo:BFO_0000020   # bfo:SDC
            obo:BFO_0000031   # bfo:GDC
        )
    ] .

This mapping arguably captures the spirit of both gist and BFO while remaining  conservative—i.e., it does not change any of the logical entailments within gist or BFO. 

4.1.2. gist:Organization 

gist:Organization was another interesting case. During the mapping we consulted the Common Core Ontologies (CCO), a suite of mid-level ontologies extended from BFO, for guidance, since it includes an organization class [24]. cco:Organization falls under bfo:ObjectAggregate. Arguably, however, organizations can be understood as something over and above the aggregate of their members, perhaps even persisting when there are no members. For this reason, we considered bfo:ImmaterialEntity and bfo:GDC as superclasses of gist:Organization. On the one hand, the challenge with asserting that gist:Organization is a subclass of bfo:ImmaterialEntity is that instances of the latter cannot have material parts, and yet organizations often do, namely, members. On the other hand, there is plausibly a sense in which organizations can be understood as, say, prescriptions or directions (bfo:GDC) for how members occupying positions in that organization should behave, whether or not there are ever actual members. The CCO characterization of organization does not seem to reflect this sense, given that it is defined in terms of members. It was thus important for our team to clarify which sense, if either or both, was best reflected in gist:Organization.

Ultimately, we opted for asserting bfo:ObjectAggregate as the superclass for gist:Organization, as the predominant sense in which the latter is to be understood concerns members of such entities. This is, importantly, not to say there are no genuine alternative senses of organization worth modeling in both gist and within the BFO ecosystem; rather, it is to say that, after reflection, the sense most clearly at play for gist:Organization involves membership. For some gist classes, annotations and examples made it clear that they belonged under a certain BFO class. In the case of gist:Organization, gist is arguably neutral with respect to a few candidate superclasses. Typically what is most important in an enterprise context is modeling organizational structure (with sub-organizations) and organization membership. Perhaps this alone does not require that gist:Organization be understood as a bfo:ObjectAggregate; nevertheless, practical considerations pointed in favor of it. Adopting this subclassing has the benefit of consistency with CCO (and a fortiori BFO) and allows for easy modeling of organization membership in terms of BFO.
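
A minimal sketch of the adopted mapping and its effect on instance data; the individuals are hypothetical, and the readable bfo:ObjectAggregate shorthand (with a placeholder prefix) stands in for the numeric OBO IRI used in the release.

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix bfo:  <https://example.org/bfo-readable/> .   # placeholder for the readable shorthand
@prefix gist: <https://w3id.org/semanticarts/ns/ontology/gist/> .   # assumed gist namespace
@prefix ex:   <https://example.org/> .

# Adopted mapping
gist:Organization rdfs:subClassOf bfo:ObjectAggregate .

# Hypothetical membership data modeled in gist ...
ex:_Organization_acme a gist:Organization .
ex:_Person_1 gist:isMemberOf ex:_Organization_acme .

# ... from which a reasoner infers:
#   ex:_Organization_acme a bfo:ObjectAggregate .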

4.1.3. gist:Event 

At a first pass, a natural superclass (or even equivalent class) for gist:Event is  bfo:Process. After all, ‘event’ is an alternative label for bfo:Process in BFO. Upon  further evaluation, it became clear that some instances of gist:Event would not be  instances of bfo:Process—namely, future events. In BFO, with its realist interpretation,  processes have happened if they are to be represented. It is in this way that BFO  differentiates how the world could be, e.g., this portion of sodium chloride could  dissolve, from how the world is, e.g., this portion of sodium chloride dissolves. Future  events can be modeled as specifications, ultimately falling under bfo:GDC. In contrast,  a subclass of gist:Event, namely gist:ScheduledEvent, includes within its scope events  that have not yet started. There is thus not a straightforward mapping between  bfo:Process and gist:Event. Following our more conservative strategy, however, the  identified discrepancy can be accommodated by asserting that gist:Event is a defined  subclass of the union of bfo:GDC and bfo:Process.12 In this respect, we are able to  represent instances of gist:Event that have started (as instances of bfo:Process) and  those that have not (as instances of bfo:GDC). 

gist:Event
    rdfs:subClassOf [
        a owl:Class ;
        owl:unionOf (
            obo:BFO_0000031   # bfo:GDC
            obo:BFO_0000015   # bfo:Process
        )
    ] .

4.1.4. gist:Category 

gist:Category is a commonly used class in gist. It allows one to categorize an entity  without introducing a new class into an ontology. It guards against the proliferation of  classes with little or no semantics; instead, such categories are treated as instances, which  

12 It is common in gist to model planned-event-turned-actual-events as single entities that persist through both stages. When a plan goes from being merely planned to actually starting, it can flip from a bfo:GDC to a bfo:Process. Events that have a gist:actualStartDateTime will be instances of bfo:Process, and the presence of this property could be used to automate the flip. Different subclasses of gist:Event will be handled differently—e.g., gist:HistoricalEvent is a subclass of bfo:Process that would not require the transition from bfo:GDC.

are related to entities by the predicate gist:isCategorizedBy. So, for example, one might have an assertion like ex:_Car_1 gist:isCategorizedBy ex:_TransmissionType_manual, where the object of this triple is an instance of ex:TransmissionType, which would be a subclass of gist:Category.

If one thinks of BFO as an ontology of particulars, and if instances of gist:Category are not particulars but instead types of things, then arguably gist:Category does not have  a home in BFO. 

Nevertheless, as a commonly-used class in gist, it is helpful to find a place for it in  BFO if possible. One option is bfo:SDC: indeed, there are some classes in CCO (e.g.,  cco:EyeColor) that seem like they could be subclasses of gist:Category. However,  instances of bfo:SDC (e.g., qualities and dispositions) are individuated by the things that  bear them (e.g., the eye color of a particular person), which does not seem to capture the  spirit of gist:Category. Ultimately, we opted for bfo:GDC as the superclass in part  because of the similarity of instances of gist:Category to information content entities in  CCO, which are bfo:GDCs. 
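
Putting the transmission example together with the adopted mapping gives a sketch like the following; the individuals are hypothetical, and the readable bfo: shorthand (placeholder prefix) again stands in for the numeric OBO IRI.

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix bfo:  <https://example.org/bfo-readable/> .   # placeholder for the readable shorthand
@prefix gist: <https://w3id.org/semanticarts/ns/ontology/gist/> .   # assumed gist namespace
@prefix ex:   <https://example.org/> .

# Adopted mapping: categories are generically dependent continuants.
gist:Category rdfs:subClassOf bfo:GenericallyDependentContinuant .

# The example from the text: a category is an instance, not a new class.
ex:TransmissionType rdfs:subClassOf gist:Category .
ex:_TransmissionType_manual a ex:TransmissionType .
ex:_Car_1 gist:isCategorizedBy ex:_TransmissionType_manual .

# A reasoner then infers:
#   ex:_TransmissionType_manual a gist:Category , bfo:GenericallyDependentContinuant .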

5. Conclusion 

5.1. Future work 

We have established a foundational mapping between gist and BFO. From this  foundation going forward we aim to improve gistBFO along multiple dimensions. The  first set of improvements relate to faithfulness. While we are confident in many of the  mappings we have made, we expect the alignment to become more and more accurate as  we continue development. In some cases, the intended meanings of concepts are obvious  from formal definitions and annotations. In other cases, intended meaning is best  understood by discussions about how the concepts are used in practice. As we continue  discussions with practitioners of gist and BFO, the alignment will continue to improve. 

Another aim related to faithfulness is to identify richer mappings. In its current form gistBFO allows instance data modeled under gist to be inferred into BFO superclasses. While this achieves an initial connection with BFO, a deeper mapping could take something modeled in gist and translate it to BFO. Revisiting the previous example, something modeled as a gist:Event with gist:startDateTime and gist:endDateTime might be modeled as a bfo:Process related to a bfo:TemporalRegion. Many of these types of modeling patterns can be gleaned from formal definitions and annotations, but they do not always tell the whole story. Again, this is a place where continued discussions with practitioners of both ontologies can help. From a practical perspective, more complex mappings like these could be developed using a rule language (e.g., Datalog or SWRL) or SPARQL INSERT queries.
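
To make the idea concrete, the sketch below shows hypothetical instance data in both forms. The individuals, the readable bfo: shorthand, and the use of a property named bfo:occupiesTemporalRegion are illustrative assumptions; in practice the translation would be produced by a rule or a SPARQL INSERT query rather than by the alignment's subclass axioms.

@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix bfo:  <https://example.org/bfo-readable/> .   # placeholder for the readable shorthand
@prefix gist: <https://w3id.org/semanticarts/ns/ontology/gist/> .   # assumed gist namespace
@prefix ex:   <https://example.org/> .

# As modeled in gist (hypothetical data):
ex:_Meeting_1 a gist:Event ;
    gist:startDateTime "2025-03-01T09:00:00Z"^^xsd:dateTime ;
    gist:endDateTime   "2025-03-01T10:00:00Z"^^xsd:dateTime .

# A richer translation into BFO might instead produce:
ex:_Meeting_1 a bfo:Process ;
    bfo:occupiesTemporalRegion ex:_TemporalRegion_1 .   # assumed relation name
ex:_TemporalRegion_1 a bfo:TemporalRegion .
# Attaching the concrete timestamps to the temporal region would require
# vocabulary beyond BFO itself.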

We have also considered alignment with the Common Core Ontologies (CCO). One of the challenges with this alignment is that gist and CCO sit at similar levels of abstraction. Indeed, gist and CCO even appear to share classes that exhibit overlapping semantics, e.g., language and organization. The similar level of abstraction creates a challenge because it is not always easy to determine which classes are more general than which. For example, are gist:Organization and cco:Organization equivalent, or is one a superclass of the other? Furthermore, because there are considerably more classes in CCO than BFO, preserving consistency with a growing set of alignment axioms becomes more of a concern. Despite the challenges, a mapping between gist and CCO would help with interoperability, and it is a topic we intend to pursue in the future to that end.

5.2. Final remarks 

We have presented an open-source alignment between gist and BFO. We described a  methodology for identifying mappings, provided rationale for the mappings we made,  and outlined a vision for future development. Our hope is that gistBFO can serve as a  practical tool, promoting easier domain ontology development and enabling  interoperability. 

Acknowledgements 

Thank you to Dave McComb for support at various stages of the gistBFO design process,  from big-picture discussions to input on specific mappings. Thanks also to Michael  Uschold and Ryan Hohimer for helpful discussions about gistBFO. 

References 

[1] Arp R, Smith B, Spear AD. Building Ontologies with Basic Formal Ontology. Cambridge, Massachusetts: The MIT Press; 2015. p. 220.

[2] Hoehndorf R, Schofield PN, Gkoutos GV. The role of ontologies in biological and biomedical research: a functional perspective. Brief Bioinform. 2015 Nov;16(6):1069–80. doi:10.1093/bib/bbv011

[3] Neuhaus F, Hastings J. Ontology development is consensus creation, not (merely) representation. Applied Ontology. 2022;17(4):495–513. doi:10.3233/AO-220273

[4] Chen X, Jia S, Xiang Y. A review: Knowledge reasoning over knowledge graph. Expert Systems with Applications. 2020 Mar;141:112948. doi:10.1016/j.eswa.2019.112948

[5] Basic Formal Ontology Users [Internet]. Available from: https://basic-formal-ontology.org/users.html

[6] GitHub [Internet]. Basic Formal Ontology (BFO) Wiki – Home. Available from: https://github.com/BFO-ontology/BFO/wiki/Home

[7] ISO/IEC 21838-1:2021: Information technology — Top-level ontologies (TLO) Part 1: Requirements [Internet]. Available from: https://www.iso.org/standard/71954.html

[8] ISO/IEC 21838-2:2021: Information technology — Top-level ontologies (TLO) Part 2: Basic Formal Ontology (BFO) [Internet]. Available from: https://www.iso.org/standard/74572.html

[9] Otte J, Beverley J, Ruttenberg A. Basic Formal Ontology: Case Studies. Applied Ontology. 2021 Aug;17(1). doi:10.3233/AO-220262

[10] McComb D. A BFO-ready Version of gist [Internet]. Semantic Arts. Available from: https://www.semanticarts.com/wp-content/uploads/2025/01/20241024-BFO-and-gist-Article.pdf

[11] McComb D. The Distinctionary [Internet]. Semantic Arts; 2015 Feb. Available from: https://www.semanticarts.com/white-paper-the-distinctionary/

[12] Carey D. Avoiding Property Proliferation [Internet]. Semantic Arts. Available from: https://www.semanticarts.com/wp-content/uploads/2018/10/AvoidingPropertyProliferation012717.pdf

[13] Trojahn C, Vieira R, Schmidt D, Pease A, Guizzardi G. Foundational ontologies meet ontology matching: A survey. Semantic Web. 2022 Jan 1;13(4):685–704. doi:10.3233/SW-210447

[14] Gambini B. Intelligence Community adopt resource developed by UB ontologists [Internet]. News Center. 2024 [cited 2025 Mar 30]. Available from: https://www.buffalo.edu/news/releases/2024/02/department-of-defense-ontology.html

[15] Euzenat J, Shvaiko P. Ontology Matching. 2nd edition. Heidelberg: Springer; 2013. doi:10.1007/978-3-642-38721-0

[16] GitHub [Internet]. gistBFO. Available from: https://github.com/semanticarts/gistBFO

[17] Prudhomme T, De Colle G, Liebers A, Sculley A, Xie P "Karl", Cohen S, Beverley J. A semantic approach to mapping the Provenance Ontology to Basic Formal Ontology. Sci Data. 2025 Feb 17;12(1):282. doi:10.1038/s41597-025-04580-1

[18] Aameri B, Grüninger M. A New Look at Ontology Correctness. Logical Formalizations of Commonsense Reasoning, Papers from the 2015 AAAI Spring Symposium; 2015. doi:10.1613/jair.5339

[19] Shearer R, Motik B, Horrocks I. HermiT: A highly-efficient OWL reasoner. OWLED; 2008. Available from: https://ceur-ws.org/Vol-432/owled2008eu_submission_12.pdf

[20] Glimm B, Horrocks I, Motik B, Stoilos G, Wang Z. HermiT: an OWL 2 reasoner. Journal of Automated Reasoning. 2014;53:245–269. doi:10.1007/s10817-014-9305-1

[21] Solimando A, Jiménez-Ruiz E, Guerrini G. Minimizing conservativity violations in ontology alignments: algorithms and evaluation. Knowl Inf Syst. 2017;51:775–819. doi:10.1007/s10115-016-0983-3

[22] Emeruem C, Keet CM, Khan ZC, Wang S. BFO Classifier: Aligning Domain Ontologies to BFO. 8th Joint Ontology Workshops; 2022.

[23] GitHub [Internet]. gist. Available from: https://github.com/semanticarts/gist

[24] Jensen M, De Colle G, Kindya S, More C, Cox AP, Beverley J. The Common Core Ontologies. 14th International Conference on Formal Ontology in Information Systems; 2024. doi:10.48550/arXiv.2404.17758