gist: Buckets, Buckets Everywhere: Who Knows What to Think

We humans are categorizing machines, which is to say, we like to create metaphorical buckets and put things inside. But there are different kinds of buckets, and different ways to model them in OWL and gist. The most common bucket represents a kind of thing, such as Person or Building. Things that go into those buckets are individuals of those kinds, e.g., Albert Einstein or the particular office building you work in. We represent this kind of bucket as an owl:Class and we use rdf:type to put something into the bucket.

Another kind of bucket is a group of things, like a jury or a deck of cards, that are functionally connected in some way. Those related things go into the bucket (12 members of a jury, or 52 cards). We have a special class in gist called Collection for this kind of bucket. A specific bucket of this sort will be an instance of a subclass of gist:Collection. E.g., OJs_Jury is an instance of the class Jury, a subclass of gist:Collection. We use gist:memberOf to put things into the bucket. Convince yourself that these buckets do not represent a kind of thing: Jury is a kind of thing, but a particular jury is not. We would use rdf:type to connect OJ’s jury to the owl:Class Jury, and use gist:memberOf to connect the specific jurors to OJ’s jury.

A third kind of bucket is a tag, which represents a topic and is used to categorize individual items for the purpose of indexing a body of content. For example, the tag “Winter” might be used to index photographs, books and/or YouTube videos. Any content item that depicts or relates to winter in some way should be categorized using this tag. In gist, we represent this in a way that is structurally the same as how we represent buckets that are collections of functionally connected items. The differences are 1) the bucket is an instance of a subclass of gist:Category, rather than of gist:Collection, and 2) we put things into the bucket using gist:categorizedBy rather than gist:memberOf. The Winter tag is essentially a bucket containing all the things that have been indexed or categorized using that tag.

Below is a summary table showing these different kinds of buckets, and how we represent them in OWL and gist.

| Kind of Bucket | Example | Representing the Bucket | Putting Something in the Bucket |
|---|---|---|---|
| Individual of a kind | John Doe is a Person | Instance of owl:Class | rdf:type |
| A bucket with functionally connected things inside | Sheila Woods is a member of OJ’s Jury | Instance of a subclass of gist:Collection | gist:memberOf |
| An index term for categorizing content | The book “Winter of our Discontent” has Winter as one of its tags | Instance of a subclass of gist:Category | gist:categorizedBy |
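As a minimal sketch of the three bucket patterns, the Python below stores each statement as a (subject, predicate, object) tuple. Plain tuples stand in for an RDF library here, and every ex: name (ex:Person, ex:Jury, ex:Tag and the individuals) is a hypothetical identifier chosen for illustration; only owl:Class, rdf:type, gist:Collection, gist:Category, gist:memberOf and gist:categorizedBy come from the discussion above.

```python
# Plain tuples stand in for RDF triples; a real project would likely use an RDF library.
RDF_TYPE = "rdf:type"
MEMBER_OF = "gist:memberOf"
CATEGORIZED_BY = "gist:categorizedBy"

# All ex: identifiers are hypothetical and exist only for this illustration.
TRIPLES = {
    # 1) Kind-of-thing bucket: the class is the bucket, rdf:type fills it.
    ("ex:Person", RDF_TYPE, "owl:Class"),
    ("ex:JohnDoe", RDF_TYPE, "ex:Person"),
    # 2) Collection bucket: an instance of a subclass of gist:Collection.
    ("ex:Jury", "rdfs:subClassOf", "gist:Collection"),
    ("ex:OJs_Jury", RDF_TYPE, "ex:Jury"),
    ("ex:SheilaWoods", MEMBER_OF, "ex:OJs_Jury"),
    # 3) Tag bucket: an instance of a subclass of gist:Category.
    ("ex:Tag", "rdfs:subClassOf", "gist:Category"),
    ("ex:Winter", RDF_TYPE, "ex:Tag"),
    ("ex:WinterOfOurDiscontent", CATEGORIZED_BY, "ex:Winter"),
}


def contents(bucket, predicate):
    """Return everything placed in `bucket` via `predicate`, sorted by name."""
    return sorted(s for s, p, o in TRIPLES if p == predicate and o == bucket)


print(contents("ex:Person", RDF_TYPE))        # individuals of a kind
print(contents("ex:OJs_Jury", MEMBER_OF))     # members of a collection
print(contents("ex:Winter", CATEGORIZED_BY))  # items indexed by a tag
```

The same lookup function serves all three buckets; what differs is the predicate that puts something into the bucket and what the bucket itself is an instance of.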


Morgan Stanley: Data-Centric Journey 

Morgan Stanley has been on the semantic/data-centric journey with us for about 6 years. Their approach is the adoption of an RDF graph and the development of a semantic knowledge base to help answer domain-specific questions, formulate classification recommendations and deliver quality search to their internal users. Their primary objective is to enable the firm to retrieve, retain and protect information (i.e., where the information resides, how long it must be maintained and what controls apply to it).

The knowledge graph is being developed by the Information Management team under the direction of Nic Seyot (Managing Director and Head of Data & Analytics for Non-Financial Risk). Nic is responsible for the development of the firm-wide ontology for trading surveillance, compliance, global financial crime and operational risk. Nic’s team is also helping other departments across the firm discover and embrace semantic data modeling for their own use cases.

Morgan Stanley has tens of thousands of discrete repositories of information. There are many different groups with specialized knowledge about the primary objectives as well as many technical environments to deal with. Their motivating principle is to understand the conceptual meaning of the information across these various departments and environments so that they can answer compliance and risk questions.

A good example is a query from a user about the location of sensitive information (with many conflicting classifications) and whether they are allowed to share it outside of the firm. The answer to this type of question involves knowledge of business continuity, disaster recovery, emergency planning and many other areas of control. Their approach is to leverage semantic modeling, ontologies and a knowledge graph to answer that question comprehensively.

To build the knowledge graph around these information repositories, they hired Semantic Arts to create a core ontology around issues that are relevant to the entire firm, including personnel, geography, legal entities, records management, organization and a number of firm-wide taxonomies. Morgan Stanley is committed to open standards and W3C principles, which they have combined with their internal standards around quality governance. They created a Semantic Modeling and Ontology Consortium to help govern and maintain that core ontology. Many divisions within the firm have joined the advisory board for the consortium, and it is viewed as an excellent way of facilitating cooperation between divisions.

The adoption-based principle has been a success. They have standardized ETL and virtualization to get information structured and into their knowledge graph. The key use case is enterprise search: giving departments the ability to search for their content by leveraging the tags, lists, categories and taxonomies they use as facets for content search. One of the key benefits is an understanding of the network of concepts and terms as well as how they relate to one another within their organization.

Semantic Arts ontologists helped engineer the network of concepts that are included in their semantic thesaurus as well as how they interconnect within the firm. They started out with over 6,500 policies and procedures as a curated corpus of knowledge of the firm. They used natural language processing to extract the complexity of relationships out of their combined taxonomies (over half a million concepts). We worked with them to demonstrate the power of conceptual simplification. We helped them transform these complex relationships into broader, narrower and related properties, which enable users to ask business questions in their own context (and acronyms) and enhance the quality of search without manual curation. Our efforts helped reduce the noise, merge concepts with similar meaning and identify critical topics to support complex queries from users.
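To illustrate how broader, narrower and related properties can improve search without manual curation, here is a minimal sketch of query expansion over a tiny thesaurus. The concepts, the acronyms and the `expand` function are all hypothetical and are not taken from the firm's actual thesaurus; the sketch only assumes that each concept may have alternate labels, narrower concepts and related concepts.

```python
from collections import deque

# Hypothetical mini-thesaurus, for illustration only.
ALT_LABELS = {"bcp": "business continuity", "dr": "disaster recovery"}
NARROWER = {
    "operational resilience": ["business continuity", "disaster recovery"],
    "business continuity": ["emergency planning"],
}
RELATED = {"business continuity": ["disaster recovery"]}


def expand(term):
    """Resolve an acronym to its concept, then add narrower and related concepts."""
    concept = ALT_LABELS.get(term.lower(), term.lower())
    seen = [concept]
    queue = deque([concept])
    # Walk the narrower hierarchy breadth-first so all descendants are included.
    while queue:
        current = queue.popleft()
        for child in NARROWER.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    # Add directly related concepts of the starting concept (one hop only).
    for neighbor in RELATED.get(concept, []):
        if neighbor not in seen:
            seen.append(neighbor)
    return seen


print(expand("BCP"))
```

A user who types an acronym is mapped to the shared concept first, so the expanded search covers the same ground regardless of which label a department happens to use.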

Contact Us: 

Overcome integration debt with proven semantic solutions. 

Contact Semantic Arts, the experts in data-centric transformation, today! 

Address: Semantic Arts, Inc. 

123 N College Avenue Suite 218 

Fort Collins, CO 80524 

Email: [email protected] 

Phone: (970) 490-2224

Chemical Manufacturer: Faceted Taxonomies 

Capturing interrelations of information for relevance can be difficult, even with NLP. More often, companies seek to work in taxonomy space on their journey toward richer knowledge graph implementations for automation. Our consulting services leveraged this approach to provide a foundational stepping stone as the company sought to bring knowledge graph capabilities into its business.

This global manufacturer had a sluggish system in place to comb through internet publications and look for key terms that might mark articles of interest to its divisions, both for competitive intelligence and as a spawning point for innovative ideas. However, processes remained heavily manual and cumbersome. They realized that strong text matching and analysis was a missing component and decided to turn to taxonomies to improve the process.

Semantic Arts quickly discovered that the key to success was faceted taxonomies. We worked with SMEs to determine what areas contained specific controlled vocabularies and specialized terminology. As a starting point, Semantic Arts created a series of taxonomies, one for each area, for improved automation. Areas included Products, Industries, Customers, Capabilities, Manufacturers, Materials and Processes.

The tight focus of each facet allowed SMEs and division experts to create very specific lists of terms. By using preferred labels and alternate labels (synonyms) for each term, Semantic Arts defined what could be recognized and matched in a desired internet corpus. Initial implementation of the facets showed a higher level of matching to recognized terms of interest than an NLP algorithm achieved, created higher confidence in the significance of each match, and left out many common or “stop” terms that the original method still picked up. An initial efficiency gain was realized.
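The matching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the client's implementation: the facet names come from the list above, but the terms, their alternate labels and the helper functions `build_matchers` and `tag_text` are hypothetical.

```python
import re

# Hypothetical facet content: preferred label -> alternate labels (synonyms).
FACETS = {
    "Materials": {"polyethylene": ["polythene"]},
    "Processes": {"injection molding": ["injection moulding"]},
}


def build_matchers(facets):
    """Compile one whole-word, case-insensitive pattern per label."""
    matchers = []
    for facet, terms in facets.items():
        for preferred, alternates in terms.items():
            for label in [preferred, *alternates]:
                pattern = re.compile(r"\b" + re.escape(label) + r"\b", re.IGNORECASE)
                matchers.append((pattern, facet, preferred))
    return matchers


def tag_text(text, matchers):
    """Return the (facet, preferred label) pairs whose labels occur in the text."""
    hits = set()
    for pattern, facet, preferred in matchers:
        if pattern.search(text):
            hits.add((facet, preferred))
    return sorted(hits)


matchers = build_matchers(FACETS)
print(tag_text("New injection moulding line cuts polythene waste.", matchers))
```

Because only labels listed in a facet can match, common words never produce hits, and every hit is reported under its preferred label no matter which synonym appeared in the article.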

Semantic Arts developed a more extended road map with the manufacturer to first refine and bulk up the taxonomy lists based on continued implementation and analysis. Following that, the client intends to apply a simple semantic layer to relate and interconnect the taxonomy facets. This ontology model will allow even richer inferencing and matching of results based on relationships between terms (e.g., an article about a specific product will imply the involvement of certain manufacturers even if they are not explicitly mentioned).
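A minimal sketch of that kind of cross-facet inference is shown below. The product and manufacturer names, the `MADE_BY` relationship and the function name are hypothetical; the sketch only assumes that the semantic layer links terms in the Products facet to terms in the Manufacturers facet.

```python
# Hypothetical links from Products facet terms to Manufacturers facet terms.
MADE_BY = {
    "WidgetCoat 200": ["Acme Coatings"],
    "FlexiSeal": ["Acme Coatings", "Borealis Polymers"],
}


def implied_manufacturers(matched_products):
    """Infer manufacturers from matched products, even if the article never names them."""
    implied = set()
    for product in matched_products:
        implied.update(MADE_BY.get(product, []))
    return sorted(implied)


print(implied_manufacturers(["FlexiSeal"]))
```

An article tagged only with a product term can then be routed to anyone following the implied manufacturers.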

In this case, a small, focused step into taxonomy re-classification is helping to build understanding of the broader benefit while allowing for faster delivery of more pinpointed research answers. In addition, building pipelines of connected unstructured information is consistent with organizational goals of harmonizing data for greater strategic value. Divisions in other parts of the enterprise have taken notice, and after this initial pilot there is expressed interest in leveraging the re-usability and interoperability that semantic capabilities enable.

Investment Bank Case Study: Operational Risk 

In this major investment bank, managing all the flavors of operational risk has become very balkanized. There are separate systems for process management, risk identification, controls, vendor risks, cyber risks, outsourced risks, fraud, internal incidents, external incidents, business continuity, disaster recovery, inter-affiliate risk and many more.

To address this, we created an elegant ontology that captured all these aspects of risk. We then extracted and conformed their existing information into this shared model, one system at a time.

We managed to catch the re-write of a control library in mid-stream and get them to persist the key information directly to a triple store. The mappings have been ported into production, and we built (in TARQL) the capability to create a unified view of the information systems that feed risk evaluation metrics. Additionally, an interactive graphics capability has been built directly on the triple store for visualization across the risk portfolio.

Investment Bank: Resolution Planning 

This is one of the “too big to fail” banks, which are required by regulators to implement “resolution planning” or, as it’s known on the street, a “living will.” The first few generations of the resolution plan were long on lengthy textual descriptions of the nature of the interactions between various legal entities within the bank.

Our sponsor recognized that the key to making a resolution plan workable is to make it data driven rather than document driven. Document-driven resolution plans are out of date as soon as they are written and require humans to read and interpret them. While the firm, as with most large financial services firms, consists of thousands of legal entities, there are “only” a few dozen that are significant from a resolution standpoint. However, this is made more complex because hundreds of departments may, and do, have service relationships with their peers in other countries and time zones. Often these arrangements are tacit rather than spelled out, and even those that are written fall far short of the regulators’ desire to see specific mechanisms for controlling the work and assuring it gets completed.

We based this project on the concept of Inter-affiliate Service Level Agreements. We designed an ontology of Service Level Agreements and in the course of four months iterated it through eight versions as we learned more and more about the specifics of getting a new system designed and built. 

In addition to (and in parallel with) the ontology development, we built an operational system using our model-driven development environment. We populated a triple store with data sourced from many of their existing systems (HR for personnel and departments, finance for legal entities and jurisdictions, IT for applications, hosting and data centers, and the activity taxonomy from the project we had performed the previous year). On top of this we built user interfaces that allowed managers to document the agreements that were in place between themselves and other departments in other legal entities.

We completed the project in time to demo to the regulators, and it is now being used as the basis for their go-forward Resolution Plan.

LexisNexis: Enterprise Ontology

We worked with this leading provider of legal and medical knowledge to build an enterprise ontology for their wide-ranging content. In addition to building an ontology for their case law and statutory product lines, we worked with their Master Data Management initiatives.

They have over 30 MDMs in various stages of development with logical data models. These models (and therefore the MDMs themselves) were integrated manually, in a somewhat ad-hoc fashion. We built tooling to convert their existing logical models into a single integrated ontology, where the integration points were far more obvious. From there, we built tooling to convert the ontology back to a set of similar, but now conformed, logical data models.

Management Consulting: Enterprise Search 

This major consulting firm has the enviable problem of having every possible desirable expertise characteristic somewhere within their ranks of 300,000 employees, and the unenviable problem of trying to find those needles in such a gigantic haystack.

It’s not that they are unaware of the problem. They have launched many projects over the years to address this, some of which cost hundreds of millions of dollars. The problem is easily worth this much to solve: even very small increases in chargeability or proposal win rates are worth that much on an annual basis.

We built an ontology to integrate projects and proposals around expertise and proficiency. We have harvested as much as is known about current employees in terms of skills and proficiency (and we are beginning to get subcontractors and partners), but we know that this information is not being kept up to date. We are at the early stages of two more initiatives: one that will nudge people to update their profiles when it becomes known that there is demand in a particular area, and another to combine externally available information with this primarily internally sourced graph.

The other side of this project is to replace the game of telephone that is currently the primary way to find key people in the firm. Currently, senior staff or partners rely on their network to find experts. Junior people are more often left out. In either case the process is quite haphazard, as each request gets forwarded on to another subset of the network. There is a great deal of reluctance to “spam” their internal network, but there is also the need to find the right people as rapidly as possible.

We have built an early prototype model with the vision that a chat-based service will leverage the graph network as well as keep track of the results, such that over time the requests will get smarter, smaller, and resolve faster.

Chemicals Company: Ontology Development

With a successful 200-year track record in developing extensive and diverse product lines by leveraging chemistry and science, this global innovator was embarking on a new era of discovery to shape a better world. 

Categorization of products and the relationships between those named entities were difficult to describe in traditional taxonomies and “systems of systems”. Over the course of two centuries, massive amounts of complex information had been aggregated; however, it didn’t give a full picture. The patterns of data intersection for decision-making were lost within the siloed systems. Easier access and interoperability were a primary goal, to support more predictable business insights based not only on structured data but also on unstructured documents.

A CoE (Center of Excellence) was formed to accelerate R&D, analytics and data discovery in order to advance innovation with greater speed and reliability. A secondary mission was to socialize this capability to the broader community. Ontology development and tools to harmonize information were a foundational part of this data enrichment strategy, but in-house skills and data-centric modeling expertise were insufficient. It was necessary to develop them as a core competency.

Semantic Arts consultants were engaged to bridge this critical gap in ontology and semantic capabilities. Our strategic advisory service offering brought over 25 years of practical implementation learnings and educational workshops to deliver a series of focused topics that met broad expectations of a teaching library. Recorded videos are now available on the company’s enterprise intranet to help staff traverse the complexities of information silos by enabling knowledge graphs, ontologies, and data-centric thinking.

Topics include: Introducing Semantic Technologies and Ontologies; Introduction to OWL; Introduction and Hands-on with Protégé and Property Rules; Understanding Class Relationships, Expressions, and Property Restrictions; Semantic Triple Visualizations; Ontology vs. Taxonomy; Topic Extraction and Weak Signals; and UI Visualization with Knowledge Graphs.

In parallel, Semantic Arts is engaged with a division focused more on enriching and unifying 19 different data repositories into an ontology. The goal is to model relationships between the concepts, substances, and components for describing the products in an unambiguous manner to eliminate data duplication, increase metadata clarity, and eventually incorporate ML and NLP capabilities. We’re collaborating with the client to structure a roadmap for implementation. An iterative, agile approach using our predictable (rinse and repeat) Think Big / Start Small methodology will be employed to guide and instruct in this digital evolution.

This effort to interconnect information assets for discovery complements and aligns with the broader digital transformation aimed at bringing “miracles of science” to realization.

Verizon: Privacy Data

Semantic Arts worked with PwC to improve Verizon’s data privacy capabilities. 

Prior to the engagement, as applications evolved, many of the process steps required to stay in compliance with privacy regulations and policies were performed manually using multiple data sources. Our goal was to provide a knowledge graph as a single source of privacy metadata (information about data classified as private).

Early in the engagement, we identified the key components of the data privacy landscape,  summarized in this diagram: 

The data set loaded into the knowledge graph identifies which applications and third parties are involved in each kind of privacy data processing: data collection, analysis, modification, transfer, storage, etc.

We modeled most of the privacy concerns implied by the diagram, including legal rights of a data subject, agreements with third parties, details of data processing relevant to privacy concerns, data lineage, data retention, data catalogs, impact assessments, and privacy-related processes and tasks.

We also created an extensive taxonomy that allowed key distinctions to be made while the core model remained simple and straightforward. Finally, we created generic configurable queries to perform a range of data validations.
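As a rough sketch of what a generic, configurable validation could look like, the function below reports every subject that has one property but lacks another. The rule shown (an application that transfers data to a third party should be covered by an agreement), the property names and the sample data are all hypothetical and are not taken from the engagement; plain tuples stand in for graph triples.

```python
# Hypothetical privacy metadata as (subject, predicate, object) tuples.
TRIPLES = [
    ("app:Billing", "ex:transfersDataTo", "party:PrintVendor"),
    ("app:Billing", "ex:coveredByAgreement", "agreement:001"),
    ("app:Marketing", "ex:transfersDataTo", "party:AdNetwork"),
    ("app:Payroll", "ex:storesData", "data:Salary"),
]


def find_violations(triples, trigger_property, required_property):
    """Return subjects that have `trigger_property` but lack `required_property`."""
    has_trigger = {s for s, p, _ in triples if p == trigger_property}
    has_required = {s for s, p, _ in triples if p == required_property}
    return sorted(has_trigger - has_required)


# The same function is reused for other checks by changing the two property names.
print(find_violations(TRIPLES, "ex:transfersDataTo", "ex:coveredByAgreement"))
```

Keeping the rule in the parameters rather than in the function body is what makes the check configurable: new validations need new arguments, not new code.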

The result is a consolidated view of data from multiple sources that allows greatly improved management of application compliance with privacy data policies and regulations.

International Monetary Fund

The IMF works to achieve sustainable growth for its approximately 200 member countries. It carries out missions and loans funds to execute projects for the member countries. The countries’ financial situation is measured and tracked using a wide variety of economic indicators.

The challenge: Information is stored in a wide variety of vocabulary managers, applications and databases. This makes it difficult to quickly get answers to questions required to carry out day-to-day work. For example: Who is likely to be an expert on customs? Which documents are associated with countries similar to Afghanistan? What types of missions for what countries are addressing climate change?

In each case, getting the answer requires retrieving and processing information from multiple sources. To represent the information required to answer the questions, we built an ontology that covers the core business of the IMF. The main concepts are:

  • Organizations and People 
  • Geographic regions, Countries and Country Groups 
  • Missions that produce Documents for Countries 
  • Documents about Topics authored by Persons 
  • Economic Indicators & Measurements 

We created a knowledge graph composed of the ontology and RDF triples created by converting taxonomies and datasets from a variety of data sources. We wrote SPARQL queries that traverse the knowledge graph to answer the questions of interest. This is to be the basis for an internal knowledge portal for integrating structured data (such as GDP per country) with unstructured data (such as country-specific reports on commodity prices).
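To give a feel for how such a traversal answers “Who is likely to be an expert on customs?”, here is a minimal Python sketch that follows Documents about Topics authored by Persons and ranks authors by document count. It is an assumption-laden illustration rather than the actual queries: the property names, document and person identifiers and the ranking heuristic are hypothetical, and plain tuples stand in for the RDF graph that SPARQL would query.

```python
from collections import Counter

# Hypothetical triples: documents, their topics and their authors.
TRIPLES = [
    ("doc:1", "ex:isAbout", "topic:Customs"),
    ("doc:1", "ex:authoredBy", "person:A"),
    ("doc:2", "ex:isAbout", "topic:Customs"),
    ("doc:2", "ex:authoredBy", "person:A"),
    ("doc:3", "ex:isAbout", "topic:Customs"),
    ("doc:3", "ex:authoredBy", "person:B"),
    ("doc:4", "ex:isAbout", "topic:ClimateChange"),
    ("doc:4", "ex:authoredBy", "person:B"),
]


def likely_experts(topic):
    """Rank authors by how many documents they wrote about `topic`."""
    documents = {s for s, p, o in TRIPLES if p == "ex:isAbout" and o == topic}
    counts = Counter(
        o for s, p, o in TRIPLES if p == "ex:authoredBy" and s in documents
    )
    return counts.most_common()


print(likely_experts("topic:Customs"))
```

The answer depends on two hops through the graph (topic to document, document to author), which is why it cannot be read from any single source system.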
