PCCW Global, a subsidiary of Hong Kong Telecom, is a communication service provider with global coverage. Semantic Arts worked with PCCW Global to improve visibility into assets available for customer service to optimize the value of billions of dollars of capital investment and improve the PCCW Global customer experience.
Prior to the engagement, the relevant data was stored in multiple application silos. Each application used its own set of identifiers. Creating the views needed by Sales was a time-consuming manual process.
PCCW Global translated the needs of Sales into a short list of competency questions, i.e., questions the knowledge graph would need to answer. We used the competency questions to establish an extension to the Semantic Arts gist upper ontology. With the target view established, we then worked with subject matter experts (SMEs) to identify which data elements from which applications would be required to create the knowledge graph.
Data was extracted from the sources in two ways. We used the tarql program to convert comma-separated values into semantic triples of the form subject-predicate-object. For data that was available in JSON format, PCCW Global provided a custom program to convert the data into RDF triples. Subsequently, we refined the code used on this project to create a generic program that converts JSON data to triples, guided by a configuration file that identifies how to handle special cases such as data about one class embedded in the data for a different class.
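The idea of a configuration-guided JSON converter can be illustrated with a minimal Python sketch. This is not the program built on the project: the configuration layout, the class names (Port, Device), the field names, and the ex: prefix are all hypothetical, chosen only to show how a configuration entry can flag data about one class embedded in a record of another class.

```python
import json

# Hypothetical configuration: for each class, the JSON key that holds its
# identifier and the keys whose values are embedded objects of another class.
CONFIG = {
    "Port": {"id_key": "port_id", "embedded": {"device": "Device"}},
    "Device": {"id_key": "device_id", "embedded": {}},
}


def json_to_triples(record, class_name, config, base="ex:"):
    """Convert one JSON object into (subject, predicate, object) tuples."""
    rules = config[class_name]
    subject = f"{base}{class_name}/{record[rules['id_key']]}"
    triples = [(subject, "rdf:type", f"{base}{class_name}")]
    for key, value in record.items():
        if key in rules["embedded"]:
            # Special case: the value describes an instance of another class.
            # Emit its triples separately and link the two subjects.
            child = json_to_triples(value, rules["embedded"][key], config, base)
            child_subject = child[0][0]
            triples.append((subject, f"{base}{key}", child_subject))
            triples.extend(child)
        else:
            triples.append((subject, f"{base}{key}", value))
    return triples


# Illustrative record, not project data.
record = json.loads(
    '{"port_id": "p1", "speed_gbps": 10,'
    ' "device": {"device_id": "d7", "site": "HKG"}}'
)
triples = json_to_triples(record, "Port", CONFIG)
for triple in triples:
    print(triple)
```

Keeping the special cases in a configuration structure rather than in code means a new source can, in principle, be handled by adding a configuration entry instead of writing another converter.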
To simplify the transformation of data from data silos to a single knowledge graph, we first consolidated the data from all the sources and then performed the transformation to triples. In effect, rather than a conventional ETL (Extract-Transform-Load) approach, we used ELT: Extract-Load-Transform. The data was loaded into the knowledge graph as “naïve” triples that used existing relational column names as predicates. The logic for creating a single node with a unique, permanent ID for each network object could then be executed on the consolidated data from the multiple sources.
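A minimal Python sketch of the second, transform step is shown here. It assumes, purely for illustration, that records from different sources can be matched on a shared natural key (a serial number); the source names, column names, and ex: identifiers are hypothetical, and the matching logic used on the project may have been quite different.

```python
import uuid

# "Naive" triples as loaded: source-local subjects, column names as predicates.
# All values are illustrative.
naive = [
    ("inventory:row1", "serial_no", "SN-001"),
    ("inventory:row1", "site", "HKG"),
    ("billing:42", "SERIAL", "SN-001"),
    ("billing:42", "customer", "Acme"),
    ("inventory:row2", "serial_no", "SN-002"),
]

# Assumption: these columns carry the same natural key in each source.
KEY_PREDICATES = {"serial_no", "SERIAL"}
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/network/")


def consolidate(naive_triples):
    """Rewrite source-local subjects to one permanent ID per network object."""
    # Pass 1: derive a deterministic permanent ID from each natural key.
    permanent = {}
    for subject, predicate, value in naive_triples:
        if predicate in KEY_PREDICATES:
            permanent[subject] = f"ex:asset/{uuid.uuid5(NAMESPACE, str(value))}"
    # Pass 2: rewrite subjects and normalize the key predicate. Collecting
    # into a set removes triples that several sources state identically.
    result = set()
    for subject, predicate, value in naive_triples:
        node = permanent.get(subject, subject)
        if predicate in KEY_PREDICATES:
            predicate = "ex:serialNumber"
        result.add((node, predicate, value))
    return result


consolidated = consolidate(naive)
for triple in sorted(consolidated):
    print(triple)
```

Because the matching runs after every source has been loaded, the same logic sees all records at once, which is the practical benefit of ELT described above.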
After completing the two-step data ingestion, we had a concise set of triples with no duplication of data, and we were able to answer all of the competency questions. Along the way, we identified potential applications of mathematical graph theory that could refine the view of usable spare network capacity given the existing configuration of connections between network elements and existing services.
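The case study does not say which graph-theoretic techniques were considered. As one hedged illustration, maximum flow is a standard way to ask how much spare capacity is actually usable between two points once the topology is taken into account. The Python sketch below uses the Edmonds-Karp method on a hypothetical set of directed links with made-up spare capacities; the node names and figures are not project data.

```python
from collections import defaultdict, deque


def max_flow(links, source, sink):
    """Edmonds-Karp maximum flow over (from_node, to_node, spare_capacity)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, capacity in links:
        residual[u][v] += capacity
    total = 0
    while True:
        # Breadth-first search for a shortest path with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, capacity in residual[u].items():
                if capacity > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path is left
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path, updating residuals.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical spare capacity per link, in arbitrary units.
links = [
    ("HKG", "SIN", 10),
    ("HKG", "TYO", 5),
    ("SIN", "LON", 4),
    ("SIN", "TYO", 3),
    ("TYO", "LON", 10),
]
usable = max_flow(links, "HKG", "LON")
print(usable)
```

In this toy network the links arriving at LON have 14 units of spare capacity in total, but the upstream links limit what can actually reach them, so the usable figure is lower than a simple per-link sum would suggest.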
With the analysis, design, and implementation complete, key questions about assets available to provide service to PCCW Global customers can be answered directly from the knowledge graph, and the cumbersome manual work of piecing data together from multiple sources is no longer required.