euBusinessGraph – connect the dots

Corporate information is the foundation that many data value chains are built on. It includes basic company firmographics (e.g., name(s), incorporation data, registered addresses, ownership and related entities), financials (e.g., balance sheets, ratings), and contextual data (e.g., cadastral data on corporate properties, geo data, data about directors and shareholders, public tenders data, press mentions). Furthermore, this type of information contributes to the transparency and accountability of enterprises, is an instrumental input to marketing and sales processes, and plays a key role in many business interactions.

Existing initiatives to increase the interoperability and accessibility of corporate data are mostly fragmented (across borders), limited in scope and size, and silo-ed within specific business communities, with limited accessibility from outside their originating sectors and countries. As a result, collecting and aggregating data about a business entity from several sources (public or private, official or non-official), especially across country borders and languages, is a tedious, time-consuming, error-prone, and very expensive operation that renders many potential business models infeasible.

euBusinessGraph represents a key initiative to simplify and disrupt the cross-border and cross-lingual collection, reconciliation, aggregation, provisioning, and analytics of company-related data from authoritative and non-authoritative public or private sector sources, with the aim of enabling cross-sectorial innovation. Bringing together large companies, SMEs, public organizations, and technology transfer providers, euBusinessGraph sets the foundations of a European cross-border and cross-lingual “business graph” that aggregates, links, and provisions high-quality company-related data (open and non-open), and demonstrates innovation across sectors where company-related data value chains are relevant.

Concept and approach
The basic idea behind the euBusinessGraph project is two-fold.

Firstly, the creation of a “business graph”: a highly interconnected graph of Europe-wide company-related information from both authoritative and non-authoritative sources (including data from both the public and the private sector).

Secondly, the reliable provisioning of the business graph in the form of a data marketplace, which in turn will enable the creation of a set of data-driven products and services via a set of corresponding business cases. The insights generated by these products and services will be fed back into the business graph, helping to ensure that the information collected and aggregated within the graph is sustainable and of high value.

The business cases proposed in euBusinessGraph are innovative 1) in regard to cross-sector and cross-lingual aspects of data value chains and 2) with respect to how they can build upon and benefit from company-related data made available via a data marketplace. The usage of Linked Data as a foundation for this will also imply a technology transfer of Linked Data technologies from the technology partners to the business case partners.

Although the business cases are rather different (in that they aim to deliver value in different sectors), they all require heterogeneous data sources to support them and share the common demand for a set of services and tools that are not currently available in an integrated and simple manner.

These business cases will utilise company-related data from different sectors and will be enabled by a data provisioning infrastructure that ensures the data is easily discoverable, integrated, and provisioned from a single point of entry. In addition, the business cases will benefit from a common library of resources, an open API, and semantic enrichment, which open potentially far-reaching exploitation opportunities.

Overview of the key components
An overview of the key components of the project is provided in the figure below, depicting the connection between data consumers and providers of the business graph data, through the enhanced environment developed in euBusinessGraph. In the process of transforming, integrating, publishing, and reusing data, two types of data value chains are created as depicted in the picture, and discussed below.

Figure 1: The euBusinessGraph concept

The Data Sources component (the left part of the figure) is represented by the various datasets and data sources that will be integrated and used in populating the business graph.

• National Business Registries: Data from authoritative business registries providing official information about companies at the national level (e.g., identifiers of companies, addresses, industry codes, etc.)

• Gazettes: Public records of company-related legal notices.

• Other registries: Data from registries created in various sectors, such as credit bureau registries, charities, licenses, stock markets, and patent registries.

• OpenCorporates: Open data about over 92 million companies, primarily collected from public sources.

• Newspapers/broadcasters: Unstructured company-related data from media sources in various languages.

• Social media and the Web: Unstructured or semi-structured company-related data from companies’ websites and social media.

• Public Administration Tenders: Data about companies participating in public tenders.

The Business Graph component (the second from the left in the figure) is a core element of euBusinessGraph, which addresses two main aspects:

1. Design of the business graph, including the creation and maintenance of a system of identifiers for company-related data, the development of shared data models, and multi/cross-lingual aspects;

2. Reliable provisioning of the business graph, including support for data transformations, onboarding, hosting, access, analytics, data marketplace, and operations.
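The role of a shared identifier system can be illustrated with a minimal Python sketch. It is not the project's actual data model: the class names, the fields, and the `EBG-0001` identifier format are all hypothetical, chosen only to show how records from different sources could be linked under one identifier and then looked up from any source-specific ID.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SourceRecord:
    """One company record as published by a single data source (hypothetical shape)."""
    source: str    # illustrative label, e.g. "national_registry" or "gazette"
    local_id: str  # identifier used by that source
    name: str
    country: str   # ISO 3166-1 alpha-2 country code


@dataclass
class BusinessGraph:
    """Toy graph in which a shared identifier links records from many sources."""
    # shared identifier -> records attached to that company
    companies: dict = field(default_factory=dict)
    # (source, local_id) -> shared identifier
    index: dict = field(default_factory=dict)

    def onboard(self, shared_id: str, record: SourceRecord) -> None:
        """Attach a source record to a company and index its local identifier."""
        self.companies.setdefault(shared_id, []).append(record)
        self.index[(record.source, record.local_id)] = shared_id

    def resolve(self, source: str, local_id: str) -> Optional[str]:
        """Map a source-specific identifier to the shared one, if known."""
        return self.index.get((source, local_id))

    def sources_for(self, shared_id: str) -> set:
        """Return the set of sources that contributed data about a company."""
        return {r.source for r in self.companies.get(shared_id, [])}
```

In this sketch, onboarding a registry record and a gazette record under the same shared identifier lets `resolve("gazette", ...)` return that identifier, which is the kind of cross-source lookup a common identifier system is meant to support.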

The Business products and services component (the second from the right in the figure) will access and use the data made available in the business graph for creation of data-driven products and services.

The Customer segments component (the first from the right in the figure) is represented by the customer segments of the products and services that are created at the business products and services layer. The focus here is on the value created for the end users of the business products and services. The core customer segments targeted by the euBusinessGraph products and services are depicted in the figure.

The Data Value Chains enabled by euBusinessGraph are twofold:

• Data value chain #1 – from silo-ed data sources to customer segments (at the bottom of the figure): the value chain spans from the silo-ed data sources to various customer segments. The value is created through the set of capabilities provided by the business graph provisioning infrastructure, which are in turn leveraged to establish a set of innovative business products and services.

• Data value chain #2 – data value feedback chain (at the top of the figure): consists of the data insights generated by the proposed products and services being fed back into the business graph, thereby enhancing the value and scope of the data in the business graph. This is euBusinessGraph’s mechanism to ensure that the business graph can host highly valuable and high-quality information, while at the same time increasing the chances of a self-sustainable business graph.
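The two value chains can be sketched as a small Python example. Everything in it is an assumption made for illustration: the fact tuples, the `tender_activity_service` product, and the provenance labels are hypothetical and not part of the project's published design. The sketch only shows the direction of data flow: sources feed the graph, a product reads from it, and the product's insight is written back with its provenance.

```python
from collections import defaultdict

# shared company identifier -> list of (attribute, value, provenance) facts
graph = defaultdict(list)


def provision(company_id, attribute, value, provenance):
    """Value chain #1, first step: onboard a fact from a data source."""
    graph[company_id].append((attribute, value, provenance))


def tender_activity_service(company_id):
    """Hypothetical product: count the tenders a company took part in."""
    tenders = [value for attr, value, _ in graph[company_id] if attr == "tender"]
    return {"attribute": "tender_count", "value": len(tenders)}


def feed_back(company_id, insight, product_name):
    """Value chain #2: write a product's insight back into the graph."""
    graph[company_id].append(
        (insight["attribute"], insight["value"], f"product:{product_name}")
    )
```

Recording the provenance of fed-back insights separately from source data is a design choice of this sketch; it keeps derived facts distinguishable from facts that come from authoritative sources.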