Google Knowledge Graph Meta Cartridge

Overview

Given the name of an entity, the GKG meta cartridge (MC) uses the Google Knowledge Graph Search API to search the Google Knowledge Graph for matching entities. Matching entities' descriptions are returned as Linked Data.

The GKG MC relies on the committed output of 'content' MCs which precede it in the Sponger's meta cartridge queue. The content MCs perform NER on the input content, identifying people, places, organizations etc, all expressed as virtrdfmec: 'meta entity' classes (see graph <http://www.openlinksw.com/schemas/virtrdf-meta-entity-class>), and which also act as inference base classes for collecting similar classes emitted by extractor cartridges (e.g. foaf:Person, dbpedia-owl:Person).

The GKG MC gathers named entities identified by NER/NLP content MCs, and performs entity linking against the GKG, using the label of the virtrdfmec entity for the GKG Search API query string.

The GKG meta cartridge is a 'preprocess' MC. The output of the preprocessing MCs is committed before the remaining MCs in the meta cartridge queue are invoked. The latter may make use of the output emitted by the 'content' or 'preprocess' MCs.

API Key

The cartridge requires a Google API key, as describe in Knowledge Graph Search API - Prerequisites. The Knowledge Graph Search API must be enabled in the Google Developers Console and the API key created through the API Manager's Credentials page.

Enter the Google API key in the 'API Key' field of the cartridge's options, displayed by selecting the cartridge from the list under Conductor's Meta Cartridges tab (reachable by navigating via the Conductor UI's links 'Linked Data' > 'Sponger' > 'Meta Cartridges').

Cartridge options

max-entities:

Limits the number of entities returned by the GKG search API. The default value of 1 returns the closest GKG match.

min-score:

Only return matching GKG entities with a resultScore greater than or equal to min-score.

match-entity-type:

Only return entities with a type of http://schema.org/{match-entity-type} e.g. Person, Organization, Country etc. A match-entity-type of '*' returns all matching entities, irrespective of their type.

Output

GKG entities have a corresponding Wikipedia entry. These are exposed directly from the cartridge's topic entity through schema:mentions.

GKG entities returned by the cartridge are exposed via skos:related. These annotations are described in terms of the Fusepool Annotation Model (FAM). Each annotation has only a body, not a target. A target cannot be provided as this is not an NER meta cartridge. It relies on preceding NER/NLP meta cartridges, higher in the meta cartridge chain, to identify named entities from the source content. The locations of the entity mentions in the source content are unknown to this meta cartridge, so an annotation target isn't created. c.f. the Babelfy meta cartridge, which is an NER meta cartridge and which creates an annotation target.

GKG entities are assigned two types, a schema.org type and a mirroring virtrdfmec type. Mirroring schema.org types as virtrdfmec types allows meta cartridges further down the meta cartridge chain to consume the identified named entities. i.e. to ensure they are included in searches of the data source graph by the 'URL' or 'keyword' MCs, in the next stage of the meta cartridge queue. These URL or keyword meta cartridges search the source graph for specific virtrdfmec types on which to perform further processing. e.g. The LinkedIn meta cartridge searches for virtrdfmec:Person, virtrdfmec:Company and virtrdfmec:Organization instances for lookups. The OpenCorporates meta cartridge searches for virtrdfmec:Organization instances.