Virtuoso Sponger



Extracting RDF Structured Data from Non-RDF Sources


Growing the Semantic Web

Virtuoso Sponger

Inputs: Supported Data Sources

Output: Structured Data


In the context of the Semantic Data Web:

"Data organized into semantic chunks or entities, with similar entities grouped together into relations or classes"

  Michael Bergman (http://www.mkbergman.com)
  Article: "More Structure, More Terminology and (hopefully) More Clarity"

Sponger Benefits

Sponger Inputs & Outputs


Sponger Architecture

Using The Sponger

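The Sponger can be invoked in several ways: via the SPARQL Query Processor, the RDF Proxy Service, the OpenLink RDF client applications, ODS-Briefcase (Virtuoso WebDAV), or directly from Virtuoso PL.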

Using the Sponger:
SPARQL Query Processor

SPARQL Extensions:
IRI Dereferencing of FROM Clauses

Enabled through 'define get:...' pragmas

DEFINE get:method "GET"
DEFINE get:soft "soft"
SELECT ?id
FROM NAMED <http://myhost/user1.ttl>
FROM NAMED <http://myhost/user2.ttl>
WHERE { GRAPH ?g { ?id a ?o } };
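A client that issues such queries has to place the 'define' pragmas ahead of the query text. A minimal sketch of that step, assuming a hypothetical helper named with_pragmas (not part of Virtuoso or of any client library) that writes string values quoted and integer values bare:

```python
from typing import Mapping, Union


def with_pragmas(query: str, pragmas: Mapping[str, Union[str, int]]) -> str:
    """Prefix a SPARQL query with Virtuoso-style DEFINE pragmas.

    Hypothetical helper for illustration only.
    """
    lines = []
    for name, value in pragmas.items():
        if isinstance(value, str):
            # This sketch does no escaping, so reject embedded quotes.
            if '"' in value:
                raise ValueError(f"unsupported quote in pragma value: {value!r}")
            rendered = f'"{value}"'
        else:
            rendered = str(value)
        lines.append(f"DEFINE {name} {rendered}")
    return "\n".join(lines + [query.strip()])


QUERY = """
SELECT ?id
FROM NAMED <http://myhost/user1.ttl>
FROM NAMED <http://myhost/user2.ttl>
WHERE { GRAPH ?g { ?id a ?o } }
"""

text = with_pragmas(QUERY, {"get:method": "GET", "get:soft": "soft"})
print(text)
```

The resulting string can be passed to whatever SPARQL client is in use; the pragma names and values are the ones shown on the slide.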

SPARQL Extensions:
IRI Dereferencing of Variables

Enabled through 'define input:grab-...' pragmas

DEFINE input:grab-var "?more"
DEFINE input:grab-depth 10
DEFINE input:grab-limit 100
DEFINE input:grab-base "http://myhost/"
SELECT ?id ?fullname ?email
WHERE { GRAPH ?g {
?id a <Person> ; <FullName> ?fullname ; <Email> ?email .
OPTIONAL { ?id <SeeAlso> ?more }
} } ;
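The effect of the grab pragmas can be pictured as a bounded crawl over the values bound to ?more. The toy model below is only an interpretation, not the Sponger's actual algorithm: it assumes grab-depth bounds the number of follow-the-link rounds and grab-limit caps the total number of documents fetched. The LINKS table and the grab function are hypothetical stand-ins for real HTTP retrieval.

```python
from typing import Dict, List, Set

# Hypothetical in-memory "web": document IRI -> IRIs it points to via <SeeAlso>.
A = "http://myhost/a.ttl"
B = "http://myhost/b.ttl"
C = "http://myhost/c.ttl"
D = "http://myhost/d.ttl"
LINKS: Dict[str, List[str]] = {A: [B, C], B: [D], C: [A]}


def grab(seeds: List[str], links: Dict[str, List[str]],
         depth: int, limit: int) -> List[str]:
    """Return the documents a bounded crawl would fetch, in fetch order."""
    fetched: List[str] = []
    seen: Set[str] = set()
    frontier = list(seeds)
    for _ in range(depth):              # at most `depth` rounds
        next_frontier: List[str] = []
        for iri in frontier:
            if iri in seen:
                continue                # never fetch a document twice
            if len(fetched) >= limit:   # overall cap on fetched documents
                return fetched
            seen.add(iri)
            fetched.append(iri)
            next_frontier.extend(links.get(iri, []))
        if not next_frontier:
            break                       # nothing new to follow
        frontier = next_frontier
    return fetched


print(grab([A], LINKS, depth=10, limit=100))
```

Lowering either bound shortens the crawl, which is why both pragmas are worth setting explicitly when the linked data is large or cyclic.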

Using the Sponger:
RDF Proxy Service

Parameters:

Using the Sponger:
OpenLink RDF Client Applications

Bundled as part of OpenLink AJAX Toolkit (OAT)

RDF Browser

iSPARQL - Interactive SPARQL query builder

Using the Sponger:
ODS-Briefcase (Virtuoso WebDAV)

Briefcase = A component of OpenLink Data Spaces

Includes a high-level interface to the Virtuoso WebDAV repository

SIOC as a Data Space "Glue" Ontology

Using the Sponger:
Directly via Virtuoso PL


Sponger Cartridges


Sponger Architecture

Sponger Cartridge Invocation


Sponger Configuration using Conductor UI

Sponger Configuration using Conductor UI: RDF Cartridges Pane


Sponger Configuration using Conductor UI: GRDDL Filters


Sponger Configuration using Conductor UI: XSLT Templates


Sponger Configuration using Conductor UI: Schema Files / Supported Ontologies


Custom Cartridges

Cartridge Hook - Virtuoso PL Prototype

in graph_iri varchar: IRI of graph being retrieved

in new_origin_uri varchar: URI of the document being retrieved

in destination varchar: destination graph IRI

inout content any: the document content

inout async_queue any: preallocated asynchronous queue used to call the configured ping service

inout ping_service any: URL of the ping service, as assigned to the PingService parameter in the [SPARQL] section of the virtuoso.ini file. This argument can be used to notify the PingTheSemanticWeb RDF document repository and notification service.

inout api_key any: unique string providing cartridge specific data taken from the RC_KEY column of the DB.DBA.SYS_RDF_CARTRIDGES table

Flickr Cartridge Extracts

create procedure DB.DBA.RDF_LOAD_FLICKR_IMG (
  in graph_iri varchar, in new_origin_uri varchar, in dest varchar,
  inout _ret_body any, inout aq any, inout ps any, inout _key any)
{
  declare xd, xt, url, tmp, api_key, img_id, hdr, exif any;
  ...
  -- Ask the Flickr REST API for the photo's metadata
  url := sprintf ('http://api.flickr.com/services/rest/?method=flickr.photos.getInfo&photo_id=%s&api_key=%s', img_id, api_key);
  tmp := http_get (url, hdr);
  ...
  -- Parse the XML response into a document tree
  xd := xtree_doc (tmp);
  ...
  -- Transform the Flickr XML into RDF/XML with the cartridge's stylesheet
  xt := xslt (registry_get ('_rdf_cartridges_path_') || 'xslt/flickr2rdf.xsl', xd,
    vector ('baseUri', coalesce (dest, graph_iri), 'exif', exif));
  xd := serialize_to_UTF8_xml (xt);
  -- Load the RDF/XML into the destination graph, or graph_iri if dest is null
  DB.DBA.RDF_LOAD_RDFXML (xd, new_origin_uri, coalesce (dest, graph_iri));
  return 1;
}

Custom Resolvers

http://demo.openlinksw.com/sparql?default-graph-uri=urn:lsid:ubio.org:namebank:11815&should-sponge=soft&query=SELECT+*+WHERE+{?s+?p+?o}&format=text/html

Proxy service also recognizes URNs

http://demo.openlinksw.com/proxy?url=urn:lsid:ubio.org:namebank:11815&force=rdf
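Both request URLs are easier to get right when built programmatically, since the URN and the query text need percent-encoding. A minimal sketch using only the Python standard library: the host and the parameter names (default-graph-uri, should-sponge, query, format, url, force) are taken from the two URLs above, while the function names sparql_sponge_url and proxy_url are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://demo.openlinksw.com"
LSID = "urn:lsid:ubio.org:namebank:11815"


def sparql_sponge_url(graph_iri: str, query: str,
                      sponge: str = "soft", fmt: str = "text/html") -> str:
    """Build a /sparql request that asks the endpoint to sponge graph_iri."""
    params = [
        ("default-graph-uri", graph_iri),
        ("should-sponge", sponge),
        ("query", query),
        ("format", fmt),
    ]
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return f"{BASE}/sparql?{urlencode(params)}"


def proxy_url(resource: str, force: str = "rdf") -> str:
    """Build a /proxy request for a URL or URN."""
    return f"{BASE}/proxy?{urlencode([('url', resource), ('force', force)])}"


sparql_request = sparql_sponge_url(LSID, "SELECT * WHERE {?s ?p ?o}")
proxy_request = proxy_url(LSID)
print(sparql_request)
print(proxy_request)

# Decoding the query string recovers the original, unencoded values.
print(parse_qs(urlsplit(proxy_request).query))
```

The encoded form differs textually from the hand-written URLs (for example, ':' becomes '%3A'), but it decodes to the same parameter values.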