Virtuoso Open-Source Wiki
Virtuoso Open-Source, OpenLink Data Spaces, and OpenLink Ajax Toolkit
Advanced Search
Help?
Location: / Dashboard / Main / VOSIndex / VOSXML

Virtuoso XML Functionality

Virtuoso can store, index, and retrieve XML data. Virtuoso has XPATH, XSLT, and XQuery all built in, making it second to none among all XML-enabled relational databases. In addition, Virtuoso can provide existing relational data as XML on demand.

XML Data Type

Virtuoso has XML as a first-class data-type. An XML document or fragment is represented as a memory-based tree that can be constructed, copied, modified, and combined with other such trees by an extensive set of functions.

The Virtuoso XML data type has an Oracle-compatible set of methods for XPATH, XSLT, value substitution, parsing, and serialization. Additionally, it has a proprietary set of SQL extensions for searching and iterating over XPATH node sets retrieved from XML columns.

Instances of the XML data type can be stored into long or short text columns or into a dedicated XML LOB column type. The dedicated storage format guarantees well-formedness and optionally XML schema validity, and offers substantial space-savings compared to storing XML as text serialization.

SQLX

SQLX is the SQL standard extension for producing XML data from relational queries. It has functions and aggregates for constructing XML fragments from rows of SELECT result sets, and for aggregating multiple result rows into a single sequence of elements. For example, the following query would produce an Employees XML element, with an Employee child element for each employee --

SELECT XMLElement ("Employees", XMLAgg (XMLElement ("name", CONCAT (LastName, ', ', FirstName)))) FROM Employees;

Each Employee element would have the employee's concatenated last name and first name as a text value.

XPATH

XPATH is the standard way of extracting data from XML trees. Essentially all XML standards have XPATH as an embedded part. XPATH in Virtuoso is offered as a SQL function and two special SQL predicates.

To produce a result set consisting of all job elements from the resumes of all employees:

SELECT job FROM Employees WHERE xcontains (Resume, '//job', job);

XSLT

Virtuoso has a built-in XSLT processor. XSLT is the preferred means of transforming XML data into other XML or non-XML data. This is needed in most XML applications when generating HTML for human consumption, when translating between different XML data storage or transfer formats, when interpreting XML data in internet messages or configuration files.

XQuery

Virtuoso has a built-in XQuery interpreter. This can perform the same functions as XSLT and additionally can query SQL-to-XML mapping schemas. The XQuery interpreter can be invoked from SQL for processing XML data in a language best adapted for this. However, XQuery is not used for efficient filtering of stored XML data, XPATH, a subset of XQuery, must be used for this with the xcontains predicate.

XML Mapping Schemas

Virtuoso offers a mechanism for declaring that a set of JOINs between relational tables represents an XML tree. This tree is not stored as an XML tree, but is rather generated on demand by performing the declared joins and any extra filtering that may be needed.

An XPATH query may be applied to an XML mapping schema. This constructs a dynamic SQL statement that will only access the rows needed for the result, and then construct an XML fragment corresponding to the subtree selected by the XPATH expression.

XML Schema Validation and XML Parser

Virtuoso's XML parser can perform both DTD and XML Schema validation. Special parameters allow setting the level of error recovery, so that invalid XML can still be parsed. The parser also has a "dirty HTML" mode, in which non-well-formed HTML content will be corrected on the fly, compensating for most cases of unbalanced tags and the like. This is especially useful for many web-scraping applications and processing of user supplied HTML content, as in blog, news, and other popular web applications.

Powered By Virtuoso