Virtuoso is the offspring of a genealogy tree comprising Universal Data Access Middleware from OpenLink Software and Kubl RDBMS (Relational Database Management System) Engine by Orri Erling.
OpenLink Universal Data Access Middleware
In 1992 Kingsley Idehen departed Unisys (in the United Kingdom) to set up a consulting firm called PAL Consulting Ltd. (which stood for Products, Applications, and Languages consulting). His initial aim was for the firm to specialize in data access middleware and associated integration services; a natural progression, since he possessed significant knowledge of all the major RDBMS engines of the time and of every first-generation data access middleware product. Unfortunately, the competitive realities of partnering with some early middleware vendors forced him to change course and defend his vision and passion for data access middleware. The net effect of this strategic change in direction was the creation of OpenLink Software (by renaming PAL Consulting after its emerging collection of ODBC Drivers).
In the early 1990s, OpenLink Software earned a stellar reputation in the data access middleware market by delivering secure and high-performance ODBC Drivers for a range of industry leading databases. These drivers were popularly known as the "OpenLink High-Performance" ODBC Drivers, and were the only drivers at the time capable of dispelling the "ODBC means Poor Performance" myth. The company is also responsible for the very first port of ODBC to Linux and other UNIX variants (called UDBC). It also developed the very first ODBC Driver for Postgres, and has pursued the purity of platform independent data access from its very inception. It combined its initial UDBC effort with a development effort by Ke Jin to create what is known today as iODBC, the earliest known Open Source project associated with DBMS vendor independent data access.
By 1997 the company's initial portfolio of platform independent ODBC specific data access middleware had grown to include drivers for JDBC, and OLE-DB; all available in a variety of architectural formats (Single-Tier and Multi-Tier).
The net effect of the VIA/DRE experience, combined with the completion of the Distributed Computing Platform, spurred Orri toward a new goal. He embarked on a scientific pursuit, pushing the boundaries of SQL, aiming to achieve the lofty goal of the fastest OLTP-oriented DBMS engine that could possibly be written. This effort bore fruit in Kubl.
While working on Kubl, Orri was also engaged with Infosto Group, Finnish publisher of Keltainen Porssi, a "Consumer to Consumer" (C2C) national marketplace journal. He was tasked with technical oversight of Infosto's participation in the United Nations TradePoint initiative. TradePoint's aim was to help network small businesses worldwide via an online marketplace on the then-nascent Web. This effort became the first full-blown application of the Kubl DBMS Engine, which by then was a client-server RDBMS core with basic SQL, transactions, and a stored procedure language. This project was soon followed by a web version of Keltainen Porssi, which quickly gained 500,000 registered users out of a population of only 5 million. As Finland in the mid-1990s still had relatively little household Internet use, the market penetration of this effort was remarkable.
In early 1998 Kingsley Idehen, Founder & CEO of OpenLink Software, felt the time was right to implement the next stage of OpenLink's long-standing vision to provide complete separation of Application Logic, Data Access, and Data Storage via a "Virtual Database Engine". The company had already successfully developed and deployed high-performance data access drivers for ODBC, OLEDB, and JDBC that served traditional "Client-Server" applications, but it now had its sights on addressing the same problems in the new architectural context presented by the burgeoning World Wide Web. The main issue it had to address was the traditional "Build vs. Buy" approach to building the Virtual Database Engine that it sought. In attempting to resolve this dilemma, Kingsley decided to scour the internet for an ODBC CLI compliant, Linux based SQL RDBMS Engine that could potentially be morphed into a combined Middleware and DBMS product.
The extensive search and evaluation process led Kingsley to Kubl, and ultimately to Orri. Kingsley pitched the OpenLink Virtual Database vision to Orri, and within 4 weeks of their meeting OpenLink Software acquired Kubl and the Virtuoso project was born.
As a vendor of high-performance ODBC Drivers covering all major backend DBMS engines, OpenLink Software was uniquely positioned to build a Virtual DBMS Engine that truly abstracted data access across heterogeneous databases without the expected challenges of obtaining functional drivers, or the political hang-ups of a traditional DBMS vendor. In addition, the compatibility of technical prowess, vision and passion made the assimilation of Virtuoso much easier than would be expected in conventional company and personnel mergers. Thus, in the fall of 1998 the first release of the Virtuoso VDBMS Engine was unveiled. You now had a single high-performance ODBC, JDBC, or OLEDB based access point for transparently connecting compliant Desktop Productivity Tools and Application Development Environments (Traditional and Next Generation Web Oriented) to any ODBC- or JDBC-accessible backend database.
Iobox (one of the earliest hosted internet services) was one of the web applications/services built using Virtuoso. It served up to half a million registered users from a single 300 MHz Sun SPARC server with 32 GB of disk and catered for about 3000 concurrent sessions. With the addition of more application logic in PHP, the data center became larger and the end user count reached 1 million. Telefonica eventually acquired Iobox for about $300M in 2001. OpenLink was a contracted developer and software vendor for the Iobox project.
In early 1999, XML started to emerge as the preferred candidate for standardizing data representation for data access, protocol definition, and data modeling. OpenLink responded to the impending XML market inflection by adding XML support to Virtuoso. This included the following database engine hosted capabilities: Validating XML Parser (supporting XML Schema), XSLT processor, SQL-XML transformation, XQuery, XPath, and a native XML datatype (long before this functionality became a roadmap and implementation item for Oracle, MS SQL Server, DB2, and other major DBMS engines). Initial SQL-XML transformation functionality took the form of an XML-to-SQL Schema Mapping system called XML Views that also preceded Microsoft's schema mapping system. It turned relations into XML and XPath into SQL, allowing querying of relational model data (tuples) using hierarchical data model navigation.
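The idea of turning XPath into SQL can be illustrated with a minimal sketch. It is not Virtuoso's XML Views implementation: the `customers` table, its columns, and the `xpath_to_sql` helper are hypothetical, SQLite stands in for the relational engine, and only one fixed path shape (`/table/row[column='value']/column`) is handled.

```python
import re
import sqlite3

# Hypothetical, deliberately narrow path shape: /table/row[col='value']/col
PATH = re.compile(r"^/(\w+)/(\w+)\[(\w+)='([^']*)'\]/(\w+)$")

def xpath_to_sql(path):
    """Rewrite one simple path expression into a parameterized SQL query."""
    match = PATH.match(path)
    if match is None:
        raise ValueError(f"unsupported path: {path}")
    table, _row_tag, filter_col, value, out_col = match.groups()
    # Identifiers are restricted to \w+ by the pattern; the value is bound.
    return f"SELECT {out_col} FROM {table} WHERE {filter_col} = ?", (value,)

# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Orri", "Helsinki")],
)

sql, params = xpath_to_sql("/customers/customer[city='Helsinki']/name")
rows = conn.execute(sql, params).fetchall()
```

The hierarchical step `customers/customer` maps to a table scan, the predicate in brackets maps to a `WHERE` clause, and the final step maps to the selected column. A real mapping system would also need joins for nested elements, which this sketch leaves out.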
Virtuoso also supported shredded storage of XML for a short while but this was later dropped in favor of document-style XML storage and XML specific Free Text indexing. OpenLink never fully subscribed to the idea of XML being a storage format upon which transactional applications would be built. It believed, as time has demonstrated, that XML would be the format of choice for data exposure and exchange. Thus, its XML technology exploits remain centered on the position that XML (via "Universal Views") is the preferred solution for unshackling "Logical Data Representation" from the confines of database model and engine specificity.
By exposing native and third party SQL Data as XML based views hosted by the Virtuoso Database engine, it was clear to OpenLink that it had to surmount the metaphor level impedance challenge associated with exposing DBMS engine resident XML views to XML developers and users that sought to interact with physical documents that were URL accessible. In response to this reality, Virtuoso's SQL-XML functionality was extended to include generation of HTTP accessible XML documents "on the fly". These XML documents (Valid or Well-Formed) are conditionally sensitive to changes in the source databases. Naturally, this feature raised some security concerns that ultimately led to the addition of WebDAV and Virtuoso specific ACL security to Virtuoso's in-built Web Server.
In addition to its sophisticated XML functionality, Virtuoso also possesses its own dynamic web page generation language called VSP (Virtuoso Server Pages). The language enables the development of dynamic web pages using Virtuoso's SQL procedure language interspersed in HTML (as is the case with PHP, ASP, JSP, and others). It also includes a declarative XML based dynamic web page generation language called VSPX, which is conceptually similar to ASP.NET. A key advantage that VSP/VSPX pages bring to bear is the fact that SQL data access occurs in-process, as long as the tables being accessed are stored in Virtuoso.
By 2000-2001, SOAP (Simple Object Access Protocol) started to gain prominence as the foundation protocol of what is known today as the WS-* protocol stack. This protocol stack catalyzed the nascent Web Services paradigm (including Service Oriented Architecture and ESB derivative paradigms). As Virtuoso already possessed DBMS hosted (in-built) HTTP/WebDAV support (its web server functionality), and a very powerful SQL Stored Procedures language, it was a very natural progression for the product to expose functionality that would simplify comprehension and exploitation of the burgeoning Service Oriented Architecture (SOA) paradigm. The following features resulted:
- Execution of native or third party SQL Stored Procedures over HTTP
- Execution of native or third party SQL Stored Procedures via SOAP (by publishing SQL Stored Procedures as WSDL and SOAP compliant Web Services)
- Proxy Generation for third party Web Services
- Exposure of saved SQL, SQL-XML, XQuery, and XPath queries as Web Services
Orri's Lisp and AI background provided Virtuoso with a rich programming language endowed with runtime data-typing and self-hosted compilation, making the incorporation of SOAP within an already powerful technology cocktail a triviality. In addition, the Database / Data-Access Middleware essence of Virtuoso synergistically extended the SOAP functionality to include Stored Procedures hosted in any third party ODBC accessible DBMS engine. Over time Virtuoso has continued to implement relevant portions of the WS-* protocol stack as and when they are published. It had one of the very first implementations of WS-Security and, to this day, has the only DBMS hosted implementation of the Business Process Execution Language for Web Services (BPEL, formerly BPEL4WS).
By the summer of 2001, Virtuoso had undergone a major engine rewrite that reinvigorated a previously dormant aspect of Virtuoso, namely its Object-Relational DBMS functionality. Outputs from this major development effort include:
- DBMS hosting of the Java, Microsoft .NET CLR, and Mono runtime environments
- User Defined Type (UDT) support with implementation in SQL, Java, or .NET
- Use of UDTs for abstracting web services, generating a UDT from a WSDL
- SQL Stored Procedures extensibility via code associated with hosted runtimes
- Dynamic Language & Web page hosting for PHP, ASP.NET, JSP, Python, Perl, and Ruby (a recent addition)
- Procedure Views (Table Valued Functions in SQL Server and Table Functions in Oracle)
- Improved Cost based Distributed Query Optimizer for handling heterogeneous SQL joins
- Bi-Directional Transaction Replication
- XA-based 2-Phase Commit for Distributed Transactions
The combination of SQL-ORDBMS, HTTP/WebDAV, Web Services Platform for SOA, and other functionality realms covered by Virtuoso led to a fundamental incompatibility between the product moniker "Virtual Database" and the actual product feature set. Re-branding Virtuoso as a “Universal Server”, a single server product offering that implements a plethora of industry-standard protocols, ultimately alleviated this product-branding challenge.
From a traditional marketing and positioning point of view, Virtuoso in its Universal Server form is a complete renegade by virtue of the fact that it straddles a number of functionality realms concurrently. At the same time, a closer look at the product architecture and its evolution unveils a deep understanding and anticipation of the platform requirements that will ultimately define next generation Intranet/Internet/Extranet (SOA, Web 2.0, Semantic Web / Data Web, and beyond) solutions. This is all the more so as these solutions will demand loose coupling of application logic and separation of data access and data storage, with an inherent need for collaboration driven integration that comes naturally to Virtuoso.
Virtuoso and the DataSpace Frontier
It is no secret that Virtuoso is a product way ahead of its time. Finding the newest member of its product portfolio in the enigmatic status of "Best Kept Secret", OpenLink decided to address the lull in product comprehension and uptake, and to further test and enhance Virtuoso, by embarking upon a number of internal application development projects. The goal of this effort was two-pronged: build next generation applications that showcased the inherent prowess of Virtuoso, and "dogfood" the resulting applications (which would benefit the company anyhow) prior to release. The net effect of this decision was a gradual change in engineering focus, away from the pursuit of high-end database functionality aimed solely at winning TPC benchmarks (such as transparent shared-nothing clustering). Instead, the focus turned to building a suite of collaboration oriented solutions aimed at showcasing not only the prowess of Virtuoso, but the potential of emerging paradigms such as Web 2.0 and the "Semantic / Data Web". The applications that emerged from this application development effort include:
- Web Application Framework
- Weblog Publishing Platform
- RSS/Atom/RDF Feed Aggregator
- Photo Sharing system
- Discussion Server
- Wiki Engine
- BPEL Process Manager (application layer above the in-built BPEL core)
- Unified Storage (that includes automatic metadata extraction and resource classification using RDFS and OWL)
- Social Networking Framework
As was the case when Virtuoso transitioned from a Virtual Database engine to a Universal Server that included Virtual Database functionality, the suite of Virtuoso applications posed a number of marketing related product branding challenges. Ultimately it was decided that the Universal Server would be packaged as a platform solution (workbench of sorts) while the applications would be packaged as discrete parts of a collective DataSpace (since all of the data and application logic that expressed the application behaviors resided in a Virtuoso Database).
Having completed a slew of unreleased Virtuoso applications (as enumerated above), it was then decided in late 2005 that the time was right to revisit and complete a deferred development effort aimed at the "Semantic Web". The emergence of a definitive query language for RDF Data Management systems called SPARQL, accompanied by a SOAP based protocol also called SPARQL, provided OpenLink with justification for resuming its long stalled efforts on the RDF data storage front. Thus, from late 2005 to early 2006, RDF was a major development focus of the core Virtuoso engineering team. The end result of this effort is a full-blown SPARQL to SQL translator (or rewriter) that successfully completes a majority (90 percentile) of tests in the DAWG (RDF Data Access Working Group) SPARQL test suite. The challenges posed by the Semantic Web vision complement the essence of Virtuoso's prowess and OpenLink's engineering skill set. After all, an RDF Triple store is fundamentally a specialized database challenge that draws on deep database engineering and query optimization know-how.
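To give a rough sense of what a SPARQL to SQL rewriter does, here is a minimal sketch. It is not Virtuoso's translator: it assumes a hypothetical single `triples(s, p, o)` table in SQLite, handles only a list of basic triple patterns (no parsing of SPARQL syntax, no OPTIONAL, FILTER, or UNION), and the `patterns_to_sql` helper name is made up for illustration.

```python
import sqlite3

def patterns_to_sql(patterns):
    """Translate (s, p, o) triple patterns into one SQL self-join.

    Terms starting with '?' are variables; anything else is a constant.
    Assumes at least one variable appears in the patterns.
    """
    bound = {}               # variable -> first column reference that binds it
    where, params = [], []
    for i, triple in enumerate(patterns):
        for col, term in zip(("s", "p", "o"), triple):
            ref = f"t{i}.{col}"
            if not term.startswith("?"):
                where.append(f"{ref} = ?")           # constant: bound parameter
                params.append(term)
            elif term in bound:
                where.append(f"{ref} = {bound[term]}")  # repeated variable: join
            else:
                bound[term] = ref                     # first use: output column
    cols = ", ".join(f"{ref} AS {var[1:]}" for var, ref in bound.items())
    tables = ", ".join(f"triples t{i}" for i in range(len(patterns)))
    sql = f"SELECT {cols} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params

# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("alice", "knows", "bob"),
    ("bob", "name", "Bob"),
    ("alice", "name", "Alice"),
])

# Roughly: SELECT * WHERE { ?person knows ?friend . ?friend name ?name }
sql, params = patterns_to_sql([
    ("?person", "knows", "?friend"),
    ("?friend", "name", "?name"),
])
rows = conn.execute(sql, params).fetchall()
```

Each triple pattern becomes one alias of the triples table, constants become equality predicates, and a variable shared between patterns becomes a join condition. This is why the text above frames a triple store as a database and query optimization problem: the quality of the result depends on how well those self-joins are planned.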
A large part of Virtuoso was released in open-source form (dual-license mode similar to MySQL) in April 2006 with the following strategic goals in mind:
- Developing and Fostering community around Virtuoso
- Making a significant contribution that potentially accelerates comprehension and exploitation of the "Semantic Web" / "Data Web" vision
- Providing technology in a form palatable to early adopters and the extensive RDF research community
- Quicker product release cycles
- Burying the "Best-Kept Secret" and generally enigmatic status of Virtuoso