Season's Greetings from Virtuoso Development

It's been a long and very busy time since the last blog post.

Now and then, circumstances call for a return to the contemplation of first principles. I have lately beheld the Platonic ideal of database-ness and translated it into engineering elegance. No quest is static and no objective is permanently achieved.

Accordingly, I have redone all Virtuoso core engine structures for control of parallel execution. As we now routinely get multiple cores per chip, this is more important than before. Aside from dramatic improvements in multiprocessor performance, there is also quite a bit of optimization for basic relational operations.

Of course, this is not for the pure pleasure of geek-craft; it serves a very practical purpose. RDF opens a new database frontier, where these things make a significant difference. In application scenarios involving either federated/virtual databases or typical web applications, the core concurrency of the DBMS is not really the determining factor. With RDF, however, we get a small number of very large tables, and most processing goes to these tables. This is often so with business intelligence as well, but it is still more so with RDF. Thus parallelism within a single index becomes essential.
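To make "parallelism within a single index" concrete, here is a toy sketch in Python. It is not Virtuoso's implementation: the in-memory sorted list standing in for an index, and the names build_index, split_ranges, count_matching and parallel_count, are all assumptions made for illustration. The idea shown is simply that one large sorted structure is cut into contiguous ranges and each range is scanned by its own worker.

```python
from concurrent.futures import ThreadPoolExecutor

def build_index(triples):
    """Hypothetical stand-in for one large index: a sorted list of (s, p, o) keys."""
    return sorted(triples)

def split_ranges(n, parts):
    """Split [0, n) into at most `parts` contiguous, non-overlapping slices."""
    step = max(1, -(-n // parts))  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]

def count_matching(index, lo, hi, predicate):
    """Scan one slice of the index and count rows with the given predicate."""
    return sum(1 for _s, p, _o in index[lo:hi] if p == predicate)

def parallel_count(index, predicate, workers=4):
    """Scan every slice of the same index concurrently and add up the partial counts."""
    ranges = split_ranges(len(index), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(
            lambda r: count_matching(index, r[0], r[1], predicate), ranges
        )
        return sum(partials)

index = build_index([("s%d" % i, "p%d" % (i % 3), "o") for i in range(10)])
total = parallel_count(index, "p0")
```

In CPython the threads here will not actually run the scans on separate cores; the sketch only shows the shape of the partitioning. In a real engine the hard part is the control structures that let many threads share one index without serializing on it.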

We have also made a point-by-point comparison of Virtuoso and Oracle 10g for basic relational operations. Oracle is very good, certainly in basic relational operations like table scans and different kinds of joins. As a matter of principle, we will at a minimum match Oracle in all these things, in both single and multiprocessor environments. The Virtuoso cut forthcoming in January will have all this inside. We are also considering making and publishing a basic RDBMS performance checklist, aimed at comparing specific aspects of relational engine performance. While the TPC tests give a good aggregate figure, it is sometimes interesting to look at a finer level of detail. We may not be allowed to give out numbers in all cases due to license terms, but we can certainly make the test available and publish numbers for those who do not object to this.

Of course, RDF is the direct beneficiary of all these efforts, since RDF loading and querying basically rests on the performance of very relational things, such as diverse types of indices and joins.
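As a rough illustration of why RDF querying comes down to indices and joins, the sketch below answers a two-pattern query ("subjects that have both predicate A and predicate B") as a self-join of one triple table. It is a minimal example under stated assumptions, not how Virtuoso stores or queries RDF: the dictionary-based index and the names build_indices and join_on_subject are hypothetical.

```python
from collections import defaultdict

def build_indices(triples):
    """Hypothetical index on predicate: predicate -> list of (subject, object)."""
    by_pred = defaultdict(list)
    for s, p, o in triples:
        by_pred[p].append((s, o))
    return by_pred

def join_on_subject(by_pred, pred_a, pred_b):
    """Hash join of the triple table with itself on subject.

    Returns sorted (subject, object_a, object_b) rows for subjects
    that carry both predicates.
    """
    build = defaultdict(list)
    for s, o in by_pred.get(pred_a, []):
        build[s].append(o)            # build side: rows with pred_a
    return sorted(
        (s, oa, ob)
        for s, ob in by_pred.get(pred_b, [])   # probe side: rows with pred_b
        for oa in build.get(s, [])
    )

triples = [
    ("alice", "name", "Alice"),
    ("alice", "knows", "bob"),
    ("bob", "name", "Bob"),
]
rows = join_on_subject(build_indices(triples), "name", "knows")
```

Every additional triple pattern in a query adds another such join against the same large table, which is why the speed of the basic join and index operations matters so much for RDF.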

More information will be forthcoming in January.

Merry Christmas and a productive New Year to all.