Archive for February 18th, 2008

Time to rewrite DBMS, says Ingres founder

Monday, February 18th, 2008

In a paper titled “The End of an Architectural Era (It’s Time for a Complete Rewrite)”, Mike Stonebraker, Ingres founder and a Postgres architect, and a group of academics say that modern use of computers renders many features of mainstream DBMSs obsolete.

They argue that DBMS designs such as Oracle and SQL Server come from an age when online transaction processing (OLTP) dominated and required techniques such as multithreading and transaction locking. They say that modern transactions, entered via web pages, do not need these expensive processing overheads and that DBMSs should therefore be redesigned without them. Persistent storage such as disks is also seen as unnecessary and could be replaced by geographically dispersed RAM storage.

Stonebraker and his group also advocate abandoning SQL because they see no need for a separate data manipulation language. Data manipulation, they said, can be performed with other tasks using languages such as Ruby. They describe a prototype DBMS called H-Store that embodies these ideas.

This paper is a very interesting read, and it basically acknowledges what DataGrid providers such as GigaSpaces and Tangosol have been advocating for a long time. One thing I do have to comment on is that Mr. Stonebraker and the rest of the group fail to take this architecture to the next level by integrating such a solution into the “application tier”. I guess this is mainly due to the “remote database” concept that is inherent when working with databases.

What do I mean by that? Very simple. Once we have our data stored in memory, we can bring it “into” our application tier. This means that the operations we perform are actually done in memory, without even leaving our “VM”. Naturally, the next question is: what do you do about partitioning? Well, the idea is to redirect the processing of data to the partition that holds most (if not all) of the data required for that processing (data that is not local can still be accessed in a remote, “clustered” manner).
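To make the routing idea a bit more concrete, here is a minimal sketch. It assumes each partition is just an in-process map and that a task is routed by the hash of a key. The names (PartitionedStore, partitionFor, executeOn) are made up for illustration and are not the GigaSpaces or H-Store API. In a real grid each partition would live in its own VM, and synchronization is left out here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: these names do not come from GigaSpaces or H-Store.
final class PartitionedStore<K, V> {
    // Each map stands in for one partition (one VM in a real deployment).
    private final List<Map<K, V>> partitions = new ArrayList<>();

    PartitionedStore(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // Map a routing key to a partition index; floorMod keeps negative hashes in range.
    int partitionFor(K key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Store the value in the partition that owns the key.
    void put(K key, V value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    // Run the task against the partition that owns the key, so the data it needs is local.
    <R> R executeOn(K key, Function<Map<K, V>, R> task) {
        return task.apply(partitions.get(partitionFor(key)));
    }
}
```

With this in place, something like store.executeOn("order-1", partition -> partition.get("order-1")) runs the lookup where the data lives instead of pulling the data over to the caller.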

Another interesting point is the replacement of SQL with better ways to query for data. For one, the simplest approach is to define our queries based on the objects we work with. For example, create a “template” of an Order whose processed flag is set to false. Advanced queries can be based on dynamic languages such as Ruby and Groovy, which is exactly what I have been hacking around with in GigaSpaces for our upcoming version (more information can be found here).
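The Order template example can be sketched roughly as follows. This is plain Java written for illustration, assuming that a null field in the template means “match anything”. The Order and OrderTemplates classes are hypothetical and are not the GigaSpaces API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain object; null fields act as wildcards when it is used as a template.
final class Order {
    final String id;
    final Boolean processed;

    Order(String id, Boolean processed) {
        this.id = id;
        this.processed = processed;
    }
}

// Hypothetical helper that filters a list of orders by a template.
final class OrderTemplates {
    private OrderTemplates() {
    }

    // A field matches if the template leaves it null or the two values are equal.
    static boolean matches(Order template, Order candidate) {
        return (template.id == null || template.id.equals(candidate.id))
            && (template.processed == null || template.processed.equals(candidate.processed));
    }

    // Return every order that matches the template, in the original order.
    static List<Order> find(Order template, List<Order> orders) {
        List<Order> result = new ArrayList<>();
        for (Order order : orders) {
            if (matches(template, order)) {
                result.add(order);
            }
        }
        return result;
    }
}
```

Calling OrderTemplates.find(new Order(null, false), orders) then returns all unprocessed orders, whatever their id.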

It’s great to see this movement starting to happen within the database world.

Embedded TopLink Essentials (Glassfish)

Monday, February 18th, 2008

Compass 2.0 M1 supports an embedded mode when working with TopLink Essentials (which is developed as part of the GlassFish project).

Here are the simple steps needed to enable Compass with TopLink Essentials:

First, add the following to your persistence.xml file:

<persistence-unit name="test" transaction-type="RESOURCE_LOCAL">
  <provider>oracle.toplink.essentials.PersistenceProvider</provider>
  <properties>    
    <!-- ... (other properties) -->
    <property name="toplink.session.customizer" 
         value="org.compass.gps.device.jpa.embedded.toplink.CompassSessionCustomizer" />
  </properties>
</persistence-unit>

This will enable Compass within TopLink Essentials, basically going over all the mapped JPA classes and adding them to Compass automatically if they have the @Searchable annotation.

Now, if we want to completely index the database based on the mappings, we can execute the following code:

TopLinkHelper.getCompassGps(emf).index();

Finally, if we want to perform a search, we can simply obtain the Compass instance and search with it. Here is the code:

Compass compass = TopLinkHelper.getCompass(emf);

That is it. Simple, no? Compass now comes with support for embedded OpenJPA, Hibernate, and TopLink (as well as EclipseLink, which is very similar to TopLink). More information on the integration can be found in the reference manual.