Time to rewrite DBMS, says Ingres founder
In a paper titled The end of an architectural era (It’s time for a complete rewrite), Mike Stonebraker, Ingres founder and a Postgres architect, together with a group of academics, said that modern use of computers renders many features of mainstream DBMSs obsolete.
They argued that DBMS designs such as Oracle and SQL Server come from an age when online transaction processing (OLTP) dominated and required techniques such as multithreading and transaction locking. They said that modern transactions - entered via web pages - do not need these expensive processing overheads, and that DBMSs should therefore be redesigned without them. Persistent storage such as disk is also seen as unnecessary and could be replaced by geographically dispersed RAM storage.
Stonebraker and his group also advocate abandoning SQL because they see no need for a separate data manipulation language. Data manipulation, they said, can be performed with other tasks using languages such as Ruby. They describe a prototype DBMS called H-Store that embodies these ideas.
This paper is a very interesting read, and it basically acknowledges what DataGrid providers such as GigaSpaces and Tangosol have been working on and advocating for a long time. The one comment I do have on the paper is that Mr. Stonebraker and the rest of the group fail to take this architecture to the next level: integrating such a solution into the “application tier”. I guess this is mainly due to the “remote database” concept that is inherent when working with databases.
What do I mean by that? Very simple. Once we have our data stored in memory, we can bring it “into” our application tier. This means that the operations we perform are actually done in memory, without even leaving our VM. Naturally, the next question is: what do you do about partitioning? Well, the idea is to redirect the processing of data to the partition that holds most (if not all) of the data required for that processing. Data that is not held in that partition can still be accessed remotely, in a “clustered” manner.
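To make the routing idea concrete, here is a minimal sketch in Python. It is not the GigaSpaces or Tangosol API; the names `Partition`, `PartitionedGrid`, `partition_for` and `total_amount` are all hypothetical, and the partitions are plain in-process objects standing in for separate VMs. It assumes each piece of data carries a routing key (here a customer id) and that a task is sent to the partition that owns that key, so it runs against local memory.

```python
import zlib
from collections import defaultdict


class Partition:
    """One in-memory partition (stands in for a single VM in this sketch)."""

    def __init__(self, index):
        self.index = index
        self.entries = defaultdict(list)  # routing key -> stored objects

    def write(self, routing_key, obj):
        self.entries[routing_key].append(obj)

    def execute(self, task, routing_key):
        # The task only sees data held in this partition's memory.
        return task(self.entries.get(routing_key, []))


class PartitionedGrid:
    """Hypothetical grid that routes both data and processing by key."""

    def __init__(self, partition_count):
        self.partitions = [Partition(i) for i in range(partition_count)]

    def partition_for(self, routing_key):
        # crc32 gives a hash that is stable across processes,
        # unlike Python's built-in hash() for strings.
        digest = zlib.crc32(str(routing_key).encode("utf-8"))
        return self.partitions[digest % len(self.partitions)]

    def write(self, routing_key, obj):
        self.partition_for(routing_key).write(routing_key, obj)

    def execute(self, routing_key, task):
        # Send the processing to the data instead of pulling data out.
        return self.partition_for(routing_key).execute(task, routing_key)


def total_amount(orders):
    return sum(order["amount"] for order in orders)


grid = PartitionedGrid(partition_count=4)
grid.write("customer-17", {"id": 1, "amount": 30})
grid.write("customer-17", {"id": 2, "amount": 12})
grid.write("customer-42", {"id": 3, "amount": 5})

customer_17_total = grid.execute("customer-17", total_amount)
```

Because both orders for `customer-17` hash to the same partition, the total is computed without touching any other partition. A real grid would also need a fallback for the remote, “clustered” access to data that lives elsewhere, which this sketch leaves out.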
Another interesting point is the replacement of SQL with better ways to query for data. For one, the simplest approach is to define our queries in terms of the objects we work with. For example, create a “template” of an Order whose processed flag is set to false, and use it to find all matching orders. Advanced queries can be based on dynamic languages such as Ruby and Groovy, which is exactly what I have been hacking around with in GigaSpaces for our upcoming version (more information can be found here).
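Template-based querying can be sketched in a few lines of Python. This is an illustration only, not the GigaSpaces API: the `Order` fields and the `matches` and `read_all` helpers are hypothetical, and the sketch assumes that a field left as `None` in the template acts as a wildcard.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Order:
    # Every field defaults to None so a template can leave it unset.
    order_id: Optional[int] = None
    customer: Optional[str] = None
    processed: Optional[bool] = None


def matches(template, candidate):
    """True if every non-None field of the template equals the candidate's."""
    if type(template) is not type(candidate):
        return False
    for field in fields(template):
        wanted = getattr(template, field.name)
        if wanted is not None and wanted != getattr(candidate, field.name):
            return False
    return True


def read_all(space, template):
    """Return every stored object that matches the template."""
    return [entry for entry in space if matches(template, entry)]


space = [
    Order(order_id=1, customer="alice", processed=False),
    Order(order_id=2, customer="bob", processed=True),
    Order(order_id=3, customer="alice", processed=False),
]

# The "query" is just an Order with the processed flag set to false.
unprocessed = read_all(space, Order(processed=False))
unprocessed_ids = [order.order_id for order in unprocessed]
```

The query is expressed with the same class the application already uses, so there is no separate query language to learn. The trade-off in this sketch is that a template can only express equality on fields, which is where the dynamic-language queries come in.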
It's great to see this movement starting to happen within the database world.
February 19th, 2008 at 2:13 am
Shay
Great post.
One thing that is important to note, though, is that databases are probably not going to disappear from the world anytime soon. Their role in our application architecture is going to change quite fundamentally, as you rightly suggested. It is therefore important to mention that while our system of record can now be stored safely in memory using an IMDG (In-Memory Data Grid), it can still be integrated and synchronized with existing databases. A great deal has been invested in figuring out the right way of doing that, so that we don't lose the performance benefits of the IMDG on the one hand, or the consistency of the data in our existing database on the other. For that purpose we came up with a model which I refer to as PaaS = Persistency as a Service. Actually, Guy Nirpaz came up with a more creative name for it - Eventually Persistent :)