
Open source Bigtable clone 'Hypertable' posts performance numbers

Zvents will soon release Hypertable, its open-source Bigtable clone, and has posted some performance numbers that look quite good, especially for such an early release.

But maybe that's not surprising, since Hypertable was designed for speed by Zvents search architect Doug Judd. He rejected Java (used by HBase, the Hadoop project's Bigtable effort) in favor of C++ in order to get the performance as high as possible.

In a small test inserting about 28M rows of data from the AOL search dataset, they achieved a per-node write rate of approximately 7 MB/sec. Iteration over the data, once loaded, was also quite fast, at nearly 1M cells/second.

The question is how the system will scale up to much larger amounts of data. But the early perf numbers are encouraging. Doug and co will also need to get the word out about Hypertable and get a developer community going around this project if it's going to achieve its full potential.

Hypertable can run on top of either HDFS or KFS. Zvents CEO Ethan Stock told me they will be releasing it under GPL 2.1 on Jan 31st.

Comments (4)

"He rejected Java in favor of C++ in order to get the performance as high as possible"... I can't believe such arguments still surface in 2008.

Modern JVMs are incredibly efficient with runtime optimizations driven by the actual behaviour of the application that C++ programs will never achieve because of the static compilation.

A valid argument might be the need to access very low-level system stuff that isn't exposed in the Java library, but not performance.

milostea:

Please, what Kool-Aid have you been drinking? I have developed apps in both C++ (15 yrs) and Java (8 yrs) and the JVMs simply cannot compete in performance when placed next to a native C++ application.

Stu Hood:

It's great to have options! The Hadoop project can now be operated with either C++ or Java implementations of the filesystem (KFS/HDFS) and a column oriented database (Hypertable/HBase).

The competition and sharing between these 4 projects will be a very good thing in the long run.

Ah, ah.... nice, nice.... but why GPL? Why not ASL or BSD or something along those lines!?


About

This page contains a single entry from the blog posted on January 15, 2008 4:14 AM.
