
Cluster map propagation in Amazon Dynamo

Dynamo is Amazon's scalable key/value storage service. The paper is a good read, but I found the way the cluster node list is propagated in Dynamo a little odd. As I read it, every 60 seconds each node talks to another node in the cluster, chosen at random, and the two exchange update information. I wondered how fast a change would propagate through the cluster, so I simulated the propagation.

For a 5,000 node cluster it takes about 9 update cycles for a change to reach every other node. Since each update is on a 60 second timer, that's 9 minutes for a change to push out.
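A simulation along these lines can be sketched in a few lines of Python. This is not the original code, just a minimal sketch under stated assumptions: all nodes gossip in lockstep rounds, each node picks one other node uniformly at random per round, and an exchange leaves both nodes knowing the change if either one knew it at the start of the round. The function name `rounds_to_full_propagation` is made up for illustration.

```python
import random


def rounds_to_full_propagation(n_nodes, rng):
    """Count gossip rounds until every node has heard about one change.

    Assumptions (not from the Dynamo paper): synchronous rounds, uniform
    random peer choice, and a two-way exchange on every contact.
    """
    informed = [False] * n_nodes
    informed[0] = True  # the change originates at node 0
    informed_count = 1
    rounds = 0
    while informed_count < n_nodes:
        rounds += 1
        # Work from a snapshot so news learned during this round
        # is not passed on until the next round.
        snapshot = informed[:]
        for node in range(n_nodes):
            # Pick a peer uniformly at random, excluding the node itself.
            peer = rng.randrange(n_nodes - 1)
            if peer >= node:
                peer += 1
            # Two-way exchange: if either side knew, both know afterwards.
            if snapshot[node] or snapshot[peer]:
                for x in (node, peer):
                    if not informed[x]:
                        informed[x] = True
                        informed_count += 1
    return rounds


if __name__ == "__main__":
    cycles = rounds_to_full_propagation(5000, random.Random())
    print("update cycles:", cycles)
    print("minutes at a 60 second interval:", cycles * 60 / 60)
```

The result varies from run to run because peers are chosen at random, and a model with staggered timers or one-way pushes would give different numbers.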

I didn't use a very sophisticated time model, and there are random start times and the like to account for, so in practice it may be a little different. But 9 minutes seems like a long time to propagate a host change out to the rest of the cluster. Maybe I misinterpreted what they're doing?

I recall some confusion about whether Dynamo was actually providing SimpleDB, or if they were two separate software systems. Does anyone know if this was resolved?

Comments (2)

Stu Hood:

Where did you see the 60 second interval? The only quote I was able to find in the paper about the frequency of the gossip exchanges was:
> Each node contacts a peer chosen at random every second and the
> two nodes efficiently reconcile their persisted membership change histories.

They do also mention that the nodes will actively exchange full routing tables, which could be the 60 second interval you refer to.

The use of both methods could explain the infrequency of the full table exchange.

Manuel Simoni:

Section 4.8.1 of the paper says that peers exchange the information every second, not every 60 seconds.

I believe Dynamo and SimpleDB are different systems -- Dynamo is written in Java and IIRC SimpleDB is Erlang.



This page contains a single entry from the blog posted on April 14, 2008 10:34 AM.
