From: Steve Grimm <...
Subject: Re: Largest production memcached install?
No clue if we're the largest installation, but Facebook has roughly 200 dedicated memcached servers in its production environment, plus a small number of others for development and so on. A few of those 200 are hot spares. They are all 16GB 4-core AMD64 boxes, just because that's where the price/performance sweet spot is for us right now (though it looks like 32GB boxes are getting more economical lately, so I suspect we'll roll out some of those this year).
We have a home-built management and monitoring system that keeps track of all our servers, both memcached and other custom backend stuff. Some of our other backend services are written memcached-style with fully interchangeable instances; for such services, the monitoring system knows how to take a hot spare and swap it into place when a live server has a failure. When one of our memcached servers dies, a replacement is always up and running in under a minute.
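In case it helps anyone building something similar, here's a minimal sketch of the spare-promotion idea in Python. The ServerPool class and its method names are invented for illustration, not how our system is actually put together; the health check here is just a TCP connect to the memcached port.

import socket
import time


class ServerPool:
    """Tracks live memcached servers plus a list of hot spares."""

    def __init__(self, live, spares):
        self.live = list(live)      # e.g. ["10.0.0.1:11211", ...]
        self.spares = list(spares)

    def is_alive(self, server, timeout=1.0):
        """Crude health check: can we open a TCP connection to it?"""
        host, port = server.split(":")
        try:
            with socket.create_connection((host, int(port)), timeout):
                return True
        except OSError:
            return False

    def promote_spare(self, dead):
        """Drop a spare into the dead server's slot so clients pick it up."""
        if not self.spares:
            return None
        spare = self.spares.pop(0)
        self.live[self.live.index(dead)] = spare
        return spare


def monitor(pool, interval=5):
    """Poll every live server; swap in a spare as soon as one stops answering."""
    while True:
        for server in list(pool.live):
            if not pool.is_alive(server):
                pool.promote_spare(server)
        time.sleep(interval)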
All our services use a unified database-backed configuration scheme which has a Web front-end we use for manual operations like adding servers to handle increased load. Unfortunately that management and configuration system is highly tailored to our particular environment, but I expect you could accomplish something similar on the monitoring side using Nagios or another such app.
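If you go the database-backed route, the client side can stay very simple: read the current server list at startup (or on a config-change signal) and hand it to your memcached client. A rough sketch assuming a SQLite table and the python-memcached client; the schema and table name are made up:

import sqlite3

import memcache  # pip install python-memcached


def load_servers(db_path="cluster.db"):
    """Read the current list of live memcached servers from the config DB."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT host, port FROM memcached_servers WHERE role = 'live'"
        ).fetchall()
    finally:
        conn.close()
    return ["%s:%d" % (host, port) for host, port in rows]


mc = memcache.Client(load_servers())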
At peak times we see about 35-40% utilization (that's across all 4 CPUs). But as you say, that number will vary dramatically depending on how you use it. The biggest single user of CPU time isn't actually memcached per se; it's interrupt handling for all the incoming packets.
From: Paul Lindner <...
Don't forget about latency. At Hi5 we cache entire user profiles that are composed of data from up to a dozen databases. Each page might need access to many profiles. Getting these from cache is about the only way you can achieve sub-500ms response times, even with the best DBs.
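The pattern is plain cache-aside around the fully assembled profile, so a dozen database hits collapse into one memcached get on the hot path. A rough Python sketch; load_profile_from_dbs() and the key scheme are placeholders, not our actual code:

import json

import memcache  # pip install python-memcached

mc = memcache.Client(["127.0.0.1:11211"])
PROFILE_TTL = 300  # seconds; tune to how stale a profile is allowed to be


def load_profile_from_dbs(user_id):
    """Placeholder for the expensive multi-database profile assembly."""
    raise NotImplementedError


def get_profile(user_id):
    key = "profile:%d" % user_id
    cached = mc.get(key)
    if cached is not None:
        return json.loads(cached)
    # Cache miss: do the expensive assembly, then cache the whole profile.
    profile = load_profile_from_dbs(user_id)
    mc.set(key, json.dumps(profile), time=PROFILE_TTL)
    return profile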
We're also using memcache as a write-back cache for transient data. Data is written to memcache, then queued to the DB where it's eventually written to long-term storage. The effect is dramatic -- heavy write spikes are greatly diminished and we get predictable response times.
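The write path looks roughly like this: update the cache synchronously, queue the database write, and let a background worker drain the queue at a steady pace. Only a sketch; the in-process queue and the save_to_db callback stand in for real infrastructure:

import json
import queue
import threading

import memcache  # pip install python-memcached

mc = memcache.Client(["127.0.0.1:11211"])
pending = queue.Queue()


def write(key, value):
    """Serve the write from cache immediately; durability is deferred."""
    mc.set(key, json.dumps(value))
    pending.put((key, value))


def flush_worker(save_to_db):
    """Drain queued writes into long-term storage at the DB's own pace."""
    while True:
        key, value = pending.get()
        save_to_db(key, value)  # the real DB write goes here
        pending.task_done()


def save_to_db_stub(key, value):
    pass  # stand-in; replace with the actual database write


threading.Thread(target=flush_worker, args=(save_to_db_stub,), daemon=True).start()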
That said, there are situations where memcache didn't meet our requirements. Storing friend graph relations was one of them; that's handled by a separate proprietary in-memory system. At some point we might consider merging some of that functionality into memcached, including:
- Multicast listener/broadcaster protocols
  (a toy sketch of the listener side follows below)
- Fixed-size data structure storage
  (perhaps done via pluggable hashing algorithms?)
- Loading the entire contents of one server from another
  (while processing ongoing multicast updates to get in sync)
I'd be interested in working with others who want to add these types of features to memcache.
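To make the multicast listener idea concrete, here's a toy sketch: join a multicast group, treat each datagram as a tab-separated key/value update, and apply it to the local memcached. The group, port, and wire format are all invented for illustration; nothing like this exists in memcached today:

import socket
import struct

import memcache  # pip install python-memcached

GROUP, PORT = "239.1.1.1", 4447  # made-up multicast group and port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))
# Join the multicast group on all interfaces.
mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

mc = memcache.Client(["127.0.0.1:11211"])

while True:
    data, _addr = sock.recvfrom(65535)
    # Assume each datagram is "key<TAB>value"; apply it to the local cache.
    key, _, value = data.decode("utf-8").partition("\t")
    if key:
        mc.set(key, value)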