
The programmer productivity front

Programming Language
Operating System
Cluster/Grid     <--- you are here
Knowledge Base
I looked at inbound traffic for a recent post and was surprised to see programming.reddit.com at the top of the list. I knew about Reddit before but not this sub-reddit. I checked it out and the articles were geeky-cool (for a programmer). But after a few days of reading I started to get an uneasy feeling about the place.

What was all this fretting about why nobody uses Lisp or functional languages? Haskell, ML, yikes. I felt like I'd been teleported back in time to my college days. Maybe this was an east-coast vs. west-coast thing? Reddit is in that Boston/MIT corridor, Paul Graham talks about Lisp all the time, are they really still worried about this stuff?

Language? Bah. The action is in the frontier after the OS.

Don't get me wrong, I love programming languages, and I have a soft spot for language design. I tried (and failed) to design a new language early in my career. I even have a collection of books about historical programming language design. I've seen huge productivity wins with better programming abstractions, and sure, making unconventional choices can often give you a leg up over the competition.

Picking a language isn't just a personal choice, though. It has to be tempered by the realities of how mature the platform is, whether you can hire people who will want to work in your language, how appealing your tech platform will appear to partners, investors, acquirers... Yahoo Shopping isn't written in Lisp anymore; they rewrote it. Of course.

But the productivity and development problems that I see building search and web apps just aren't happening at the language statement level.

Language statements generally live inside a single program process. But coordinating all the pieces of communicating software across a modest 500-node application like Topix is a bitch.

I want a fast scratchpad that my 50 front-ends can share, kind of like System V shared memory, but networked. I want get, put, append, tail, queue, dequeue, infinitely scalable across some RAID-ish cluster. Billions of keys, petabytes of data. If I get something a zillion times a second from all the front ends, it should adapt so it can serve that fast, but migrate stuff I never get to slower storage. Everything should be redundant, fault-tolerant, self-repairing and administratively scalable.
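
To make that wish list a bit more concrete, here is a rough single-process sketch of the interface in Python. Everything in it is an assumption made up for illustration: the class name `Scratchpad`, the `hot_threshold` parameter, and the `sweep` method are hypothetical, not an existing library. It only shows the shape of get/put/append/tail/queue/dequeue and the "serve hot keys fast, migrate cold keys to slower storage" idea. The networking, redundancy, and self-repair that make the real thing hard are deliberately left out.

```python
from collections import defaultdict, deque


class Scratchpad:
    """Toy, in-memory stand-in for a networked scratchpad (hypothetical API)."""

    def __init__(self, hot_threshold=3):
        self.hot = {}                   # stands in for fast storage (RAM)
        self.cold = {}                  # stands in for slower, cheaper storage
        self.reads = defaultdict(int)   # reads per key since the last sweep
        self.hot_threshold = hot_threshold

    def _home(self, key):
        """Return the tier holding key; unknown keys start in the cold tier."""
        return self.hot if key in self.hot else self.cold

    def put(self, key, value):
        self._home(key)[key] = value

    def get(self, key, default=None):
        tier = self._home(key)
        if key not in tier:
            return default
        self.reads[key] += 1
        if tier is self.cold and self.reads[key] >= self.hot_threshold:
            # Frequently read key: promote it to the fast tier.
            self.hot[key] = self.cold.pop(key)
            return self.hot[key]
        return tier[key]

    def append(self, key, item):
        """Append item to the list stored under key, creating it if needed."""
        self._home(key).setdefault(key, []).append(item)

    def tail(self, key, n=10):
        """Return the last n items stored under key."""
        items = list(self._home(key).get(key, []))
        return items[-n:] if n > 0 else []

    def queue(self, key, item):
        """Add item to the FIFO queue stored under key."""
        self._home(key).setdefault(key, deque()).append(item)

    def dequeue(self, key):
        """Remove and return the oldest queued item, or None if empty."""
        pending = self._home(key).get(key)
        return pending.popleft() if pending else None

    def sweep(self):
        """Demote hot keys that went unread since the last sweep."""
        for key in list(self.hot):
            if self.reads[key] == 0:
                self.cold[key] = self.hot.pop(key)
        self.reads.clear()


# Example use from one "front end".
pad = Scratchpad(hot_threshold=2)
pad.put("page:home", "<html>...</html>")
pad.get("page:home")
pad.get("page:home")                  # second read promotes the key
pad.append("log:fe-7", "GET /")
pad.queue("crawl", "http://example.com/")
```

Even in this toy form the design question shows up: promotion and demotion are driven by read counts, and in a real cluster those counts would have to be gathered across all the front ends rather than in one process.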

You end up building some version of this every time you make an eBay, Second Life, Hotmail, Bloglines, AIM, Google, Inktomi, WebFountain, Facebook, Flickr, PayPal, YouTube.

A zillion machines, a zillion concurrent connections, a big mess of data, never lose any of it, never go down, oh and the SLA is never take longer than 50ms to do anything. And be simple and fun to program on top of so the programmers can work on the actual app instead of spending all their time firefighting the cluster support layer.

We all keep cobbling together solutions for whatever app we happen to be writing out of ad-hoc clustered RDBMSes, ReiserFS, Berkeley DBs, piles of coordination code and scripted admin.

Language innovations like Ruby are great, especially when they get some traction and acceptance so that you actually could use them if you wanted to. But all of the recent languages that get use have come out of individual eccentrics. They're incremental aesthetic exercises. They're also all more alike than different. Language innovation is basically done, and mostly has been for a long time.

Machine-level OS research died too, probably sometime in the '90s. Rob Pike, a veteran of the Bell Labs Unix group and co-creator of Plan 9, put out a paper in 2000 called "Systems Software Research is Irrelevant."

Systems software research has become a sideline to the excitement in the computing industry...

Ironically, at a time when computing is almost the definition of innovation, research in both software and hardware at universities and much of industry is becoming insular, ossified and irrelevant...

What is Systems Research these days? Web caches, web servers, file systems, network packet delays, all that stuff. Performance, peripherals, and applications, but not kernels or even user-level applications.

Now, after Pike wrote that, he left Bell Labs and went to work at Google.

Of course. Google is doing more cluster OS research than anyone right now. You could argue that Google's technology success owes more to the blocking-and-tackling work of managing 500,000 servers than to the little algorithms that power search and ad targeting. GFS, MapReduce, Bigtable.

A smart researcher can write an ad targeting algorithm or some pagerank variant in a weekend. It's relatively easy to think up new algorithms; implementing them and getting them to run, especially for web-scale problems, is the hard part. Without the platform to develop and deploy against, it's like you're writing code on paper waiting for the computer to be invented so you can run your program.
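
As a small illustration of the "weekend" claim, here is a minimal sketch of a textbook power-iteration PageRank in Python. It is not Google's implementation; the function name `pagerank`, its parameters, and the four-page toy graph are assumptions chosen for this example. The whole algorithm fits on a page as long as the link graph fits in one machine's memory, which is exactly the assumption that falls apart at web scale.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> list of outlinks."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages with no outlinks is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy graph: every page links, directly or indirectly, toward "c".
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Running the same loop over billions of pages means partitioning `links` and `rank` across machines, shuffling the per-link shares between them every iteration, and surviving node failures mid-run. None of that is in the algorithm; all of it is in the platform.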

It's too bad there isn't a standard platform for all this stuff, so we wouldn't all have to stop and write a new custom version every time we want to code something that will need more than a single machine to run on.

Peculiar distribution and economic dynamics -- giving the source to Unix away to universities -- led to the entire industry eventually standardizing on the C/Unix/POSIX syscall OS model. GNU and Linux helped vastly here by obliterating the stranglehold that AT&T held over the technology, which was holding adoption back. New languages get scale by being free, so they can reach critical adoption mass, bake their platform to maturity, and become viable and socially acceptable to pragmatic users.

But we don't need a clone of SYSV or a free C compiler or a dynamic language with socially-acceptable syntax now. We need an industrial-strength, hyper-scalable cluster OS.

The problem is that the kind of eccentrics that gave us Unix, GNU, Linux, Perl, Ruby, aren't likely to be able to deliver here. Who has 500 machines in their garage and a million pageviews/day as a personal thorn in their side? Only companies have these problems, and when companies build a platform to solve the problem, the platform isn't general, and it's not given away.



Listed below are links to weblogs that reference The programmer productivity front:

» Forget language choice, it's all in the cluster implementation from The Bleeding Edge
I don't have anything to add to this fantastic post (and it's Friday, so I'm tired) from Rich Skrenta on why language choice really doesn't matter that much, so I'm just going to pull a long quote and apologize.

Comments (5)

Everything should be redundant, fault-tolerant, self-repairing and administratively scalable ... A zillion machines, a zillion concurrent connections, a big mess of data, never lose any of it, never go down, oh and the SLA is never take longer than 50ms to do anything.

Yep. I totally agree, Rich.

I wrote about something similar in my post, "I want a big, virtual database":


I think we are starting to see examples getting closer to this. BigTable and Amazon S3 may be examples of giant, distributed, reliable hash maps. MySQL cluster, while not trivial to administer and massively scale, might be an example of this with a more traditional database.

But, we remain a long way from a self-optimizing, self-repairing, massive virtual database that runs over whatever hardware nodes are made available to it. We are a long way from being able to install a database on a server cloud and easily use it like it was all running on one giant machine.


So where do Hadoop's present and future fit into this?

I think Hadoop is great and it's wonderful that effort is being put into this kind of free software. Yahoo's backing of the project is also certainly a good sign.

I personally tend to be a "middle adopter" of these sorts of things. I ran UnixWare well after BSDI was around, and then BSDI until Linux had been shaking out its bugs for the first few years. We chose Perl for Topix in 2002. Even though Python and Ruby both looked a bit better in terms of aesthetics, they just couldn't compete at the time in the size of the libraries or the maturity of the platform. Others can go point on beta testing this stuff. :-)

There've been lots of promising distributed filesystems and cluster technologies that never seem to have gotten "mainstream" adoption: the Andrew File System (because it was commercialized), Coda, and the Beowulf cluster management and IPC stuff. We're inching forward, but there just isn't any off-the-shelf stuff yet that lets you stuff a web crawl into the DB and then serve the index live against it, or easily manage pools of machines with stuff like network shared memory. Skip the highfalutin' problems; even crunching the gigabytes of logs from a large website each night is a PITA.

Amazon's EC2 is a place that individual programmers / developers could go for reasonable size infrastructure. Future virtualized server farms will be even cheaper. Zombie PCs / botnets are probably a good test bed for a certain class of developer.

I'm sure you know about memcached - distributed in-memory caching. It seems more immediately useful than MySQL Cluster or Beowulf. It's open source.

But grid management / deployment / provisioning, etc. - I think that's an area where Google and large companies do have a big advantage, and it's not something easy to access yet.


i think it's important not to generalize and oversimplify here. what appears to be a generic clustering system is often tied closely to the application space. certainly with the major search engines, their topology is driven by their needs. one can look at toolkits like globus etc, but this is just free code, you have to provide your own hardware and ops.

i think it would behoove someone like ibm or microsoft to enable semi-public clusters for authorized interested developers, with some reward accruing to the infrastructure provider should the new projects gain traction (equity in exchange for grid time)...in this sense too, these vendors could learn about how to adapt clusters to the widest class of applications. get a relationship with a large VC firm in here and you have a pipeline to getting larger ideas off the ground faster.



This page contains a single entry from the blog posted on January 14, 2007 12:23 PM.
