Conservative coding

An expat investment banker in Brussels once told me that two non-native English speakers can often converse far more easily in English than a native and a non-native speaker. That's counter-intuitive. Shouldn't the pair with the native speaker have an easier time?

It turns out that native speakers use a far broader footprint of the language, and reference all sorts of cultural idioms when they speak. And so the non-native speaker has no idea what they're talking about. But two non-native speakers are both using a smaller, common, conservative subset, so there are fewer misunderstandings.

* * *

Everything at topix is written in perl. That sometimes elicits the "What's up with that?" from techies. "Perl looks like line noise. Isn't your code hard to maintain?"

Well, as hard as anyone's I guess, but not because of the language.

We do crazy fun stuff in our system, like mmap'ing giant files with key-offset indices at the front, pulling out chunks of data, decompressing them, and thawing them into perl objects. We can do something like 6,000 of those a second on a regular box. We now have a scalable get/put service based on that running on a 500 node cluster. We do named entity disambiguation and all sorts of text analytics in perl. Performance isn't an issue, not from the language anyway. We worry about disk seeks and network latency and stuff like that. But not statement execution. There are a handful of functions that got written in C but it's pretty tiny.
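To make that file layout concrete, here is a minimal sketch in Python rather than perl. It is an illustration under stated assumptions, not our actual code: the record format, the 16-byte key limit, and the names `write_store` and `Store` are all made up for the example, and `pickle` stands in for perl's freeze/thaw.

```python
import mmap
import os
import pickle
import struct
import tempfile
import zlib

# Assumed layout (hypothetical): entry count, then fixed-size index
# entries, then the compressed blobs the entries point at.
HEADER = struct.Struct("<I")      # number of index entries
ENTRY = struct.Struct("<16sQI")   # key padded to 16 bytes, offset, length


def write_store(path, records):
    """Write {key: object} as a key-offset index followed by the data."""
    keys = sorted(records)
    if any(len(k.encode()) > 16 for k in keys):
        raise ValueError("keys are limited to 16 bytes in this sketch")
    blobs = [zlib.compress(pickle.dumps(records[k])) for k in keys]
    offset = HEADER.size + ENTRY.size * len(keys)  # first blob starts here
    with open(path, "wb") as f:
        f.write(HEADER.pack(len(keys)))
        for key, blob in zip(keys, blobs):
            f.write(ENTRY.pack(key.encode(), offset, len(blob)))
            offset += len(blob)
        for blob in blobs:
            f.write(blob)


class Store:
    """Read-only view of a file produced by write_store."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        (count,) = HEADER.unpack_from(self._map, 0)
        self._index = {}
        for i in range(count):
            raw_key, offset, length = ENTRY.unpack_from(
                self._map, HEADER.size + i * ENTRY.size
            )
            self._index[raw_key.rstrip(b"\0").decode()] = (offset, length)

    def get(self, key):
        """Pull out one chunk, decompress it, and thaw it into an object."""
        offset, length = self._index[key]
        chunk = self._map[offset:offset + length]
        return pickle.loads(zlib.decompress(chunk))

    def close(self):
        self._map.close()
        self._file.close()


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "store.bin")
    write_store(demo_path, {"alpha": {"n": 1}, "beta": [1, 2, 3]})
    store = Store(demo_path)
    print(store.get("beta"))
    store.close()
```

Nothing in there is clever: a fixed-size index, a slice of the mapped file, one decompress, one thaw. The cost of a `get` is dominated by whether the page is already in memory, which is the disk-seek worry again.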

"What about python and ruby?"

I think that anyone using perl, python or ruby is about 100X more productive than someone working in Java or C++. Within the three I don't really have strong opinions though.

If you choose to deliberately limit yourself to a subset of whatever language you're working in, code can pretty much come out looking the same in all three.

Trouble starts when you try to get fancy.

I see gee-whiz programmers often gleefully code wonderful stuff that no-one else can make heads or tails of. Certainly not the new junior engineer we just hired who was a sharp coder in two other languages, but just started learning perl a few weeks ago.

And the gee-whiz stuff doesn't buy much. You can trim out a few lines here or there, but the complexity usually lives at the level of the whole system, and the performance comes down to the systems and algorithmic stuff. Obfuscating a few lines to leverage a language trick doesn't actually benefit the system, and it certainly doesn't benefit the other members of the team who might have to pick up that code later. Coding is social; it's not just a private dialogue between you and the machine.
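Here's a small made-up illustration, in Python since the point holds across all three languages. Both functions are hypothetical and written for this post; they count words in a string and return the same result.

```python
def word_counts_fancy(text):
    # One expression: an immediately-invoked lambda wrapped around a
    # dict comprehension. Short, but the reader has to unpack it.
    return (lambda ws: {w: ws.count(w) for w in dict.fromkeys(ws)})(
        text.lower().split()
    )


def word_counts_plain(text):
    # The conservative subset: a loop and a dictionary.
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts
```

The fancy version saves three lines. The plain version is the one a sharp coder who picked up the language a few weeks ago can read and change without asking anyone.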

I've known a lot of languages in my career. I've studied language design and written compilers. I see big productivity differences between classes of languages, but within the classes, not so many. But folks always seem to get religious about one vs. the other. Frankly, it's a red flag. It signals idealism over pragmatism, a love of a particular toolset over a focus on the goals of the project.

Ulysses is great if you want PhD English majors to study your work for years to figure out what it means. But put the five dollar words away when you're writing the install guide for your new blogging package. Coding is the same way. Put the fancy stuff away and code for the rest of us mortals.

Comments (10)

"I think that anyone using perl, python or ruby is about 100X more productive than someone working in Java or C++."

Where does PHP fit in there?

Yeah, PHP too. It's in the same class of interpreted scripting languages.


"I think that anyone using perl, python or ruby is about 100X more productive than someone working in Java or C++."

Wouldn't that depend on what it is that they're working on? In what environment? As always, the right tool for the job is an important thing to consider.

Trouble starts when you try to get fancy.

This is probably the most concise explanation of how bad code happens that I've read in some time. If you avoid the fancy stuff, you can write quality, maintainable code in any language.

I was just at the colo dropping off a new machine and couldn't help but notice your multiplying servers, since they fill the empty space that was once in front of our measly rack. It seems like a few months ago there was only one tower of machines.

I was thinking to myself "man what do they do with all those things?" I guess I know now. Maybe some day I can have a rack of servers like that...

Yeah, if you're writing a unix device driver, you're not going to be doing that in perl. :-) But there's a lot more code than most programmers think that can be done in a higher level language.

Say you're going to write some magic GFS-like network distributed filesystem, and want to be able to mount it. Sure, you're going to have a kernel stub to do the filesystem/VFS stuff, but I'd totally look at throwing the requests down to user space and dealing with most of the complexity in some daemon where you have a bigger lever. Could you write that in perl or python instead of C/C++? Sure. Things get so much easier to debug, there are no kernel panics, you'll code faster, and the network and disk latency are your primary bottlenecks anyway.

For any one app, who knows; you're right that you have to judge things case by case. But defy convention for a productivity win and an edge over your competitors. :-)

"I think that anyone using perl, python or ruby is about 100X more productive than someone working in Java or C++. Within the three I don't really have strong opinions though."

For the average perl/python hacker vs avg java hacker I'd almost certainly agree.

The problem is that Java has a LOT of consultingware packages like EJB which are designed to waste money.

That said, the BEST programmers I know are using python, java, and C these days.

Tailrank is Java ;)

The real hidden secret is that large and scalable systems are somewhat independent of language. You mostly have to deal with threading or IO systems.

And yeah.... flat files are a Zen that most junior engineers don't yet grok.

I've been meaning to finish open sourcing the distributed filesystem I've been working on for the last two years ;)


I don't see how speaking about dev productivity in the context of medium to large systems has any value unless you also speak about TCO and maintainability.

Not getting fancy while coding is fine and good, but in my opinion, the problem with the class of interpreted languages is that the software that comes out the other side tends to be much less structured and tends to have much less abstraction. Finally on average you would tend to end up with a lot more spaghetti which over time is more expensive and more time consuming to enhance and/or maintain.

It's possible to mitigate these issues with strong design and coding practices, but starting out with a more structured, strongly typed, compiled class of languages seems like a bit of an insurance policy against such problems and probably a wise investment of one's resources.

I've seen plenty of spaghetti in Java and C++. Nailing down a variable to 'int' is no defense against big system horrors. My experience has been that the statement, function and module features of languages which were designed to assist with good structuring of code really are only a minor help. You can have plenty of confused overlapping abstractions with well defined header files and module interfaces.

Source line count is the biggest driver of problems IMO. If you can have 1/10th of the SLOC, it's easier for new folks to learn the code base, there are fewer lines of code for bugs to hide in, and the whole system is just easier to manage.

Furthermore, if you can make an individual programmer more productive, you can have fewer of them. Fewer programmers means less n^2 communication overhead, smaller project scaling factors, etc.
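To put a rough number on the n^2 point: if you assume every pair of programmers on a team may need to talk to each other, the count of pairs is n(n-1)/2. The function name below is made up for illustration, and the model is a simplification of how real teams communicate.

```python
def communication_paths(team_size):
    """Number of distinct pairs in a team of the given size: n(n-1)/2."""
    return team_size * (team_size - 1) // 2
```

Under that assumption a team of 10 has 45 possible pairings and a team of 3 has 3, so a threefold cut in headcount removes most of the coordination surface.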

Well, I must add that productivity in the hands of the Delphi language is a given. The Delphi community has always been aware of its power and treats it as their secret weapon. I am always amazed at seeing how Java and C++ require so many lines of code to accomplish what we do with a few. This is based on experience, since I already direct two other projects using Java, and GOSH, it hurts.

Now that CodeGear has added the Delphi power to PHP with their new RAD, the mix is simply amazing. Add the new VCL for the Web as well, and mass quality development is right there at your fingertips.

This page contains a single entry from the blog posted on March 29, 2007 8:28 AM.
