
December 2007 Archives

December 1, 2007

EC2 - the return of timesharing

I continue to be surprised at the success of EC2/S3. I know a lot of startups using it, and I can imagine this sort of machine/storage virtualization taking over a big part of the datacenter/colo market. The appeal of built-in server financing and the ease of scaling up or down are so compelling that folks are willing to work around the (pretty severe) limitations in the current service. (Of course the product is still very young, and amzn will continue to improve it over time.)

Anyone who's ever tried to get financing or leaseback for machines knows what a pain it can be and how difficult it can be to qualify. EC2 makes all that pain go away: you can have 1 or 20 servers and scale up or down at a moment's notice. It's really more financial tech than datacenter magic.

I wonder if some kind of standardization for how to deploy virtual nodes and storage is going to develop. Presumably if other companies jump into the virtual datacenter market, their APIs aren't going to look exactly like Amazon's.

I heard that Amazon's EC2/S3 service is getting a lot of calls from law enforcement because it's being used to host kiddie porn and file sharing services. Apparently being able to set up storage and compute farms from a web form with a credit card, in someone else's datacenter, is pretty appealing to folks who don't want to get caught for what they're hosting.

Of course you expect this sort of thing; it's just a cost of doing business, like running a big forum system. Any big ISP or community site has dedicated staff to handle the law enforcement requests and to police the userbase. Interesting that you have to do this sort of thing even to run a virtual datacenter product.

December 3, 2007

Weird Stuff - the other Silicon Valley tech museum

The Computer History Museum at Shoreline & 101 is pretty cool. If you're in the area and haven't been, it's worth a visit. Seeing all the old electronics and panels covered with switches delivers a strong shot of nostalgia mixed with awe at the rate of progress computing hardware has made over the years. They've got the guidance computer from the nose-cone of an old Minuteman nuclear missile, a DEC-10, an Enigma machine, an IBM 360, and lots more great stuff. And the place has that distinctive smell of old electronics. The smell of vacuum tubes and bakelite. :)

But there is another, unofficial, computer history museum about five minutes away in Sunnyvale: Weird Stuff, a surplus equipment store. If you want to take some of the artifacts of the old Valley home with you, this is the place.

I bought an old working 1U server for $65, a rack for $50, some crazy old line tester thing with a hundred switches for $10, and a mechanical typewriter for a few bucks (it didn't have a price on it, and the checkout dude said "how much do you want to pay for it? ok.")

They get batches of semi-newish equipment too, so it's a great place for deals on telco racks, routers, switches, laser printers, patch panels, etc.

But I go there to see artifacts of the old valley. Half the crazy old devices on their shelves were someone's startup dream at some point. An old 1U firewall box that originally sold for tens of thousands of dollars, in a stack for $25 each. Telebit Trailblazer UUCP modems. Apple IIs. Incomprehensible test equipment.

At least old hardware gets to rust on a shelf in a warehouse for a while after its life is up. Old software just goes >poof<, and is gone...

The Ark could be in here somewhere.

December 4, 2007

There is no Building Code for software

Did you ever have a leak around a window or roof vent or patio?

Did you ever have a program with bugs?

The building code is pretty cool. Not knowing anything at all about construction I was fascinated to see the detailed specifications about what must/must not be done in various kinds of residential and commercial construction. Like requiring thicker gypsum / sheetrock inside of closets located under stairwells. Why? Because if a fire breaks out in the closet, the staircase burns, and then egress is cut off for people on the upper floor.

Another interesting one is a requirement to have vertical bars on deck railings rather than horizontal ones. Why? Because it's harder for kids to climb over the railing and fall.

There are thousands of specifications like this. Some small details, some major structural points. How windows should be properly flashed, pipes connected, electricity kept safe, the foundation secured, on and on and on.

People have been living in houses for a very long time.

Houses rot, fall down, burn down. Pipes burst. Roofs leak.

Each of the rules in the building code has arisen because some particular failure scenario happened often enough that it made sense to add the rule.

It kind of creeps me out. Behind that rule about thicker lining in stairwell closets are... well, some fires that burned out stairwells. And perhaps some people who couldn't get out. It's not just theoretical.

According to Wikipedia, the Code of Hammurabi from ancient Babylon in 1760 B.C. was the first building code:

  • If a builder builds a house for someone, and does not construct it properly, and the house which he built falls in and kills its owner, then that builder shall be put to death.
  • If it kills the son of the owner, the son of that builder shall be put to death.
  • If it kills a slave of the owner, then he shall pay, slave for slave, to the owner of the house.
  • If it ruins goods, he shall make compensation for all that has been ruined, and inasmuch as he did not construct properly this house which he built and it fell, he shall re-erect the house from his own means.
  • If a builder builds a house for someone, even though he has not yet completed it; if then the walls seem toppling, the builder must make the walls solid from his own means.

There is no building code for software. There are a lot of anecdotal proscriptions, and a ton of knowledge on the subject. But for Joe the general software contractor - Jeff Atwood's "80%" programmer - sometimes the expedient is chosen over the correct. Not because they're malicious or incompetent. Just because they haven't devoted their lives to studying the art. They just want to learn the trade and work it. Where's the rulebook?

In software, unless you're in medical devices, or fly-by-wire aircraft systems, you don't usually kill people with bad software. Thank goodness.

We haven't been living in software houses for thousands of years. Software systems are far more complicated, and each one is novel. It's still a black art. And every software project is in part an R&D exercise.

So I think it's still a long time before we'll have a building code for software.

December 5, 2007

I'm shocked, shocked to hear about the secret Wikipedia cabal

A buddy asked me what I thought of the "secret wikipedia mailing list" brouhaha:

From the Register:

Controversy has erupted among the encyclopedia's core contributors, after a rogue editor revealed that the site's top administrators are using a secret insider mailing list to crackdown on perceived threats to their power.

Many suspected that such a list was in use, as the Wikipedia "ruling clique" grew increasingly concerned with banning editors for the most petty of reasons. But now that the list's existence is confirmed, the rank and file are on the verge of revolt.

He wondered if this was unique to Wikipedia, or if we'd seen this sort of thing at dmoz or topix.

The fact is that there is no way to prevent players in a social game from colluding to increase their effectiveness.

If players can coordinate their actions to get more power, they will. People are social creatures; they form cliques, groups, and tribes, and like to hierarchically organize themselves. This consistently happens if you have any kind of extra privilege for the senior folks -- e.g. editall or meta capability in the Open Directory. But it happens even in purely discourse-mediated systems, where parties will collude to promote / denounce agreed-upon subjects.

Sometimes what happens can feel like a virtual re-creation of the Stanford Prison Experiment.

What I pointed out to my buddy, however, was that you need to be careful before you try to architect or legislate this out of your system. Game designers know there is a careful balance between keeping long-running multiplayer systems inviting to new folks while letting experienced players continue to progress in status and power. The power is one of the main rewards in a social system. And it's going to your most loyal and productive game addicts.

The 80/20 rule is vastly over-used, but we found that it did apply at dmoz. A small group of editors did most of the work. If you remove the rewards for the power-users to make the playing field more "democratic", you may be pissing off your best users.

Other takes from Matthew Ingram, Mashable, others.

December 10, 2007

PageRank wrecked the web

Two years later and rel=nofollow is still bugging folks.

Google needs YOUR help

It's still bugging me, too. It doesn't make any sense.

Bad linking hurts everybody

Google couldn't seriously be asking webmasters to tag which of their links were going to affect pagerank vs. the ones they'd sold. Could they?
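Mechanically, the ask is simple enough. Here's a minimal sketch (the page and URLs are made up for illustration) of how a crawler might partition a page's links on the rel attribute -- the "tagging" Google is asking webmasters to do:

```python
from html.parser import HTMLParser

class LinkAuditor(HTMLParser):
    """Split a page's links into ones that pass PageRank and ones that don't."""
    def __init__(self):
        super().__init__()
        self.followed, self.nofollowed = [], []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is None:
            return
        rels = (attrs.get("rel") or "").lower().split()
        (self.nofollowed if "nofollow" in rels else self.followed).append(href)

page = """
<p><a href="http://example.com/editorial">a real endorsement</a>
<a rel="nofollow" href="http://example.com/sponsor">a paid link</a></p>
"""
auditor = LinkAuditor()
auditor.feed(page)
print(auditor.followed)    # links that still count toward PageRank
print(auditor.nofollowed)  # links the publisher has disavowed
```

A one-attribute change on the publisher's side, and the burden of classifying paid links shifts from Google's algorithm to every webmaster on earth.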

Let's pick up all the trash in the world

That would be like asking everyone in the world to please be nice so the old algorithm will still work.

Do a random act of internet kindness

If I close my eyes and wish really hard I can bring back the golden age of the 1999 web. Back when links still indicated site quality.

THINK before you LINK

Back when spam was simpler, and G wasn't party to both sides of the transaction.

PageRank stands for PR

The toolbar pagerank display is disconnected from the real topic-sensitive pagerank used in the SERPs. Google can cut your PR in half but your SERPs don't change. It's a message, but what does it mean?
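For reference, the mechanism at the center of all this fits in a few lines. A toy power-iteration sketch of plain (not topic-sensitive) PageRank over a made-up four-page link graph, with the usual 0.85 damping factor -- every inbound link is a vote, which is exactly why links became worth money:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank: links maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            if not outs:  # dangling page: spread its rank evenly everywhere
                for q in pages:
                    new[q] += damping * rank[p] / n
            else:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
        rank = new
    return rank

# nearly every page links to "hub", so "hub" accumulates the most rank
graph = {"a": ["hub"], "b": ["hub"], "c": ["hub", "a"], "hub": ["a"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # → hub
```

Sell a link into the graph and you've sold a slice of that vote -- which is the transaction nofollow is supposed to annotate away.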

NOFOLLOW if you're PAID, or PAY the cost

Why would they want us to think that these things mattered?

If you don't toe the line, we'll ban you. You'll be sorry.

We can't actually ban the Washington Post or the Stanford Daily though. But we're going to threaten you to make you shape up.

Don't say untrue things about people

On one hand it seems an oddly utopian world view, not a pragmatic one.

Help Google by only publishing quality links

What happened to all the genius researchers building Skynet with their 1 million servers? What's all the AI for if they can't do a better job of tagging web pages than asking users to do it for them?

You mean they can't even detect TextLinkAds on a page, and have to resort to this weird business threat model instead?

The web of spam

Links used to be for human navigation.

Google made them count for money and they're ruined now.

Nofollow isn't going to put it back the way it was.

PageRank wrecked the web

Google is the cause of all of this.
and Google is going down with it.

December 13, 2007

Multi-paned search UI in testing at Google

It's cool that Google has gotten around to implementing the multi-pane search interface. Wags are saying that Google copied Ask on this, but really, it was Ask that copied A9's innovative interface. And now that Udi Manber, who built A9, is running search products at Google, it makes sense to see him testing an evolution of those ideas.

A9's interface (which was powered under the hood by Google results at the time) didn't seem to get traction when it launched. But those ideas, deployed on the Real Thing, could be a different story.

My next hope is to see some personalization come out on the results.

I have some personal skepticism that either multiple columns or p13n is a good idea. But it would be nice to see Google explore those.

December 15, 2007

Google sees own shadow, jumps overboard

Google announces "Knol"...

First-order response

Bad news for jason and mahalo! Google declares war on jimmy and wikipedia!

Some context

So Google makes an algo that puts wikipedia at the top of all the results. You search for 'hamburger', you get the encyclopedia definition of a hamburger. Riiiight... But questioning the wisdom of this algorithmic choice is off the table.

So they say, "Whoa. Look at that site at the top of all our results. We made them that big with our traffic. We should have a site like that, and then we could be there, instead. But we'll do it right this time. Our way. And put our ads on it!"

Onebox Thinking

Ask has those nice oneboxes. You search for britney, you get her AMG profile and a little picture from better days. But that's just the AMG dataset. You can implement about 100 of those custom datasets, and then smoke and noise start to come out of your feed integration team, and you can't take in any more feeds.

Google has Oneboxes. A lot of them are programmatic. sfo to lax, chicago weather, goog, things like that. But gosh, isn't wikipedia being in the top spot for all those searches just a kind of Onebox? An informational-article Onebox? Wikipedia only has 1.5M articles, that doesn't seem like a lot. Heck, jason pumped out 26,000 in a few months with a little team. What if this were properly scaled to the web?

Google could then scale its informational oneboxes. And keep them under its control. Not have them run by some kimono-wearing guy who wants to let the community decide how the content should be edited. A guy who won't take green US dollars for ads. What's he thinking? Better not trust him. ;-)

So what's the problem

Google is optimized for one result. Position #1, the I'm Feeling Lucky button. Oneboxes fit into this goal. The programmatic ones are command-line tools. 'weather 60201'.

But Oneboxes aren't webby. Even Mahalo, with its editor-created pages, seeks to link out to the breadth of information available about a topic. To be the best hub for that topic - not the destination. Wikipedia is a destination, but by virtue of the democratic inclusion process, mostly succeeds in distilling the web's voices into an objective resource.

There are many first-order problems with the Knol plan. Paul Montgomery zeroes in on some of the moderation issues nicely. But set aside the nightmare of trying to coax a usergen-content business to produce quality output. The question is, if this did succeed, would it contribute to building the ultimate web experience that we really want?

December 18, 2007

What fraction of searches are porn?

I found a stat that claimed that 25% of internet searches were for porn. This appeared in the CS Monitor, and my guess is it came from here.

I don't see that high a fraction of porn queries in the AOL dataset, though... perhaps as little as a few percent. I wonder if they were filtered? But I didn't think they were, not in that way anyhow.
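A back-of-the-envelope check on a query log is easy to run. This sketch tallies what fraction of queries match a term list; the term list and sample log below are tiny illustrative placeholders, not the actual AOL data or a real adult-content lexicon:

```python
# Placeholder term list for illustration -- a real classifier would need
# a much larger lexicon (and phrase/substring matching, not just tokens).
ADULT_TERMS = {"porn", "xxx", "nude"}

def adult_fraction(queries):
    """Fraction of queries containing at least one flagged term."""
    hits = sum(1 for q in queries
               if ADULT_TERMS & set(q.lower().split()))
    return hits / len(queries) if queries else 0.0

sample_log = [
    "cheap flights to vegas",
    "free porn",
    "weather 60201",
    "britney spears",
]
print(adult_fraction(sample_log))  # → 0.25
```

Naive token matching like this undercounts (misspellings, slang, site names), which could partly explain the gap between a few percent and the claimed 25%.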

About December 2007

This page contains all entries posted to Skrentablog in December 2007. They are listed from oldest to newest.

November 2007 is the previous archive.

January 2008 is the next archive.

Many more can be found on the main index page or by looking through the archives.

Powered by
Movable Type 3.33