
Google sees own shadow, jumps overboard

Google announces "Knol"...

First-order response

Bad news for jason and mahalo! Google declares war on jimmy and wikipedia!

Some context

So Google makes an algo that puts wikipedia at the top of all the results. You search for 'hamburger', you get the encyclopedia definition of a hamburger. Riiiight... But questioning the wisdom of this algorithmic choice is off the table.

So they say, "Whoa. Look at that site at the top of all our results. We made them that big with our traffic. We should have a site like that, and then we could be there, instead. But we'll do it right this time. Our way. And put our ads on it!"

Onebox Thinking

Ask has those nice oneboxes. You search for britney, you get her AMG profile and a little picture from better days. But that's just the AMG dataset. You can implement about 100 of those custom datasets, and then smoke and noise start to come out of your feed integration team, and you can't take in any more feeds.

Google has Oneboxes. A lot of them are programmatic. sfo to lax, chicago weather, goog, things like that. But gosh, isn't wikipedia being in the top spot for all those searches just a kind of Onebox? An informational-article Onebox? Wikipedia only has 1.5M articles, that doesn't seem like a lot. Heck, jason pumped out 26,000 in a few months with a little team. What if this were properly scaled to the web?

Google could then scale its informational oneboxes. And keep them under its control. Not have them run by some kimono-wearing guy who wants to let the community decide how the content should be edited. A guy who won't take green US dollars for ads. What's he thinking? Better not trust him. ;-)

So what's the problem

Google is optimized for one result. Position #1, the I'm Feeling Lucky button. Oneboxes fit into this goal. The programmatic ones are command-line tools. 'weather 60201'.
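That command-line flavor can be sketched as a pattern dispatcher over the raw query. This is a hedged illustration only — the triggers, patterns, and handler strings below are invented for the example, not Google's actual implementation:

```python
import re

# Illustrative onebox dispatcher: each trigger is a regex over the query
# plus a handler that renders the "answer" result. Patterns are hypothetical.
ONEBOX_TRIGGERS = [
    # 'weather 60201' -> a forecast answer
    (re.compile(r"^weather (\d{5})$"), lambda m: f"Forecast for ZIP {m.group(1)}"),
    # 'sfo to lax' -> a flight answer
    (re.compile(r"^([a-z]{3}) to ([a-z]{3})$"),
     lambda m: f"Flights {m.group(1).upper()} -> {m.group(2).upper()}"),
    # 'GOOG' -> a stock quote answer
    (re.compile(r"^[A-Z]{1,5}$"), lambda m: f"Stock quote for {m.group(0)}"),
]

def onebox_for(query):
    """Return a onebox answer string if any trigger matches, else None."""
    for pattern, handler in ONEBOX_TRIGGERS:
        m = pattern.match(query.strip())
        if m:
            return handler(m)
    return None
```

Anything that falls through the trigger list gets no onebox and goes to ordinary ranked results — which is exactly where the informational-article case lives.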

But Oneboxes aren't webby. Even Mahalo, with its editor-created pages, seeks to link out to the breadth of information available about a topic. To be the best hub for that topic - not the destination. Wikipedia is a destination, but by virtue of the democratic inclusion process, mostly succeeds in distilling the web's voices into an objective resource.

There are many first-order problems with the Knol plan. Paul Montgomery zeroes in on some of the moderation issues nicely. But set aside the nightmare of trying to coax a usergen-content business to produce quality output. The question is, if this did succeed, would it contribute to building the ultimate web experience that we really want?

Comments (3)

The problem, as Danny Sullivan pointed out in a comment on paid links, is that Google is vulnerable to people boxing it out with robots.txt etc. Jimmy controls one of the top properties online. Suppose he makes Wikipedia searchable only inside his new engine and blocks googlebot, slurp and msn's bot (what's its name, anyway?). Well, as per G's own results, the relevance will suck. Imagine if Sphinn, SEL, SEW, SERoundtable and the top 20-30 sites in search block spiders? They're F***ED.
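The blocking the comment describes is a one-file change. A robots.txt along these lines would shut out the three big crawlers of the day (MSN's was named msnbot) while leaving everyone else alone:

```
User-agent: Googlebot
Disallow: /

User-agent: Slurp
Disallow: /

User-agent: msnbot
Disallow: /
```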


One of the reasons that Wikipedia is so popular is that it does a good job at synthesizing information into knowledge. Computers are great at basic information retrieval, but they suck at synthesizing knowledge. Humans are great at this.

A search problem I'd love to see someone solve, especially for complex queries, is help me figure out where to go ask questions for certain types of information.

I know from personal experience that the best source for electronics discussions is AVS Forum, for in-depth knowledge of my car it's Acurazine, and for travel information it's FlyerTalk.

For that "deep dish pizza at o'hare" query that we discussed, I got the right answer by posting it in the United Airlines forum at FlyerTalk.

Pointing users at such forums would be very helpful. Sometimes you'll find these in SERPs if you hit the right keywords, but often you won't.

You should be able to algorithmically determine that the United forum at FlyerTalk is where experts on O'Hare hang out. This would require learning the structures of the forums, but most use standard packages and templates.
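A rough sketch of the kind of scoring this would take, assuming the posts have already been scraped out of those standard forum templates. The forum names, toy data, and term-density heuristic here are all illustrative, not a real system:

```python
from collections import Counter

def score_subforums(posts_by_forum, topic_terms):
    """Rank subforums by how densely their posts mention the topic terms.

    posts_by_forum maps a forum name to a list of post texts; in practice
    these would come from crawling the standard forum packages the comment
    mentions (vBulletin and friends). Here it is just a plain dict.
    """
    terms = {t.lower() for t in topic_terms}
    scores = {}
    for forum, posts in posts_by_forum.items():
        words = Counter(
            w.strip(".,!?'\"").lower()
            for post in posts
            for w in post.split()
        )
        total = sum(words.values()) or 1
        hits = sum(words[t] for t in terms)
        scores[forum] = hits / total  # term density, not raw count
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

With real crawl data, the forum whose posts keep circling back to "O'Hare" floats to the top — the signal the SERP is currently missing.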

If I were Google, I would tweak the algo for big sites like Wikipedia so that their domain-wide link strength counted for only a fraction of its full weight, with rankings driven mainly by each page's own links. This would fix a lot of problems with Google, local SERPs, and companies that have a page for every city.
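The commenter's tweak could be sketched as a scoring function. The log dampening and the 0.2 fraction below are arbitrary illustrative choices, not anything Google is known to use:

```python
import math

def page_score(page_inlinks, domain_authority, domain_damp=0.2):
    """Hypothetical re-weighting along the commenter's lines: count a big
    site's domain-wide authority at only a fraction of its strength, and
    let links pointing at the specific page carry most of the weight.
    """
    page_part = math.log1p(page_inlinks)
    domain_part = domain_damp * math.log1p(domain_authority)
    return page_part + domain_part
```

Under this weighting, a well-linked page on a small niche site outscores a thin, barely-linked city page riding a huge domain — the scenario the comment is complaining about.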



This page contains a single entry from the blog posted on December 15, 2007 8:28 AM.

The previous post in this blog was Multi-paned search UI in testing at Google.

The next post in this blog is What fraction of searches are porn?.

