<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
   <channel>
      <title>Skrentablog</title>
      <link>http://www.skrenta.com/</link>
      <description></description>
      <language>en</language>
      <copyright>Copyright 2008</copyright>
      <lastBuildDate>Thu, 01 May 2008 12:11:09 -0800</lastBuildDate>
      <generator>http://www.sixapart.com/movabletype/</generator>
      <docs>http://blogs.law.harvard.edu/tech/rss</docs> 

            <item>
         <title>blekko is hiring</title>
         <description><![CDATA[blekko is building a new search engine from scratch and I'm looking to hire a few more coders.

<p>

Search is an absolutely fascinating problem to work on for a bunch of reasons.  For one thing you have to scale the thing before getting the first user.  You can't just start with a server or two and add more when the users come.  Step 1 is to copy the internet onto your cluster.  Step 2 is to analyze it.
<p>

The componentry is remarkably deep.

<p>
Search is like 7 hard problems wrapped into a stack.  Distributed systems, html analytics, text analytics/semantics, anti-spam, AI/ML, frontend/UI.  And scale... Apart from the sexy high-end algos there are also the boring 10-year-old system libraries and off-the-shelf tools that crack under stress and sometimes need a look.  You open the hood and wonder how the thing ever worked in the first place...
<p>

Plus there is always something fresh and new every day mining through the vast sordidness of the many billions of pages on the web.  You expect to be amazed at the endless varieties of crazy porn domains and new approaches to webspam.  But there are equal horrors in the small, finding pathological charset issues, previously-undiscovered abominable server implementations, psychopathic website owners.  The web is a reactive <a href="http://www.mattcutts.com/blog/the-web-is-a-fuzz-test-patch-your-browser-and-your-web-server/">fuzz test</a>.

<p>

I know there are some great coders out there reading this blog who would have a blast working on some of the pieces here that need to get built.  This is a great opportunity to join an experienced team early, building a big system from the ground up.  If you think you might be interested, send me an email and we can chat.
<p>

fyi our interviews always have coding tests.  Primarily we are looking for folks who love to write code and are good at it.  :)

]]></description>
         <link>http://www.skrenta.com/2008/05/blekko_is_hiring.html</link>
         <guid>http://www.skrenta.com/2008/05/blekko_is_hiring.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Thu, 01 May 2008 12:11:09 -0800</pubDate>
      </item>
            <item>
         <title>How Fake Luxury Conquered the World</title>
         <description><![CDATA[<blockquote>
The legend says that once upon a time there was a General Motors. This General Motors, GM for short, had a car and a brand for every need, along the plan developed by the great Alfred Sloan prior to the Second World War. There were Chevrolets for regular folk, Pontiacs for the cautious old people (and, thanks to John Z. Delorean's development of the 1964 GTO, for angry young people as well), Buicks and Oldsmobiles for doctors and successful businessmen, and Cadillacs at the very top, for the most successful men in the land.<br>
    ...
<br>
    It would have stayed that way forever, but one day a mysterious yet important man at GM had a mysterious yet important idea: <b><i>Executives should drive cars from their own division!
</i></b>
</blockquote>
<p>

Which leads to every division of GM building their own version of the Cadillac.

<p>
Read more: <a href="http://www.speedsportlife.com/2008/04/29/avoidable-contact-11-how-fake-luxury-conquered-the-world/">How Fake Luxury Conquered The World</a>
<p>
(thanks Bryn for the tip)



]]></description>
         <link>http://www.skrenta.com/2008/05/how_fake_luxury_conquered_the.html</link>
         <guid>http://www.skrenta.com/2008/05/how_fake_luxury_conquered_the.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Thu, 01 May 2008 11:19:55 -0800</pubDate>
      </item>
            <item>
         <title>Microsoft bias in MSN search results, surprise</title>
         <description><![CDATA[I was looking to see what search sites might 
have a particular bug that I (ahem) came across and 
was trying the search for the number 0 in various
places.  There is a pretty good <a
href="http://en.wikipedia.org/wiki/0_(number)">Wikipedia
page</a> about zero.  Zero has a rich and interesting
history and there are many other potentially
reasonable results.

<p>

But I was surprised to see MSN search had demoted their good results below
some crappy ones from MSDN:
<p>
<img src="/images/msn-0.png" width=450>
<p>
Lame!  Falling into an inferior lex position and a 
lower overall relevance page to boost their own network
results...give em credit for being old school.  :)

<p>
...
<p>

I found my bug on Yahoo Search.  I had tried a lot of smaller
engines first because I didn't think a major would have 
this bug.  <b>You can't search for 0 on Yahoo.</b> You
can search for all the other numbers, but not 0 ...

<p>

Why?..  Because 0 is <i>false</i>.  It suggests Yahoo is using a scripting language to front
their search form, and a programmer did something like <code>if ( $query )</code> rather than <code>if ( $query ne '' )</code>.
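<p>
Here is a minimal sketch of that guess, in Python.  I have no idea what Yahoo's front end actually looks like, so everything here is an assumption: <code>perl_truthy</code> is a made-up helper that mimics Perl's rule that the strings <code>''</code> and <code>'0'</code> are both false (Python itself treats the string "0" as true), and the two <code>should_search</code> functions are hypothetical stand-ins for the two tests above.

```python
def perl_truthy(value):
    """Mimic Perl's boolean test for a string: '' and '0' are both false."""
    return value not in ("", "0")


def should_search_buggy(query):
    # Mirrors `if ( $query )`: a query of "0" looks the same as no query at all.
    return perl_truthy(query)


def should_search_fixed(query):
    # Mirrors `if ( $query ne '' )`: only a truly empty query is rejected.
    return query != ""


if __name__ == "__main__":
    for q in ["", "0", "1", "zero"]:
        print(repr(q), should_search_buggy(q), should_search_fixed(q))
```

The only input where the two tests disagree is the query "0", which matches the symptom: every other number searches fine.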

<p>
]]></description>
         <link>http://www.skrenta.com/2008/04/microsoft_bias_in_msn_search_r.html</link>
         <guid>http://www.skrenta.com/2008/04/microsoft_bias_in_msn_search_r.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Thu, 24 Apr 2008 07:45:00 -0800</pubDate>
      </item>
            <item>
         <title>Hypertable architecture talk Wednesday in Palo Alto</title>
         <description><![CDATA[Doug Judd will be discussing the internals and architecture of Hypertable tomorrow in Palo Alto at 6:30pm. 
<p>
<blockquote><i>
Hypertable is an open source, high performance, distributed database modeled after Google's Bigtable. It differs from traditional relational database technology in that the emphasis is on scalability as opposed to transaction support and table joining. Tables in Hypertable are sorted by a single primary key. However, tables can smoothly and cost-effectively scale to petabytes in size by leveraging a large cluster of commodity hardware. Hypertable is designed to run on top of an existing distributed file system such as the Hadoop DFS, GLusterFS, or the Kosmos File System (KFS). One of the top design objectives for this project has been optimum performance. To that end, the system is written almost entirely in C++, which differentiates it from other Bigtable-like efforts, such as HBase. We expect Hypertable to replace MySQL for much of Web 2.0 backend technology. In this presentation, Doug will give an architectural overview of Hypertable. He will describe some of the key design decisions and will highlight some of the places where Hypertable diverges from the system described in the Bigtable paper.
</i></blockquote>
<p>
<a href="http://www.zvents.com/palo-alto-ca/events/show/81854980-sdforum-software-architecture-modeling-event-architecting-hypertable">More details</a>.

]]></description>
         <link>http://www.skrenta.com/2008/04/hypertable_architecture_talk_w.html</link>
         <guid>http://www.skrenta.com/2008/04/hypertable_architecture_talk_w.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Tue, 22 Apr 2008 12:51:56 -0800</pubDate>
      </item>
            <item>
         <title>Starbucks &quot;re&quot; branding</title>
         <description><![CDATA[It will be interesting to see how the <a href="http://www.businessweek.com/magazine/content/08_16/b4080000943927.htm">return</a> of the original starbucks founder Howard Schultz and the return to their orig plan and ideas turns out.   
<p>

He's had a successful stunt with the system closing for 3 hours to retrain workers in how to make coffee, which generated a lot of PR.  <p>

Now comes the introduction of the new house blend, named after the original Starbucks store.  But also, surprise! - <a href="http://www.78west.com/wordpress/?p=249">the original logo is back</a>.
<p>

Usually logos and identities get vaguer, cleaner and more abstract as the MBAs wash/rinse/repeat.  Starbucks is going back to the gritty and vaguely obscene logo they launched with.
<p>
<center>
<img src="/images/starbucks-0.jpg" width=110>&nbsp;
<img src="/images/starbucks-1.jpg" width=110>
<img src="/images/starbucks-2.jpg" width=110>
<img src="/images/starbucks-3.jpg" width=110>
</center>
<p>

<a href="http://www.deadprogrammer.com/starbucks-logo-mermaid">Deadprogrammer</a> famously detailed the history of the Starbucks logo going back to a 15th century woodcut.  The original logo was slightly sanitized, but each corporate revision made it more and more abstract and less recognizable as to what it actually was.  My wife said "I had no idea there was even anything inside that circle, I had never looked until you pointed it out to me."

<p>

Face logos are great brands but they always seem to get watered down and more cartoony over time.  This is the case with a lot of the face logos on food at the grocery store; the original versions were closer to actual faces than to abstract logos (think Chef Boyardee here).

<p>
<img src="/images/kfc-1.jpg" width=100 align=right>

This happened to KFC with the colonel...he started out as a realistic line drawing of Colonel Sanders with the company name - "Kentucky Fried Chicken."  After the waves of rebranding stylists were done with him he was an abstract cartoon.  They couldn't stop there and abbreviated the company name.  You wouldn't want to realize you're eating FRIED CHICKEN when you're at KFC after all.  You probably want to be eating a healthy salad with dressing on the side.  <a href="http://ries.typepad.com/ries_blog/2008/04/hand-me-a-napki.html">That's why you went in there</a>, right??
<p>
I bet Dunkin' Donuts wishes they could rename themselves "DD".  Hmmm, maybe "empty vessel" names aren't so bad after all...  :)
<p>

Interesting to think about brand identities that get going because they're a little gritty and different and personal, they don't start out whitewashed / washed out, but after getting successful they put on the bland suit.  What would the AOL redesigners do to Drudge's site if they bought it?

]]></description>
         <link>http://www.skrenta.com/2008/04/starbucks_re_branding.html</link>
         <guid>http://www.skrenta.com/2008/04/starbucks_re_branding.html</guid>
        
        
         <pubDate>Tue, 22 Apr 2008 08:58:52 -0800</pubDate>
      </item>
            <item>
         <title>Microsoft &quot;hits back&quot; at Google with re-launch of 4-year old Newsbot</title>
         <description><![CDATA[The memecrowd sure has a short memory... maybe I'm just showing my age here, but still.

<blockquote><i>
<a href="http://www.news.com/8301-10784_3-9919801-7.html">CNET: Microsoft hits back at Google with Live Search News</a><br>
<a href="http://searchengineland.com/080416-091713.php">Search Engine Land: Microsoft Launches Live Search News</a><br>
<a href="http://blog.searchenginewatch.com/blog/080416-110258">Search Engine Watch: Windows Live Search Offers Google News Alternative</a><br>
</i></blockquote>

<p>

MSN Newsbot?  Anyone?  From 2004:

<p>

<blockquote><i>
<a href="http://www.news.com/Google-News-faces-Microsoft-rival/2100-1025_3-5284216.html?tag=nw.2">CNET: Google News faces Microsoft rival</a> (Jul 27, 2004)<br>
<a href="http://www.washingtonpost.com/wp-dyn/articles/A29430-2004Jul31.html">Wash Post: Microsoft Deploys Newsbot To Track Down Headlines</a> (Aug 1, 2004)<br>
<a href="http://glinden.blogspot.com/2004/07/msn-newsbot-review.html">Geeking with Greg: MSN Newsbot review</a> (Jul 27, 2004)<br>
</i></blockquote>
]]></description>
         <link>http://www.skrenta.com/2008/04/microsoft_hits_back_at_google.html</link>
         <guid>http://www.skrenta.com/2008/04/microsoft_hits_back_at_google.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Wed, 16 Apr 2008 10:33:03 -0800</pubDate>
      </item>
            <item>
         <title>Web robot names considered, and rejected</title>
         <description><![CDATA[Google's is "Googlebot"<br>
Yahoo's is "Slurp"<br>
Cuill's is "Twiceler"<br>
<p>

It makes sense to have a friendly robot user agent, so nervous webmasters won't ban it.  You don't want to call your crawler 'sitejacker' or something..   Unfortunately my favorite candidates were:

<p>
<blockquote>
    Crawlhammer<br>
    Webraker<br>
    Lurchy<br>
    Client9<br>
</blockquote>
<p>
hmmm.  :-|
<p>
<i>
"Oh no!  It's CrawlHammer!!"
<p>
If even in your heart you hide the urls ... there it shall rake for them...
</i>

<p>
...

<p>

Does anyone know what the purpose of a '+' in front of a URL in a robot's
user-agent is?  Some sites put in the '+', others don't...
<p>

<blockquote><small>
Mozilla/5.0 (compatible; Googlebot/2.1; <b>+</b>http://www.google.com/bot.html)
<p>
Mozilla/5.0 (compatible; Ask Jeeves/Teoma; <b>+</b>http://about.ask.com/en/docs/about/webmasters.shtml)
<p>
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
<p>
Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)
<p>
Gigabot/3.0 (http://www.gigablast.com/spider.html)
</small></blockquote>
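
<p>
Whatever the '+' is for, anything that reads these strings out of a log has to treat it as optional.  A rough sketch of pulling the info URL out either way; the regex and the <code>bot_info_url</code> name are mine, not part of any standard, and it assumes the URL ends at whitespace, a ';' or a ')':

```python
import re

# An http(s) URL, optionally preceded by '+', running up to whitespace, ';' or ')'.
BOT_URL = re.compile(r"\+?(https?://[^\s;)]+)")


def bot_info_url(user_agent):
    """Return the first URL found in a user-agent string, or None if there is none."""
    match = BOT_URL.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
        "Gigabot/3.0 (http://www.gigablast.com/spider.html)",
    ]
    for ua in samples:
        print(bot_info_url(ua))
```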

]]></description>
         <link>http://www.skrenta.com/2008/04/web_robot_names_considered_and.html</link>
         <guid>http://www.skrenta.com/2008/04/web_robot_names_considered_and.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Wed, 16 Apr 2008 09:29:00 -0800</pubDate>
      </item>
            <item>
         <title>Cluster map propagation in Amazon Dynamo</title>
         <description><![CDATA[Dynamo is Amazon's <a 
href="http://alexiskold.wordpress.com/2007/10/31/amazon-dynamo-the-next-generation-of-virtual-distributed-storage/">scalable
key/value storage service</a>.  The <a 
href="http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf">paper</a>
is a good read, but I found the way the cluster node list
information was propagated in dynamo to be a little odd.
The algorithm is that every 60 seconds a node will talk
to another node in the cluster, chosen at random, and
exchange update information.  I wondered how fast a change
would propagate through the cluster, so I simulated the 
propagation.

<p>

For a 5,000 node cluster it takes about 9 update cycles
for a change to reach every other node.  Since each update
is on a 60 second timer, that's 9 minutes for a change to
push out.
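
<p>
For anyone who wants to poke at this, here is a small sketch of the kind of simulation I mean.  It is not the code I ran, and the modeling choices are assumptions rather than anything taken from the paper: cycles are synchronous, each node picks one other node uniformly at random per cycle, and an exchange leaves both ends knowing about the change if either one knew it when the cycle started.

```python
import random


def rounds_to_spread(n_nodes, seed=0, max_rounds=1000):
    """Count gossip cycles until a single change has reached every node."""
    rng = random.Random(seed)
    informed = [False] * n_nodes
    informed[0] = True  # node 0 is the first to learn of the change
    count = 1
    rounds = 0
    while count != n_nodes and rounds != max_rounds:
        rounds += 1
        # Every node acts on what was known at the start of the cycle.
        snapshot = list(informed)
        for node in range(n_nodes):
            peer = rng.randrange(n_nodes - 1)
            if peer >= node:
                peer += 1  # skip over ourselves so the peer is always another node
            if snapshot[node] or snapshot[peer]:
                # The exchange is two-way: both ends leave knowing the change.
                informed[node] = True
                informed[peer] = True
        count = sum(informed)
    return rounds


if __name__ == "__main__":
    cycles = rounds_to_spread(5000)
    print(cycles, "cycles, or", cycles, "minutes on a 60 second timer")
```

The exact count moves around with the seed and with whether nodes run in lockstep, so treat the output as a ballpark figure rather than a measurement.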

<p>

I didn't do a very sophisticated time model..plus there
is random start and all that.  So maybe in practice it's
a little different.  But 9 minutes seems like a long time
to propagate a host change out to the rest of the cluster.
Maybe I mis-interpreted what they're doing?

<p>

I recall some confusion about whether Dynamo was actually
providing SimpleDB, or if they were two separate software
systems.  Does anyone know if this was resolved?
]]></description>
         <link>http://www.skrenta.com/2008/04/cluster_map_propagation_in_ama.html</link>
         <guid>http://www.skrenta.com/2008/04/cluster_map_propagation_in_ama.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Mon, 14 Apr 2008 10:34:19 -0800</pubDate>
      </item>
            <item>
         <title>AppEngine - Web Hypercard, finally</title>
         <description><![CDATA[Google's AppEngine is being compared to Amazon's EC2/S3.  But
Google deserves credit here for coming up with a pretty
differently-positioned product.  There may be overlap for
many users of course, but it's really operating at a whole
different level of the stack.
<p>
Folks that want/need more control over the environment,
ability to manually manage their own machine instances,
run code other than python, etc. will stay with EC2.
EC2 is a step above RackSpace.
<p>
But rather than thinking of AppEngine as a step above 
EC2, instead I think of it somewhere around 
Myspace.  Or "Ning 1.0", <a href="http://blogs.zoho.com/uncategorized/ning-10-was-too-early/">as Zoho points out</a>.

<p>
In the beginning was GeoCities...  No, even further back, in the beginning was <a href="http://www.wired.com/gadgets/mac/commentary/cultofmac/2002/08/54365">Hypercard</a>.  Hypercard was a pre-web application for Macs that let you design a "stack" of pages - a website on a floppy, really.  Popular stacks got traded far and wide.  Hypercard stacks existed for every imaginable purpose - "Time Table of History", games, crossword puzzles, the Bible, etc.
<p>
The thing about Hypercard was that it wasn't just static text and images like base html.  It had a scripting language, a database, and the Apple UI built-in, so you could create mini applications.
<p>
It feels like the web has been trying to claw its way back to the simple utility of Hypercard ever since Mosaic.  GeoCities was the first massive-uptake anyone-can-build-here website haven.  But it was all static html.
<p>
Sure, you can paste javascript widgets onto your page, and have content driven by external sites.  But to make the website a first-class object - on functional parity with a "real" website - it needs to be backed by a database and programmability.  But setting up mysql, renting machine space, configuring linux, programming all the boilerplate, not to mention the scalability issues if your site gets popular -- this is all a big hurdle.
<p>
So to hide all those details behind a platform that's easy to get started with, and lower the bar to entry to writing public application websites...  Well that's a big deal.  Hats off to Google for bringing this to market.
<p>
I'm not alone...somewhat similar thoughts from <a href="http://www.alleyinsider.com/2008/4/google_s_appengine_aiming_at_facebook_not_google">Nate Westheimer</a>...


]]></description>
         <link>http://www.skrenta.com/2008/04/appengine_web_hypercard_finall.html</link>
         <guid>http://www.skrenta.com/2008/04/appengine_web_hypercard_finall.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Wed, 09 Apr 2008 12:10:30 -0800</pubDate>
      </item>
            <item>
         <title>Cuill is banned on 10,000 sites</title>
         <description><![CDATA[Be careful while you debug your crawler...

<p>

Webmasters these days get <i>very</i> touchy about letting
new spiders walk all over their sites.  There are so
many scraper bots, email harvesters, exploit probers,
students running Nutch on gigabit university pipes, and
other ill-behaved new search bots that some site owners <a
href="http://www.webmasterworld.com/search_engine_spiders/">nervously
huddle</a> in <a href="http://www.forumpostersunion.com/forumdisplay.php?s=0bb259e0d87a0988723de8e48aaf6b91&f=167">forum
bunkers</a> <a href="http://forums.seochat.com/search-engine-spiders-27/">anxiously
scanning</a> their logs for suspect new visitors, so they
can quickly issue bot and ip bans.

<p>

<a href="http://www.cuill.com/">Cuill</a>, the
search startup from ex-Googlers anticipated to
launch soon, seems to have run a rather high-rate
crawl when they were getting started that generated
a large number of robots.txt bans.  Here is a <a
href="http://www.skrenta.com/cuill-bans.html">list</a> of sites which have banned Cuill's user-agent "Twiceler".

<p>

A well-behaved crawler needs to follow a set of
loosely-defined behaviors to be 'polite' - don't crawl
a site too fast, don't crawl any single IP address too
fast, don't pull too much bandwidth from small sites by
e.g. downloading tons of full res media that will never
be indexed, meticulously obey robots.txt, identify itself
with a user-agent string that points to a detailed web page
explaining the purpose of the bot, etc.
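
<p>
Two of those rules, obeying robots.txt and not hitting a host too fast, can be sketched with Python's standard <code>urllib.robotparser</code>.  This is a toy under stated assumptions, not how any real crawler does it: the <code>PolitenessGate</code> class, the "ExampleBot" name and the 10 second default delay are all made up, and it does nothing about per-IP limits or bandwidth.

```python
import urllib.robotparser
from urllib.parse import urlsplit


class PolitenessGate:
    """Decide whether, and how soon, a URL may be fetched."""

    def __init__(self, user_agent, default_delay=10.0):
        self.user_agent = user_agent
        self.default_delay = default_delay  # assumed gap between hits on one host
        self.rules = {}       # host -> parsed robots.txt
        self.last_fetch = {}  # host -> time of the most recent request

    def load_robots(self, host, robots_txt):
        parser = urllib.robotparser.RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self.rules[host] = parser

    def seconds_to_wait(self, url, now):
        """Return None if the URL is off limits, else the seconds to wait first."""
        host = urlsplit(url).netloc
        parser = self.rules.get(host)
        if parser is None:
            return None  # never fetch a page before robots.txt has been read
        if not parser.can_fetch(self.user_agent, url):
            return None
        # Honor a Crawl-delay line if the site gave one, else use our own default.
        delay = parser.crawl_delay(self.user_agent) or self.default_delay
        last = self.last_fetch.get(host)
        if last is None:
            return 0.0
        return max(0.0, last + delay - now)

    def record_fetch(self, url, now):
        self.last_fetch[urlsplit(url).netloc] = now


if __name__ == "__main__":
    gate = PolitenessGate("ExampleBot")
    gate.load_robots("example.com", "User-agent: *\nCrawl-delay: 5\nDisallow: /private/\n")
    print(gate.seconds_to_wait("http://example.com/private/x", 0.0))
    print(gate.seconds_to_wait("http://example.com/index.html", 0.0))
```

Times are passed in rather than read from the clock so the gate can be driven from a crawl scheduler or a test.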

<p>

Apart from the widely recognized challenges to building a
new search engine, sites like del.icio.us and compete.com
that ban all new robots aside from the big 4 (Google,
Yahoo, MSN and Ask) make it that much harder for a new
entrant to gain a footing.  However the web is so bloody
vast that even tens of thousands of site bans are unlikely
to make a significant impact in the aggregate perceived
quality of a major new engine.

<p>
My initial take was that this had to be annoying for
Cuill.  As a crawler author, I can attest that getting
each new site rejection personally hurts.  :)  But now I'm not so sure.  Looking
over the list, aside from a few major sites like Yelp,
you could argue that getting all the forum seo's
to robots exclude your new engine might actually help
improve your index quality.  Perhaps a Cuill robots ban
is a quality signal?  :)

]]></description>
         <link>http://www.skrenta.com/2008/04/cuill_is_banned_on_10000_sites.html</link>
         <guid>http://www.skrenta.com/2008/04/cuill_is_banned_on_10000_sites.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Tue, 08 Apr 2008 08:28:40 -0800</pubDate>
      </item>
            <item>
         <title>Did Powerset outsource their crawl?</title>
         <description><![CDATA[I've been seeing Zermelo, Powerset's crawler hitting my pages.  Sort-of:

<p>

<blockquote><small>
    ec2-67-202-8-249.compute-1.amazonaws.com - - [28/Mar/2008:23:31:06 -0700] "GET /2006/12/scale_limits_design.html HTTP/1.0" 200 11526
    "<a href="http://www.skrenta.com/2006/12/i_took_a_ukulele_lesson_once.html">http://www.skrenta.com/2006/12/i_took_a_ukulele_lesson_once.html</a>"
    "zermelo Mozilla/5.0 compatible; heritrix/1.12.1 (+<a href="http://www.powerset.com/">http://www.powerset.com</a>) [email:crawl@powerset.com,email:paul@page-store.com]"<br>
</small></blockquote>

<p>

They're using the open-source Heritrix crawler, running out of Amazon Web Services.  But who is <a href="http://www.page-store.com/">page-store.com</a>?  From their site:

<p>

<blockquote><i>

Vertical search sites are relatively costly to operate. A single vertical search engine may need to sweep all or a large part of the web selecting the pages pertinent to a small set of topics. Startup and operating costs are proportional to the input page set size, but revenue may be only proportional to the size of the selected subset.
<p>
Page-store positions itself as a web wholesaler, supplying page and link information to vertical search engine companies on a per-use basis. The effect is to level the playing field between vertical search and general horizontal internet search.
<p>
Page-store can provide
<p>
<ul>
    <li>selected page feeds based on deep web crawls
    <li>page metadata
    <li>black-box filters
    <li>anchor text results
    <li>link information
</ul>
</i>
</blockquote>

<p>

Did Powerset outsource their crawl?

]]></description>
         <link>http://www.skrenta.com/2008/04/powerset_is_crawling.html</link>
         <guid>http://www.skrenta.com/2008/04/powerset_is_crawling.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Mon, 07 Apr 2008 08:55:53 -0800</pubDate>
      </item>
            <item>
         <title></title>
         <description><![CDATA[NFS server %s not responding still trying
<p>
:)
]]></description>
         <link>http://www.skrenta.com/2008/03/nfs_server_s_not_responding.html</link>
         <guid>http://www.skrenta.com/2008/03/nfs_server_s_not_responding.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Wed, 12 Mar 2008 22:27:27 -0800</pubDate>
      </item>
            <item>
         <title>Who will stop Google from going to 90% market share?</title>
         <description><![CDATA[Jason <a href="http://www.calacanis.com/2008/03/06/google-will-have-90-search-market-share-in-the-us-one-year-from/">predicts</a>
Google going to 90% market share..  He makes a solid
argument and covers the bases.  Referred traffic today
suggests Google is at about 85%.  Ask just <a href="http://searchengineland.com/080305-095826.php">quit the game</a>,
msn/yahoo put themselves into a tarpit.  So the field
is Google's...

<p>

The only thing that can change this are new players.
A string of uninteresting search attempts and lackluster
competition have convinced people that it's impossible to 
stop Google's ascent.

<p>

Google may have a network effect on ads, but the switching
costs for the search app itself are small.  Easier than
switching free email providers.  It's just another content
site, and users are willing to try new search engines.
There just haven't been any interesting new ones to try 
in a long time.

<p>

I was hopeful that Wikia would launch something interesting
and break the n-game losing streak of the upstarts, but sadly
it was another shallow effort.

<p>

I'm rooting for Cuill next.  They have a very credible
team.  Anna built the current version of Google, and now
she's working on the next gen.  If they launch something
interesting in any dimension, they'll show the market that
you don't need a million servers and half of the phd's in
the field to build a search app.  It takes 20 people and
$5M of hardware...if you know what you're doing.

<p>
]]></description>
         <link>http://www.skrenta.com/2008/03/who_will_stop_google_from_goin.html</link>
         <guid>http://www.skrenta.com/2008/03/who_will_stop_google_from_goin.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Thu, 06 Mar 2008 09:53:45 -0800</pubDate>
      </item>
            <item>
         <title>The real reason Google&apos;s clicks are flat</title>
         <description><![CDATA[From <a href="http://seoblackhat.com/2008/02/27/why-comscore-google-clicks-flat/">SEO Black Hat</a>:
<p>

<blockquote>

Google reduced the clickable area on Adsense text ads ... Before, a user could click anywhere on the ad and be brought to the
destination.  After the changes, users have to click on something that looks like a hyperlink.
<p>
<blockquote>
<i>"The CTR on text ads declined about 60% in the last 2 months with Googles changes, Image ads on the other hand stayed the same."</i><br>
- <a href="http://plentyoffish.wordpress.com/2008/01/04/how-to-advertise-on-adsenseplentyoffish/">January 4th, 2008 Marcus of Plentyoffish.com</a>
</blockquote>
<p>
4 months later, that little back and forth in the Google Rec Room shaved about $85 Billion (with a <b>B</b>) in market capitalization.
<p>But it wasn't as stupid an idea as it might seem. You see, Adsense works in a Quasi-market place environment. The market will bid up the cost per click once the adjustment for accidental clicks is readjusted. Right now, marketers should be getting a better value per click as a higher percentage of the clicks are "real" or intentional. That will lead to higher bids per click and ultimately should be close to a break even for GOOGs bottom line.
<p>
Is the Sky Really Falling?
<p>
The problem is that in the interim, GOOG gives almost not Guidance to the stock market. Mutual Fund types are really too thick to grasp exactly what's going on, so they think that this "slowing" in the growth has to do with the potential recession effecting GOOG.
<p>
Meanwhile, the real story is that Online Advertising Spending will continue to grow at about 30% per year for at least the next 3 years and GOOG is poised to take a disproportionate amount of that growth even if nothing else they do is even marginally successful.

</blockquote>

<p>

]]></description>
         <link>http://www.skrenta.com/2008/02/the_real_reason_googles_clicks.html</link>
         <guid>http://www.skrenta.com/2008/02/the_real_reason_googles_clicks.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Wed, 27 Feb 2008 14:14:45 -0800</pubDate>
      </item>
            <item>
         <title>Lamport&apos;s Bakery Algorithm</title>
         <description><![CDATA[<img src="/images/leslie-lamport.gif" width=150 height=200 align=right>

    <blockquote><i>
This paper describes the bakery algorithm for implementing mutual exclusion.  I have invented many concurrent algorithms.  I feel that I did not invent the bakery algorithm, I discovered it.  Like all shared-memory synchronization algorithms, the bakery algorithm requires that one process be able to read a word of memory while another process is writing it.  (Each memory location is written by only one process, so concurrent writing never occurs.)  Unlike any previous algorithm, and almost all subsequent algorithms, the bakery algorithm works regardless of what value is obtained by a read that overlaps a write.  If the write changes the value from 0 to 1, a concurrent read could obtain the value 7456 (assuming that 7456 is a value that could be in the memory location).  The algorithm still works.  I didn't try to devise an algorithm with this property.  I discovered that the bakery algorithm had this property after writing a proof of its correctness and noticing that the proof did not depend on what value is returned by a read that overlaps a write. 
 <p>
I don't know how many people realize how remarkable this algorithm is.  Perhaps the person who realized it better than anyone is Anatol Holt, a former colleague at Massachusetts Computer Associates.  When I showed him the algorithm and its proof and pointed out its amazing property, he was shocked.  He refused to believe it could be true.  He could find nothing wrong with my proof, but he was certain there must be a flaw.  He left that night determined to find it.  I don't know when he finally reconciled himself to the algorithm's correctness.
    <p>
    ...
    <p>
    What is significant about the bakery algorithm is that it implements mutual exclusion without relying on any lower-level mutual exclusion.  Assuming that reads and writes of a memory location are atomic actions, as previous mutual exclusion algorithms had done, is tantamount to assuming mutually exclusive access to the location.  So a mutual exclusion algorithm that assumes atomics reads and writes is assuming lower-level mutual exclusion.  Such an algorithm cannot really be said to solve the mutual exclusion problem.  Before the bakery algorithm, people believed that the mutual exclusion problem was unsolvable--that you could implement mutual exclusion only by using lower-level mutual exclusion.  Brinch Hansen said exactly this in a 1972 paper.  Many people apparently still believe it.    <p>    ...
    <p>
    <b>For a couple of years after my discovery of the bakery algorithm, everything I learned about concurrency came from studying it.</b>
    ...  The bakery algorithm marked the beginning of my study of distributed algorithms.<br>
&nbsp; &nbsp; -- <a href="http://research.microsoft.com/users/lamport/pubs/pubs.html#bakery">Leslie Lamport</a>
    </i></blockquote>

<p>

I find this story fascinating.  Lamport has invented a
bunch of cool algorithms.  But here he describes having
"discovered" the <a href="http://en.wikipedia.org/wiki/Lamport's_bakery_algorithm">Bakery algorithm</a>, and then spent years
studying the algorithm that he had written afterwards.
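
<p>
For reference, here is the textbook form of the algorithm as a small Python sketch.  The <code>BakeryLock</code> and <code>run_demo</code> names are mine.  One big caveat: CPython makes each individual list read and write effectively atomic, so this only shows the ticket-taking logic; it does not exercise the remarkable property from the quote, that the algorithm survives reads that overlap writes.

```python
import threading
import time


class BakeryLock:
    """Lamport's bakery lock for a fixed set of threads numbered 0..n-1."""

    def __init__(self, n):
        self.n = n
        self.choosing = [False] * n  # True while thread i is picking a ticket
        self.number = [0] * n        # ticket held by thread i; 0 means not waiting

    def acquire(self, i):
        # Doorway: take a ticket one higher than any ticket we can see.
        self.choosing[i] = True
        self.number[i] = 1 + max(self.number)
        self.choosing[i] = False
        for j in range(self.n):
            if j == i:
                continue
            # Wait until thread j has finished picking its ticket.
            while self.choosing[j]:
                time.sleep(0)
            # Wait while j holds an earlier ticket; ties go to the lower thread id.
            while self.number[j] != 0 and (self.number[i], i) > (self.number[j], j):
                time.sleep(0)

    def release(self, i):
        self.number[i] = 0


def run_demo(n_threads=3, rounds=100):
    """Have several threads bump a shared counter, guarded only by the bakery lock."""
    lock = BakeryLock(n_threads)
    state = {"counter": 0}

    def worker(i):
        for _ in range(rounds):
            lock.acquire(i)
            current = state["counter"]
            time.sleep(0)  # invite a thread switch in the middle of the update
            state["counter"] = current + 1
            lock.release(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]


if __name__ == "__main__":
    print(run_demo())
```

Each thread writes only its own slots in <code>choosing</code> and <code>number</code>, which is the "each memory location is written by only one process" condition from the quote.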

<p>

How many of us find a solution to a problem, and then spend
years studying the solution, learning from it?  Actually I
think I've learned more from studying bugs in my code than
algorithms.  If I could just avoid ever coding any bugs...

<p>

Lamport has done <a href="http://research.microsoft.com/users/lamport/pubs/pubs.html">a bunch of
other stuff</a>, including inventing <a
href="http://en.wikipedia.org/wiki/Paxos_algorithm">Paxos</a>,
the distributed consensus algorithm behind 
google's distributed lock manager <a
href="http://research.google.com/archive/chubby-osdi06.pdf">Chubby</a>.
<p>

]]></description>
         <link>http://www.skrenta.com/2008/02/the_bakery_algorithm.html</link>
         <guid>http://www.skrenta.com/2008/02/the_bakery_algorithm.html</guid>
                  <category domain="http://www.sixapart.com/ns/types#category">main</category>
        
        
         <pubDate>Fri, 22 Feb 2008 11:06:02 -0800</pubDate>
      </item>
      
   </channel>
</rss>
