June 1, 2009

Bingram BetaHoo - poking at a few Bing queries

I like Bing! Bing.com is live and it looks really cool. Very fast, clean UI, strong navigational results, nice extra features like the hover panes, aggressive title relevance, plus all the vertical sub-engines. People like it.

That said it's brand new and we all want to kick the tires.

Search engines are built out of a lot of layered systems. One part can be working great but be subverted by another part that has a gap. Like any product there are always bugs to be fixed and improvements to be made. So launch day isn't the final word on relevance. But it's interesting to survey a variety of results to poke around.

  • Overall the navigational results seem very strong.

  • Bing is doing aggressive title rewriting to boost perceived relevance. Google has done some of this for a while - note the title change on the same url based on the query - [skrentablog] vs. [rich skrenta].

    The "Skrenta, Rich" title came from dmoz.

    Bing is going farther. Sometimes it makes the result look better than Goog's, e.g. [san carlos art and wine fair]. But others are odd, like result #3 for [mike arrington]. That funny-looking title looks like it came from anchortext.

  • Bing's indexing of *.blogspot.com seems really limited. For instance [radish king] doesn't turn up radishking.blogspot.com. Site:blogspot.com on bing returns an estimate of just 560k results. Compared to Google (340m) and Yahoo (230m), Bing's blogspot index seems tiny. Other blogspot sites I've gone looking for are missing too. I wonder if this is some kind of rank or index penalty given the large amount of blogspot spam, or if there is some other issue with their crawl.

  • [michael arrington] vs. [mike arrington]. TechCrunch is #2 for Michael Arrington, but is way down at the bottom of the page for Mike Arrington. This seems to be the fault of the section-ized results; it's under a heading called "Mike Arrington Blog". As others have noted I'm not a big fan of sections or universal search style sections on result pages. It's unfortunate to see a strong result for the query get pushed that far down.

  • Bing, like Google, returns Dogpile and AltaVista for [search engine]. (Yahoo looks like they manually pinned a couple of results for this query.)

Overall the few bugs I've seen are relatively minor issues in the scheme of the entire product and I'm sure will eventually be addressed by the Bing engineers. It's so cool to have a powerful new engine out with interesting results. Kudos, Microsoft!

April 21, 2009

Topix passes USA Today to become #1 online site for Gannett, Tribune and McClatchy

Four years after our deal to sell a majority of Topix to the top three US newspaper companies, Topix becomes the #1 online property for Gannett, Tribune and McClatchy.

Congrats to the Topix team on the fantastic recent site growth!

April 9, 2009

blekko's ambient cluster health visualization

When you have several hundred servers in a cluster, knowing the state and health of all of them can be a challenge. Traditional pager alert systems can often either log too many events, which makes people tune them out, or they miss non-fatal but still serious server sickness, such as degraded disk/cpu/network performance or subtle application errors.

This becomes especially true when the cluster and application are designed for high availability. If the application is doing its best to hide server failures from the user, it's often not apparent when a serious problem is developing until the site fails in a more public or obvious way.

We called these "analog failures" at Topix. There was a fairly complicated chain of processing for incoming stories that had been crawled. Crawl, categorize, cluster, dedup, roboedit, push to front ends, and push to incremental search system. Once an engineer mistakenly deleted half of the sources from our crawl, and it took us a disturbingly long time to notice. The problem was that, while overall we had half as many stories on the site, most pages still had new stories coming in, so we didn't notice that anything was wrong.

Sometimes a server has a messed up failure, like its networking card starts losing 50% of its packets, but stuff is still getting through. Or a drive is in the process of failing, and its read/write rate is 10% of normal, but it hasn't failed enough to be removed from service yet. The cpu overheated and is running at a fraction of its normal speed. There seem to be limitless numbers of unusual ways that servers can fail.

At blekko, there are dozens of stats we'd ideally like to track per host:

  • How full is each of the disks?
  • Are there any SMART errors being reported from the drives?
  • Are we getting read or write errors?
  • What is the read/write throughput rate? Sometimes failures degrade the rate substantially, but the disk continues to function
  • What is the current disk read latency?
  • Is packet loss occurring to the node?
  • What is the read/write network throughput?
  • What is the cpu load?
  • How much memory is in use?
  • How much swap is being used?
  • How big is the kernel's dirty page cache?
  • What are the internal/external temperature sensors reading?
  • How many live filesystems are on the host vs. dead disks?

Other stats pertain to our cluster datastore:

  • How many buckets are on each host?
  • Is the host above or below goal for its number of buckets?
  • What is the outbound write lag from the host?
  • What is the maximum seek depth for a given path/bucket?
  • Do we have three copies of every bucket (R3)?
  • If we're not at R3, how many bucket copies are occurring?
  • For running mapjobs, what is their ETA + read/write/error rate?
  • Are the ram caches fully loaded?
  • Are we crawling/indexing, what is the rate compared with historical?

The first step is to start putting the stats you want to be able to see into a big status table. But at 175 hosts, the table is kind of long, and it's hard to spot developing problems in the middle of the table.

So we have been experimenting with mapping system stats onto different visualizations, so we can tell at a glance the overall state of hundreds of servers, and spot minor problems before they grow.

A table with 175 rows is pretty long, but you can fit 175 squares into a very small picture. This table shows overall disk usage by host. The color of the tile shows the disk usage: red is 90%, orange is 80%, yellow is 70%, blue is below 60%. Dead filesystems on the node are represented by grey bars inside the tile. The whole grid is sorted worst-to-best, so it's easy to see the fraction of hosts at a given level of usage.
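Here's a rough sketch of the tile logic described above, not our actual display code. The function names and the sample hosts are made up for illustration, and since the thresholds above leave the 60-70% band unspecified, the sketch simply treats it as blue (an assumption).

```python
def tile_color(usage_pct):
    """Map a host's disk usage percentage to a tile color."""
    if usage_pct >= 90:
        return "red"
    if usage_pct >= 80:
        return "orange"
    if usage_pct >= 70:
        return "yellow"
    # The post only says blue is "below 60%"; treating 60-70% as blue
    # too is an assumption made for this sketch.
    return "blue"


def usage_grid(usage_by_host):
    """Return (host, color) tiles sorted worst-to-best by disk usage."""
    ordered = sorted(usage_by_host.items(), key=lambda kv: kv[1], reverse=True)
    return [(host, tile_color(pct)) for host, pct in ordered]


# Hypothetical hosts and usage figures, purely for illustration.
sample = {"h1": 55, "h2": 93, "h3": 81, "h4": 72}
for host, color in usage_grid(sample):
    print(host, color)
```

Sorting before coloring is what makes the fraction of hosts at each level readable at a glance: the red tiles all end up together at the front of the grid.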

Our datastore uses a series of buckets (4096 in our current map) to spread the data across the servers. Each bucket is stored three times. If we have three copies of every bucket, we're at "R3". This is the standard healthy state of the system.

Because fetch/store operations will route around failures, it's not at all apparent from the view of the application if some buckets do not have three copies, and the cluster is degraded. So we have a grid of the buckets in our system, color coded to show whether there are 0/1/2/3 copies of the bucket.

In the above picture, the set of buckets in red have only 1 copy. The yellow buckets have 2 copies, and the green have three. We have a big monitor with this display in our office, if it ever shows anything but a big green "3" folks notice and can investigate.
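The bookkeeping behind that grid can be sketched roughly like this. It is an illustration under assumptions, not our datastore code: the input shape (a dict of host to the bucket ids it holds), the function names, and the color for zero copies are all hypothetical.

```python
from collections import Counter

NUM_BUCKETS = 4096      # bucket count from the current map
TARGET_COPIES = 3       # "R3"
# Red/yellow/green follow the description above; the color for a bucket
# with zero copies is an assumption.
COLORS = {0: "black", 1: "red", 2: "yellow", 3: "green"}


def replication_levels(buckets_by_host, num_buckets=NUM_BUCKETS):
    """Count how many buckets currently have 0, 1, 2 or 3 copies."""
    copies = Counter()
    for buckets in buckets_by_host.values():
        for bucket in set(buckets):     # a host counts once per bucket
            copies[bucket] += 1
    return Counter(min(copies[b], TARGET_COPIES) for b in range(num_buckets))


def cluster_level(levels):
    """The cluster is only as healthy as its least-replicated bucket."""
    return min(level for level, count in levels.items() if count)


# Tiny hypothetical cluster with 4 buckets instead of 4096.
hosts = {"a": [0, 1, 2, 3], "b": [0, 1, 2], "c": [0, 1]}
levels = replication_levels(hosts, num_buckets=4)
print({COLORS[k]: v for k, v in sorted(levels.items())})
print("R%d" % cluster_level(levels))
```

Reporting the minimum rather than the average is the point: one bucket stuck at a single copy should knock the big green "3" off the monitor.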

For variety we've experimented with other ways to show data. This display is showing the fraction of a path in our datastore which has been loaded into the ram cache. Ram cache misses will fall back to disk, so it's not necessarily apparent to the user if the ram cache isn't loaded or working. But the disk fetch is much slower than the ram cache, so it's good to know if some machines have crashed and the ram cache isn't at 100%.

Other parts of the display are standard graphs for data aggregated across all of the servers. These are super useful to spot overall load issues.

We're still experimenting with finding the best data to collect and show. But the ambient displays so far are a big win. Obvious issues are immediately visible to everyone in our office. And people will walk by and look at the deeper graphs and sometimes spot issues. Taking the data from being something where you would have to proactively type a cli command or click around on some web forms, to displays that engineers will stop and look at for a few minutes on their way to/from getting a coffee or soda has been a big improvement in our awareness and response to cluster issues.

April 8, 2009

Bryn turned me into a muppet

March 14, 2009

The news medium has a message: "Goodbye"

Every so often there's a story about a technophobe executive so out of touch a secretary has to print out their email every morning so they can read it on paper and dictate replies.

That's what the print newspaper is, of course. Why on earth would you print all that stuff out? Over a hundred pages, most of which you're not going to read, with the crease down the middle of the front page photo, story jumps everywhere, a carbon-footprint disaster to produce, distribute and recycle. It's absurd.

Back in 1980 newspapers were the main way that bytes flowed into people's homes. Radio and TV for audio/video, but the newspaper delivered the bytes that were read like the text-based web.

I once worked out some rough back-of-napkin estimates on the number of text bytes in the paper. It was only delivered once during the day, but if you average the bytes across the entire 24 hour period it came out to be about the rate of a 300 baud modem. The newspaper was the internet.
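The napkin numbers aren't reproduced here, but the comparison can be run in reverse to see what a 300 baud modem delivers in a day. This sketch assumes one bit per baud and 10 bits on the wire per character (8 data bits plus start and stop bits); both are assumptions about the modem, not figures from the original estimate.

```python
BITS_PER_SECOND = 300            # 300 baud, assuming one bit per symbol
BITS_PER_CHAR = 10               # assumption: 8 data bits + start + stop
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(bits_per_second=BITS_PER_SECOND, bits_per_char=BITS_PER_CHAR):
    """Characters a modem running flat out could deliver in 24 hours."""
    chars_per_second = bits_per_second // bits_per_char
    return chars_per_second * SECONDS_PER_DAY


print(bytes_per_day())  # 2592000
```

So under those assumptions the comparison implies something on the order of 2.6 million characters of text per day.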

It was mostly one way - except for all those classified ads and the letters to the editor. It was really a lot more like AOL, since it was centrally controlled and edited.

But it did represent the sole text byte pipe into the home. And so it contained every content vertical, all in one package. National news, world news, local community sections. Little league scores and the NFL. Weather, stock tables, TV listings, home sales. Advertising, both national, local and personal. Games and political commentary and the police blotter. Everything.

Fortified by the high cost of the printing press and the limited radius of delivery trucks there was a natural local monopoly to these things. And indeed, they were a wonderful business, a so-called license to print money. Huge fortunes were made.

That's all over now of course. The subsidy that classifieds supplied for bureaus in distant cities is gone. The class of professional reporters as we know them is going to be smaller and funded differently.

I was at the TechCrunch office welcoming party last night, and was struck by how unassuming the offices were. This was the big move up, of course. They were still unpacking after moving out of Mike Arrington's house. But it was a small office with a few desks scattered around, a handful of computers. I've toured the massive AP newsroom, rebuilt in 2004 to cater to every desire of a journalist. The Reuters newsroom had pods that look like they were inspired by Norad in Wargames, with circular banks of monitors around central stations, all showing live feeds or charts from various sources. The old Mercury News offices were vast.

TechCrunch was a modest affair by comparison. So this is where it all happens..., I thought. This is what the modern business press looks like now.

Get used to it.

November 22, 2008

Detecting spam from http headers?

Greg Linden describes a paper about finding spam simply by inspecting the returned http headers:
In our proposed approach, the [crawler] only reads the response line and HTTP session headers ... then ... employs a classifier to evaluate the headers ... If the headers are classified as spam, the [crawler] closes the connection ... [and] ignores the [content] ... saving valuable bandwidth and storage.

We were able to detect 88.2% of the Web spam pages with a false positive rate of only 0.4% ... while only adding an average of 101 [microseconds] to each HTTP retrieval operation .... [and saving] an average of 15.4K of bandwidth and storage.

After running web crawls for the past year and finding all manner of spam, I have to say I'm skeptical this technique would really catch much spam on the actual web. Among the top 10 http header features they identify as spam-predictors are:

  • Accept-Ranges: bytes
  • Content-Type: text/html; charset=iso-8859-1
  • Server: Fedora
  • X-powered-by: php/4
  • 64.225.154.135

These are pretty standard-looking headers. Let's look at some actual spam though and see if we can see anything funny.

$ curl -I http://www.fancieface.com/
HTTP/1.1 200 OK
Date: Sat, 22 Nov 2008 19:13:11 GMT
Server: Apache/1.3.26 (Unix) mod_ssl/2.8.12 OpenSSL/0.9.6b
Last-Modified: Tue, 21 Oct 2008 11:51:10 GMT
ETag: "2081cc-ba62-48fdc22e"
Accept-Ranges: bytes
Content-Length: 47714
Content-Type: text/html

Very spammy site, but totally vanilla headers. How about some rolex watch spam:

$ curl -I http://superjewelryguide.com/300.html
HTTP/1.1 200 OK
Date: Sat, 22 Nov 2008 17:48:26 GMT
Server: Apache
X-Powered-By: PHP/5.2.6
Content-Type: text/html

Again, pretty vanilla. Plus this technique isn't going to work at all for spam hosted within trusted domains. Here's some cialis spam smeared onto a my.nbc.com page:

$ curl -I http://my.nbc.com/blogs/GaryRobinson/main/2008/10/13/cialis-cheapest-cialis-pills-here
HTTP/1.1 200 OK
Server: Apache/2.2.0 (Unix) DAV/2 PHP/5.1.6
X-Powered-By: PHP/5.1.6
Wirt: (null)
Content-Type: text/html
Expires: Sat, 22 Nov 2008 19:16:33 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Sat, 22 Nov 2008 19:16:33 GMT
Content-Length: 0
Connection: keep-alive
Set-Cookie: pers_cookie_insert_nbc.com_app1_prod_80=1572983360.20480.0000;
        expires=Sat, 22-Nov-2008 23:16:33 GMT; path=/

but very fishy headers! :-)

It's incredibly difficult to get a high quality random sample of the web. You can't factor crawler strategy bias out of the sample, and any small sample is not necessarily going to be very representative.

If the researchers did find good coverage with quirky headers and even individual ip addresses, I suspect that the crawl they're using may be over-weighted in pages from a few servers that spewed out a lot of urls/virtual hosts.

November 21, 2008

Thank heaven for tax refunds

In 2000 before the dot-com meltdown I bought a few cases of french bordeaux. Even though I like bordeaux, it half-seemed like a silly purchase at the time, but when the wine arrived I was happy because the bordeaux had risen in value since I purchased it, but due to the stock market death-spiral my accounts had gone down in the meantime. win, sorta.

Unfortunately there was also a bmw 540 that I decided was too indulgent to buy and passed on. Afterward I kicked myself -- it would have been free. I would have exercised some netscape options I had to buy it. I held onto them, eventually they declined in value until they were worthless. I should have bought the car!

I saw a joke circulating at the time that beer would have yielded a better return than some stocks. The beer bottles could be returned for the 5 cent deposit, but stocks became worthless. Plus you would get to drink the beer.

Now we're going through it again, but even worse. The banker line now is that it's not the return on your capital that you should be worried about, it's the return of your capital.

I just got a state of California tax refund check. Normally it's inefficient to pay too much withholding, essentially lending the government your money interest-free until tax time. In this case though it turned out to be a decent investment. :-|

November 14, 2008

Cold calls, cold response

Every few days cold-calling salespeople show up at our office unannounced to pitch us on insurance, lease deals, laser toner, office supplies, voip plans, bottled water, etc.

We have an open office. So when they enter, 11 people immediately look up at them. This can apparently be somewhat intimidating, based on their flummoxed reactions. They usually ask for a business card so they can call us later. I sometimes offer them mine, since my card doesn't have a phone number on it. Then they beat a hasty retreat.

Lately we've been trying a new tactic - not acking their presence when they come in. There's no receptionist (of course), and it's not clear who they should attempt to speak with. None of us really want to listen to their pitch or take their flier anyway, so playing the game of chicken with the other folks in the office sort of emerged as a default behavior. Who will be the first to crack at their nervousness, make eye contact, and thus become the dupe left holding the flier or handing out their business card?

I almost feel sorry for them. Almost!

November 2, 2008

Lucy on Elections

It's hard being a campaign worker.
We're completely at the mercy of our candidate.
We do all the work, and the candidate gets all the credit.
We ring doorbells, and make the posters, and build up the candidate's image.
And then he says something stupid, and ruins everything we've done.

The next time I do any campaigning, it's gonna be for myself!

      -- Lucy, You're (not) elected, Charlie Brown

October 29, 2008

Retro Conservation Advertising

The modern green/eco movement is bringing back the idea of eating local, having a garden, saving energy, etc. and pointing out the links between items (like bottled water and oil).

But we've been here before. Check out these WWI gov't posters.


"Don't waste paper - a pound of paper wasted is a pound of fuel wasted"


"Keep the home garden going"

Check out all the detailed instructions in that one. Public education indeed.

More posters...

October 23, 2008

What's up Rich

If blogging is dead it must be time to start Skrentablog up again. Apologies for letting the blog go dormant the last little while, I've had my head down in technology. Quick update: 200 servers, 11 employees, lots of code. Crawl, index, test, repeat.

We hired a naming firm to come up with a better name than 'blekko', they did a great job. Down to two candidates. Testing them.

We built a wicked cluster platform to run our stuff. It's kind of like bigtable from the top-down api view but is an integrated design, vs. the layered impedance mismatches with stuff like gfs/chubby. No masters, all swarm algos. We crawl/index/serve into structured storage. It's very fast, has integrated mapjobs, and is really easy to program on top of. I'll post more details about it in the future.

More posts to come, I promise.

May 1, 2008

blekko is hiring

blekko is building a new search engine from scratch and I'm looking to hire a few more coders.

Search is an absolutely fascinating problem to work on for a bunch of reasons. For one thing you have to scale the thing before getting the first user. You can't just start with a server or two and add more when the users come. Step 1 is to copy the internet onto your cluster. Step 2 is to analyze it..

The componentry is remarkably deep.

Search is like 7 hard problems wrapped into a stack. Distributed systems, html analytics, text analytics/semantics, anti-spam, AI/ML, frontend/UI. And scale... Apart from the sexy high end algos there are also the boring 10-year old system libraries and off-the-shelf tools that crack under stress and sometimes need a look. You open the hood and wonder how the thing ever worked in the first place...

Plus there is always something fresh and new every day mining through the vast sordidness of the many billions of pages on the web. You expect to be amazed at the endless varieties of crazy porn domains and new approaches to webspam. But there are equal horrors in the small, finding pathological charset issues, previously-undiscovered abominable server implementations, psychopathic website owners. The web is a reactive fuzz test.

I know there are some great coders out there reading this blog who would have a blast working on some of the pieces here that need to get built. This is a great opportunity to join an experienced team early building a big system from the ground up. If you think you might be interested, send me an email and we can chat.

fyi our interviews always have coding tests. Primarily we are looking for folks who love to write code and are good at it. :)

How Fake Luxury Conquered the World

The legend says that once upon a time there was a General Motors. This General Motors, GM for short, had a car and a brand for every need, along the plan developed by the great Alfred Sloan prior to the Second World War. There were Chevrolets for regular folk, Pontiacs for the cautious old people (and, thanks to John Z. Delorean's development of the 1964 GTO, for angry young people as well), Buicks and Oldsmobiles for doctors and successful businessmen, and Cadillacs at the very top, for the most successful men in the land.
...
It would have stayed that way forever, but one day a mysterious yet important man at GM had a mysterious yet important idea: Executives should drive cars from their own division!

Which leads to every division of GM building their own version of the Cadillac.

Read more: How Fake Luxury Conquered The World

(thanks Bryn for the tip)

April 24, 2008

Microsoft bias in MSN search results, surprise

I was looking to see what search sites might have a particular bug that I (ahem) came across and was trying the search for the number 0 in various places. There is a pretty good Wikipedia page about zero. Zero has a rich and interesting history and there are many other potentially reasonable results.

But I was surprised to see MSN search had demoted their good results below some crappy ones from MSDN:

Lame! Falling into an inferior lex position and a lower overall relevance page to boost their own network results...give em credit for being old school. :)

...

I found my bug on Yahoo Search. I had tried a lot of smaller engines first because I didn't think a major would have this bug. You can't search for 0 on Yahoo. You can search for all the other numbers, but not 0 ...

Why?.. Because 0 is false. It suggests Yahoo is using a scripting language to front their search form, and a programmer did something like if ( $query ) rather than if ( $query ne '' ).
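To make the guess concrete, here is a small Python sketch of the suspected bug. Python itself treats the string "0" as true, so the sketch emulates the scripting-language rule in question (empty string and "0" are both false). The function names are hypothetical and this is only an illustration of the guess, not Yahoo's actual code.

```python
def scripting_truthy(value):
    """Emulate the truth rule at issue: "" and "0" both count as false."""
    return value not in ("", "0")


def buggy_has_query(query):
    # Equivalent of: if ( $query )
    return scripting_truthy(query)


def fixed_has_query(query):
    # Equivalent of: if ( $query ne '' )
    return query != ""


for q in ["", "0", "1", "zero"]:
    print(repr(q), buggy_has_query(q), fixed_has_query(q))
```

The two checks disagree only on "0", which matches the symptom: every other number is searchable, but a search for 0 is treated as if nothing was typed.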

April 22, 2008

Hypertable architecture talk Wednesday in Palo Alto

Doug Judd will be discussing the internals and architecture of Hypertable tomorrow in Palo Alto at 6:30pm.

Hypertable is an open source, high performance, distributed database modeled after Google's Bigtable. It differs from traditional relational database technology in that the emphasis is on scalability as opposed to transaction support and table joining. Tables in Hypertable are sorted by a single primary key. However, tables can smoothly and cost-effectively scale to petabytes in size by leveraging a large cluster of commodity hardware. Hypertable is designed to run on top of an existing distributed file system such as the Hadoop DFS, GlusterFS, or the Kosmos File System (KFS). One of the top design objectives for this project has been optimum performance. To that end, the system is written almost entirely in C++, which differentiates it from other Bigtable-like efforts, such as HBase. We expect Hypertable to replace MySQL for much of Web 2.0 backend technology. In this presentation, Doug will give an architectural overview of Hypertable. He will describe some of the key design decisions and will highlight some of the places where Hypertable diverges from the system described in the Bigtable paper.

More details.

Starbucks "re" branding

It will be interesting to see how the return of the original starbucks founder Howard Schultz and the return to their orig plan and ideas turns out.

He's had a successful stunt with the system closing for 3 hours to retrain workers in how to make coffee, which generated a lot of PR.

Now the introduction of the new house blend, named after the original starbucks store. But also, surprise! - the original logo is back.

Usually logos and identities get vaguer, cleaner and more abstract as the MBAs wash/rinse/repeat. Starbucks is going back to the gritty and vaguely obscene logo they launched with.

 

Deadprogrammer famously detailed the history of the Starbucks logo going back to a 15th century woodcut. The original logo was slightly sanitized, but each corporate revision made it more and more abstract and less recognizable as to what it actually was. My wife said "I had no idea there was even anything inside that circle, I had never looked until you pointed it out to me."

Face logos are great brands but they always seem to get watered down and more cartoony over time. This is the case with a lot of the face logos on food at the grocery store, the original versions were closer to actual faces rather than abstract logos (think Chef Boyardee here.)

This happened to KFC with the colonel...he started out as a realistic line drawing of Colonel Sanders with the company name - "Kentucky Fried Chicken." After the waves of rebranding stylists were done with him he was an abstract cartoon. They couldn't stop there and abbreviated the company name. You wouldn't want to realize you're eating FRIED CHICKEN when you're at KFC after all. You probably want to be eating a healthy salad with dressing on the side. That's why you went in there, right??

I bet Dunkin' Donuts wishes they could rename themselves "DD". Hmmm, maybe "empty vessel" names aren't so bad after all... :)

Interesting to think about brand identities that get going because they're a little gritty and different and personal, they don't start out whitewashed / washed out, but after getting successful they put on the bland suit. What would the AOL redesigners do to Drudge's site if they bought it?

April 16, 2008

Microsoft "hits back" at Google with re-launch of 4-year old Newsbot

The memecrowd sure has a short memory... maybe I'm just showing my age here, but still.
CNET: Microsoft hits back at Google with Live Search News
Search Engine Land: Microsoft Launches Live Search News
Search Engine Watch: Windows Live Search Offers Google News Alternative

MSN Newsbot? Anyone? From 2004:

CNET: Google News faces Microsoft rival (Jul 27, 2004)
Wash Post: Microsoft Deploys Newsbot To Track Down Headlines (Aug 1, 2004)
Geeking with Greg: MSN Newsbot review (Jul 27, 2004)

Web robot names considered, and rejected

Google's is "Googlebot"
Yahoo's is "Slurp"
Cuill's is "Twiceler"

It makes sense to have a friendly robot user agent, so nervous webmasters won't ban it. You don't want to call your crawler 'sitejacker' or something.. Unfortunately my favorite candidates were:

Crawlhammer
Webraker
Lurchy
Client9

hmmm. :-|

"Oh no! It's CrawlHammer!!"

If even in your heart you hide the urls ... there it shall rake for them...

...

Does anyone know what the purpose of a '+' in front of an url in the robots user-agent is? Some sites put in the '+', others don't...

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Mozilla/5.0 (compatible; Ask Jeeves/Teoma; +http://about.ask.com/en/docs/about/webmasters.shtml)

Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)

Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)

Gigabot/3.0 (http://www.gigablast.com/spider.html)
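This doesn't answer what the '+' is for, but anyone parsing these out of server logs can at least treat it as optional. A minimal sketch, where the regex and the function name are my own and not taken from any spec:

```python
import re

# Optional '+', then a URL running up to whitespace, ';' or ')'.
URL_IN_UA = re.compile(r"\+?(https?://[^\s;)]+)")


def robot_info_url(user_agent):
    """Return the info URL embedded in a crawler user-agent, if any."""
    match = URL_IN_UA.search(user_agent)
    return match.group(1) if match else None


agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
    "Gigabot/3.0 (http://www.gigablast.com/spider.html)",
]
for ua in agents:
    print(robot_info_url(ua))
```

With the '+' kept outside the capture group, all of the strings above yield a clean url either way.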

April 14, 2008

Cluster map propagation in Amazon Dynamo

Dynamo is Amazon's scalable key/value storage service. The paper is a good read, but I found the way the cluster node list information was propagated in dynamo to be a little odd. The algorithm is that every 60 seconds a node will talk to another node in the cluster, chosen at random, and exchange update information. I wondered how fast a change would propagate through the cluster, so I simulated the propagation.

For a 5,000 node cluster it takes about 9 update cycles for a change to reach every other node. Since each update is on a 60 second timer, that's 9 minutes for a change to push out.
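A simulation along these lines can be sketched in a few lines of Python. This is a reconstruction under assumptions rather than the exact code behind the number above: it assumes all nodes tick in lockstep, each node picks one uniformly random peer per cycle, and an exchange leaves both sides knowing the change. The function and parameter names are hypothetical.

```python
import random


def simulate_gossip(num_nodes=5000, seed=None):
    """Count update cycles until a change at node 0 reaches every node."""
    rng = random.Random(seed)
    informed = [False] * num_nodes
    informed[0] = True              # node 0 originates the change
    informed_count = 1
    cycles = 0
    while informed_count < num_nodes:
        cycles += 1
        snapshot = informed[:]      # what each node knew when the cycle began
        for node in range(num_nodes):
            peer = rng.randrange(num_nodes - 1)
            if peer >= node:
                peer += 1           # shift so a node never picks itself
            # Two-way exchange: if either side knew, both know afterwards.
            if snapshot[node] or snapshot[peer]:
                for n in (node, peer):
                    if not informed[n]:
                        informed[n] = True
                        informed_count += 1
    return cycles


if __name__ == "__main__":
    cycles = simulate_gossip(5000)
    print(cycles, "cycles, roughly", cycles, "minutes on a 60 second timer")
```

Using a snapshot of the start-of-cycle state keeps a node from learning and forwarding the change within the same cycle, which is the conservative reading of "every 60 seconds".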

I didn't do a very sophisticated time model..plus there is random start and all that. So maybe in practice it's a little different. But 9 minutes seems like a long time to propagate a host change out to the rest of the cluster. Maybe I mis-interpreted what they're doing?

I recall some confusion about whether Dynamo was actually providing SimpleDB, or if they were two separate software systems. Does anyone know if this was resolved?

April 9, 2008

AppEngine - Web Hypercard, finally

Google's AppEngine is being compared to Amazon's EC2/S3. But Google deserves credit here for coming up with a pretty differently-positioned product. There may be overlap for many users of course, but it's really operating at a whole different level of the stack.

Folks that want/need more control over the environment, ability to manually manage their own machine instances, run code other than python, etc. will stay with EC2. EC2 is a step above RackSpace.

But rather than thinking of AppEngine as a step above EC2, instead I think of it somewhere around Myspace. Or "Ning 1.0", as Zoho points out.

In the beginning was GeoCities... No, even further back, in the beginning was Hypercard. Hypercard was a pre-web application for Macs that let you design a "stack" of pages - a website on a floppy, really. Popular stacks got traded far and wide. Hypercard stacks existed for every imaginable purpose - "Time Table of History", games, crossword puzzles, the Bible, etc.

The thing about Hypercard was that it wasn't just static text and images like base html. It had a scripting language, a database, and the Apple UI built-in, so you could create mini applications.

It feels like the web has been trying to claw its way back to the simple utility of Hypercard ever since Mosaic. GeoCities was the first massive-uptake anyone-can-build-here website haven. But it was all static html.

Sure, you can paste javascript widgets onto your page, and have content driven by external sites. But to make the website a first-class object - on functional parity with a "real" website - it needs to be backed by a database and programmability. But setting up mysql, renting machine space, configuring linux, programming all the boilerplate, not to mention the scalability issues if your site gets popular -- this is all a big hurdle.

So to hide all those details behind a platform that's easy to get started with, and lower the bar to entry to writing public application websites... Well that's a big deal. Hats off to Google for bringing this to market.

I'm not alone...somewhat similar thoughts from Nate Westheimer...
