98% spam

Absolutely horrifying email spam stats from our new VP of ops at Topix:

We receive 25,000 mail connections per day; each connection is an attempt by some machine to send us mail. Of those 25,000 connections, we are able to reject nearly 80% outright, since the IP address of the originating machine is registered as a known spam offender. Using various additional checks and methods, we are able to reject about 98% of all incoming connections before they even have a chance to send us message content. Of the 2% that do make it to the point where they send us a complete mail message, we are able to reject 25% as containing spam, a virus, or an unsafe attachment type. That means that in the end only 1.5% of all attempted mail connections actually result in delivered mail.
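
A quick sanity check of that arithmetic, as a minimal Python sketch. The percentages and the daily connection count are the ones quoted above; the variable and function names are made up for illustration and have nothing to do with the actual mail setup.

```python
# Figures quoted in the stats above; names are illustrative only.
connections_per_day = 25_000
rejected_before_content = 0.98  # dropped before any message content is sent
rejected_after_content = 0.25   # of the rest: spam, virus, or unsafe attachment


def delivered_fraction(pre_reject: float, post_reject: float) -> float:
    """Fraction of all connection attempts that end in delivered mail."""
    return (1 - pre_reject) * (1 - post_reject)


fraction = delivered_fraction(rejected_before_content, rejected_after_content)
delivered_per_day = connections_per_day * fraction

print(f"{fraction:.1%} of connections, about {delivered_per_day:.0f} messages per day")
```

So the 2% that survive the connection-level checks, less the quarter rejected on content, leaves 1.5%, or roughly 375 delivered messages out of 25,000 attempts a day.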

That's push email. Consider the day in our near future when 98% of the HTTP-fetchable web is spam. Auto-generated text, on-the-fly scraper-reconstituters, and so forth.

The bright side: web spam is an evolutionary force that pushes relevance innovations such as TrustRank forward. Spam created the market opportunity for Google, when AltaVista succumbed in 1997-98. Search startups should be praying to the spam gods for a second opportunity. :-)


