Yahoo's is "Slurp"
Cuill's is "Twiceler"
It makes sense to have a friendly robot user agent, so nervous webmasters won't ban it. You don't want to call your crawler 'sitejacker' or something. Unfortunately my favorite candidates were:
Crawlhammer
Webraker
Lurchy
Client9
hmmm. :-|
"Oh no! It's CrawlHammer!!"
If even in your heart you hide the urls ... there it shall rake for them...
...
Does anyone know what the purpose of a '+' in front of the URL in a robot's user-agent is? Some sites put in the '+', others don't...
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 (compatible; Ask Jeeves/Teoma; +http://about.ask.com/en/docs/about/webmasters.shtml)
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)
Gigabot/3.0 (http://www.gigablast.com/spider.html)
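If you're grepping these out of your own logs, here's a rough Python sketch that pulls the info URL out of a robot user-agent whether or not it has the '+'. The function name robot_info_url and the regex are my own invention for illustration, not anything the crawlers document, and it simply assumes the '+' is optional decoration in front of an http or https URL.

```python
import re

# Hypothetical pattern: zero or more '+' signs, then an http(s) URL that
# runs until whitespace, a semicolon, or a closing parenthesis.
_INFO_URL = re.compile(r"\+*(https?://[^\s;)]+)")


def robot_info_url(user_agent):
    """Return the first info URL in a user-agent string, or None."""
    match = _INFO_URL.search(user_agent)
    return match.group(1) if match else None


samples = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
    "Gigabot/3.0 (http://www.gigablast.com/spider.html)",
]

for ua in samples:
    # The leading '+' (if any) is dropped; only the bare URL is printed.
    print(robot_info_url(ua))
```

With or without the plus sign, the same bare URL comes back, so for log analysis at least the '+' doesn't have to matter.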
Comments (5)
My favorite robot name was the old Interpix robot, "iSpi"
It crawled for images for the Image Search feature that they supplied to Yahoo in the mid-to-late 1990s.
Posted by Mark | April 16, 2008 7:33 PM
A bunch of Google old-timers came together today on an email thread to discuss the background on the '+'. I'll spare you the story and just let you know that you don't need to put a plus sign in the user-agent.
Posted by Matt Cutts | April 16, 2008 7:40 PM
Thanks Matt! But I'd still love to hear the story... :)
Posted by Rich Skrenta | April 16, 2008 11:04 PM
I'd recommend something like Slimey, the worm that Oscar the Grouch watches over, but webmasters might be a bit leery of worms as well. :)
Let's dissect what the fears generally are:
1. It might go the way of Cuill and take down the damn webserver (we had to ban Cuill's IP range for doing this).
2. It might just be a scraper.
So, if you can get something that conveys the "I'll go slowly and not steal from you" message, win for you.
How about...
Safeslug
Snaildex
Charlotte (you know, from Charlotte's Web)
Posted by Cygnus | April 17, 2008 7:26 AM
Here are some of my favorites from our logs:
DuckDuckBot/1.0 I'll play this with my kids this weekend.
focuseekbot. Do you pronounce that the F-U seek bot?
Following the + in the URL meme, how about ++ before https?
CityTwist/0.1;++https://
Posted by Jason Culverhouse | April 18, 2008 9:35 AM