
blekko is hiring

blekko is building a new search engine from scratch and I'm looking to hire a few more coders.

Search is an absolutely fascinating problem to work on for a bunch of reasons. For one thing, you have to scale the thing before getting the first user. You can't just start with a server or two and add more when the users come. Step 1 is to copy the internet onto your cluster. Step 2 is to analyze it.

The componentry is remarkably deep.

Search is like 7 hard problems wrapped into a stack: distributed systems, HTML analytics, text analytics/semantics, anti-spam, AI/ML, frontend/UI. And scale... Apart from the sexy high-end algos there are also the boring 10-year-old system libraries and off-the-shelf tools that crack under stress and sometimes need a look. You open the hood and wonder how the thing ever worked in the first place...

Plus there is something fresh and new every day when you mine through the vast sordidness of the many billions of pages on the web. You expect to be amazed at the endless varieties of crazy porn domains and new approaches to webspam. But there are equal horrors in the small: pathological charset issues, previously undiscovered abominable server implementations, psychopathic website owners. The web is a reactive fuzz test.

I know there are some great coders out there reading this blog who would have a blast working on some of the pieces here that need to get built. This is a great opportunity to join an experienced team early, building a big system from the ground up. If you think you might be interested, send me an email and we can chat.

FYI, our interviews always have coding tests. Primarily we are looking for folks who love to write code and are good at it. :)

Comments (25)

" You can't just start with a server or two and add more when the users come."

Dude... just run Ruby on Rails. Problem solved!

SHumphreys:

Note recent article on Slashdot...
"According to TechCrunch, Twitter has plans to abandon Ruby on Rails after two years of scalability issues."

Well, the 8th and most important problem of search engines is being special, and giving more than Google. Good luck with your development, but as I have a blog search engine, I know it's a hard question.

Steve Iams:

Good Luck Rich!

Sounds like a fascinating project :)
Wish I could help >.> If I wasn't such a noob I would probably be all over this.

Best of luck to you :)

Interesting. I came up with an algorithm a while back that does not rely on inbound links whatsoever as a factor. Instead it looks for advanced and simple seo black hat items and penalizes them, then takes into account variables to determine which website is the most relevant for a given search word/phrase.

While it uses an advanced point system, the actual points negative or positive paired with the content the points are given-to/subtracted-from are the secret behind a successful relevant search engine.

I am curious to see how you can implement something similar to overtake Google ;)

Jose:

Someone said earlier that it's important to give more than Google.

I disagree. The trick is to offer less than Google!

It's all in the AI.

Best

Jose

I found your project through a crawler on my website and was curious where that crawler came from.

Good luck guys !

Greets from Germany
Marc


P.S. Sorry for my "bad" English

hey this looks good, seen you guys in my crawler stats, you may index my site and all links from it.

keep up the good work!!!

holly

Roy:

Isn't it kinda hard to just start a search engine? I mean, you might make one, but how will you get people to go onto it, rather than Google?

Good luck...you'll need it

Holly,

Google did it a number of years ago. It just takes one heck of a marketing plan and the capital to push it properly.

Good luck Rich

Sean:

Good Luck with the David and Goliath Thing.

PS, Can you put my website on the #1 Spot when you launch :-)

Your crawler came knocking so I wanted to see what's up. Too bad no free samples!!!
LOL

Is it possible to open some parts of the project as opensource? I would love to start helping with very simple parts like discussions and creating mobile interfaces, for example.

Not to be a Debbie Downer, but if everyone's a coder, who's running the asylum?

We have 10 engineers and 1 vp-of-everything-else. :)

Justin Van Winkle:

When you mentioned the internet-as-fuzztest I thought of something that blew my mind.

Imagine someone attacking your app (which is attempting to validate, index, and carefully analyze the structure of input) every day with very ingenious and carefully crafted bogus input. Now imagine someone doing it with gigabytes of bogus input. Now imagine doing it with hundreds of gigabytes. Now imagine another 200k people doing it, and lots of them are smarter than you, way way smarter, and know way more. Now imagine your app can't crash or serve dangerous content, ever, or you're screwed. And it can't serve innocuous but bogus content, or you fail.

Good luck!

shocky:

Hey guys,
You guys are doing great, you have the knowledge and experience to do this, don't let anyone stop you or let you down. Make sure you also focus on a good revenue plan, a good source of income for the company and all will be great!

Wish you guys the best, and everytime you feel that you are under pressure of the larger corporations, just kick some ass!

Mark:

Good luck! Hopefully you guys will get at least 1% of the market share! If you need a designer to do some work, even for free (for now) lemme know. =)

Congrats on your recent funding success! Hope all goes well!

Hey, this is a great idea. but can you tell me / us, when the beta is starting? i'd like to get my hands on it.

c-ya
n

Eric Kotonya:

With lots of Web 2.0 sites yet to be properly crawled and indexed by older engines, technically there is a gap for improvement - and business.

I think u guyz can pull it off and build a valuable product with some web analytics and collective intel techniques.

Search works when you know exactly what you want and how the world refers to it. Otherwise, you're SOL!

When presented to the user, search is exceedingly simple. Do the search results fit that user's contextual needs at that moment? Or, like Google, Bing, Wisenut, MSN, Yahoo, and all the other "search engines", did you pepper them with offers for "Nikes" when they were there researching arch designs for flat-footed autistic kids? If they searched on "that info" you and I know they'd have seen everything offered from soup-to-nuts, so even search-specificity has been programmed out of the user's search toolkit by the irrelevance-inspired Keywords industry!

The long-tail exists mostly because nobody wants that stuff. Not the first time, and certainly not offers 2 through 7,954 from Overstock.com and Smartbuys.com.

Nope, if you can contextualize the user's visit and consider "now" to be more than a mere regurgitation of "then", perhaps you can then use your signal processing skills applied to streams of "cogently-classified" data points presented as a sequence of internet transactions to predict what that user is looking for right now!

Culturally cogent content-classification is the only workable answer, and using semiotics and signal processing is the way I do it now instead of pretending to do it before with vector-cosines and high-performance computing scientists making a mess of every analytic they need but can't quite figure out why!

TV

So, now that Yahoo! has finally given in to Microsoft.. what does this mean for Blekko? Still moving forward? Any updates?

By the way, I do like the name. Maybe change it to something that sounds powerful like that, but that doesn't sound dumb when using it instead of "google". You can't really "bing" things. "Blekkoing" sounds off too (and I know this is only a stealth name), but you might be on to something..

About

This page contains a single entry from the blog posted on May 1, 2008 12:11 PM.
