
blekko is hiring

blekko is building a new search engine from scratch and I'm looking to hire a few more coders.

Search is an absolutely fascinating problem to work on for a bunch of reasons. For one thing, you have to scale the thing before getting the first user. You can't just start with a server or two and add more when the users come. Step 1 is to copy the internet onto your cluster. Step 2 is to analyze it.

The componentry is remarkably deep.

Search is like 7 hard problems wrapped into a stack: distributed systems, HTML analytics, text analytics/semantics, anti-spam, AI/ML, frontend/UI. And scale... Apart from the sexy high-end algos, there are also the boring 10-year-old system libraries and off-the-shelf tools that crack under stress and sometimes need a look. You open the hood and wonder how the thing ever worked in the first place...

Plus there is always something fresh and new every day mining through the vast sordidness of the many billions of pages on the web. You expect to be amazed at the endless varieties of crazy porn domains and new approaches to webspam. But there are equal horrors in the small, finding pathological charset issues, previously-undiscovered abominable server implementations, psychopathic website owners. The web is a reactive fuzz test.
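To give a flavor of the charset problem, here is a minimal sketch of defensive decoding in Python. It is only an illustration under stated assumptions, not how blekko's crawler works: the function name `decode_page` and the fallback order (declared charset, then UTF-8, then cp1252 with replacement characters) are choices made for this example.

```python
from typing import Optional, Tuple


def decode_page(raw: bytes, declared: Optional[str] = None) -> Tuple[str, str]:
    """Decode fetched bytes without raising; return (text, charset_used).

    `declared` is whatever charset the server or page claimed, which
    may be missing, misspelled, or simply wrong.
    """
    candidates = []
    if declared:
        # Servers sometimes quote or pad the charset name.
        candidates.append(declared.strip().strip("\"'").lower())
    candidates.append("utf-8")

    for name in candidates:
        try:
            return raw.decode(name), name
        except (LookupError, ValueError):
            # LookupError: the name is not a text encoding Python knows.
            # ValueError (includes UnicodeDecodeError): the bytes do not
            # match the claimed charset.
            continue

    # Last resort (an assumption of this sketch): treat the bytes as
    # cp1252 and substitute U+FFFD for anything undefined there.
    return raw.decode("cp1252", errors="replace"), "cp1252"
```

A page that claims UTF-8 but actually ships Latin-1 bytes falls through to the last branch instead of crashing the pipeline, and the second return value records which guess was used.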

I know there are some great coders out there reading this blog who would have a blast working on some of the pieces here that need to get built. This is a great opportunity to join an experienced team early, building a big system from the ground up. If you think you might be interested, send me an email and we can chat.

FYI, our interviews always have coding tests. Primarily we are looking for folks who love to write code and are good at it. :)

Comments (34)

" You can't just start with a server or two and add more when the users come."

Dude... just run Ruby on Rails. Problem solved!

SHumphreys:

Note recent article on Slashdot...
"According to TechCrunch, Twitter has plans to abandon Ruby on Rails after two years of scalability issues."

Well, the 8th and most important problem of search engines is being special, and giving more than Google. Good luck with your development, but as I have a blog search engine, I know it's a hard question.

Steve Iams:

Good Luck Rich!

Sounds like a fascinating project :)
Wish I could help >.> If I wasn't such a noob I would probably be all over this.

Best of luck to you :)

Interesting. I came up with an algorithm a while back that does not rely on inbound links whatsoever as a factor. Instead it looks for advanced and simple black-hat SEO items and penalizes them, then takes into account variables to determine which website is the most relevant for a given search word/phrase.

While it uses an advanced point system, the actual points, negative or positive, paired with the content they are given to or subtracted from, are the secret behind a successful, relevant search engine.

I am curious to see how you can implement something similar to overtake Google ;)
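The point-system idea in the comment above can be sketched roughly as follows. This is not the commenter's algorithm, whose signals and weights were not shared; every weight, threshold, and signal name here (`keyword_stuffing`, `hidden_text`, the 20% density cutoff) is a hypothetical placeholder chosen for illustration.

```python
import re

# Hypothetical weights; the comment treats the real values as the secret.
PENALTIES = {"keyword_stuffing": -5.0, "hidden_text": -8.0}
TERM_IN_TITLE = 3.0
TERM_IN_BODY = 1.0


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_page(query, title, body, html=""):
    """Score a page for a query using on-page signals only (no inbound links)."""
    terms = set(tokenize(query))
    title_words = set(tokenize(title))
    body_words = tokenize(body)

    score = 0.0
    # Positive points: query terms that appear in the title or body.
    for term in terms:
        if term in title_words:
            score += TERM_IN_TITLE
        if term in body_words:
            score += TERM_IN_BODY

    # Negative points: one query term making up over 20% of a
    # reasonably long body is treated as keyword stuffing.
    if len(body_words) >= 20:
        if any(body_words.count(t) / len(body_words) > 0.2 for t in terms):
            score += PENALTIES["keyword_stuffing"]

    # Negative points: a crude check for text hidden with inline CSS.
    if re.search(r"display\s*:\s*none", html, re.IGNORECASE):
        score += PENALTIES["hidden_text"]

    return score
```

Ranking is then a matter of sorting candidate pages by `score_page` for the query. The hard part, as the comment says, is choosing which signals to score and how heavily.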

Jose:

Someone said earlier that it's important to give more than Google.

I disagree. The trick is to offer less than Google!

It's all in the AI.

Best

Jose

I found your project because a crawler showed up on my website, and I was curious where that crawler came from.

Good luck, guys!

Greetings from Germany
Marc

P.S. Sorry for my "bad" English.

Hey, this looks good. Saw you guys in my crawler stats; you may index my site and all links from it.

keep up the good work!!!

holly

Roy:

Isn't it kinda hard to just start a search engine? I mean, you might make one, but how will you get people to go onto it, rather than Google?

Good luck...you'll need it

Holly,

Google did it a number of years ago. It just takes one heck of a marketing plan and the capital to push it properly.

Good luck Rich

george:

Discovered your bot via my log file, and saw the IP as coming from psi/cogenco. (I have always blocked psi IPs due to bad bots and bad activity.)

I also read the tidbit of your history saying that you wrote (one of the first) viruses. It is something that I would not be proud of, and I would not want a bot hitting my site knowing that aspect.

I will be blocking your bot(s) and the IP range from psi. (The robots.txt file is usually rather useless and is often ignored, so I use .htaccess, which is much more effective.)

Doubtful that you could create an SE better than the existing major ones: MSN, Yahoo, and Google. (The latter is poor since they started with PageRank, which yanked out small sites, and they have bad policies, e.g. they can sandbox without telling anyone and without valid reasons.)
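For context on the robots.txt point: honoring the file is cheap for a crawler that chooses to do so. A minimal sketch using Python's standard `urllib.robotparser` follows; the user-agent name `examplebot` and the rules are made up for illustration and say nothing about how any particular crawler, blekko's included, actually behaves.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules a site owner might publish.
RULES = """\
User-agent: examplebot
Disallow: /private/
"""

if __name__ == "__main__":
    for path in ("/index.html", "/private/page.html"):
        url = "http://example.com" + path
        print(path, allowed(RULES, "examplebot", url))
```

The check is purely advisory: nothing forces a crawler to call it, which is why server-side blocking such as .htaccess is the stronger tool for a site owner.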

Sean:

Good Luck with the David and Goliath Thing.

PS, Can you put my website on the #1 Spot when you launch :-)

Your crawler came knocking so I wanted to see what's up. Too bad no free samples!!!
LOL

Is it possible to open some parts of the project as open source? I would love to start helping with very simple parts like discussions and creating mobile interfaces, for example.

Not to be a Debbie Downer, but if everyone's a coder, who's running the asylum?

We have 10 engineers and 1 vp-of-everything-else. :)

Hi,

Want any graphics or UI stuff doing?

If so, drop me some specs and I'll send you some stuff. Glossy, aqua, grungy etc. all fine. Happy to do something different too.

DS_UK

I found this after noticing the crawler on my site. However, my site has been up and running since December and it appears this is the first time. Google hits my site several times a day. I'm not sure how they can do this and hit the massive number of sites. I can't imagine what it's going to take to beat Google's infrastructure.

I'm interested to work at blekko.

Here is my resume. I'm uniquely intelligent and creative.

Reed S. Kotler

1030 East El Camino Real, Suite 278
Sunnyvale, CA 94087
main: (408) 836–3774 alternate: (408)730–9557
website: http://www.reedkotler.com
http://www.toriirecords.com
email: reedkotler@hotmail.com

Objective

Find a job with challenging problems requiring a uniquely creative, persistent and intelligent individual.

Programming and Systems Expertise

Able to quickly develop programming solutions to solve client problems using:

C, C++ and C#, Objective C, Java, Lisp, FORTRAN, Ada, Pascal, Modula 2, Matlab
ADO.NET, ASP.NET, MFC
Java, JavaScript, HTML, Perl, PHP, Python, MySQL, AJAX
Windows, Unix/Linux, Mac OS, WinCE, Windows Mobile
Microsoft SQL
Assembly Languages (80x86 family, 68k family, PPC family, MIPS family, ARM Family, Space Shuttle AP101)
Windows GUI, Mac and iPhone GUI/Cocoa
Protools (110 certification), Reason, Final Cut Express, Avid
Adobe CS 4

Experience

Reed Kotler Systems, Inc. (President and co founder) 1997 – present
Created commercial audio/music analysis hardware devices and software-only versions, (Windows and Mac) available from www.reedkotler.com
Developed Unix-like development toolset for Windows 95/NT (port of GNU tools).
Consulting for WebTV/Microsoft, MIPS, Palm, IBM, Lockheed, Sun Microsystems, and others.
GCC/GDB/GNU rehosting and modification.

Independent Consultant to various clients, 1986 – present
Maintained Microsoft Platform Builder debugger for WIN CE for Microsoft
Developed TV Set top box and mobile software for various Microsoft Products
Developed Crashlog software for several Microsoft TV Products which included an extensive .NET server application.
Developed Server application for Microsoft for Billing and Subscription for a Microsoft TV Product.
Development on ISI Searchlight Debugger. Ported to ARM, PPC, MIPS, 68K, others.
Ported Sun's debugger from Solaris to HP/UX.
Ported MetaWare C/C++ compiler to Windows NT/98. Worked on PPC elf linker.
Design of sophisticated AI software for Lockheed Satellite program
Development on Intelligence System for ESL.
Designed a large and complex relational database application written in Ada to support a sophisticated satellite ground station. Performed the database structure and requirements analysis as well as software design and coding.
Development on intelligence system written in Ada. Designed and implemented database services for the application making complex use of UNIX system services for shared memory, semaphores, and TCP.
Consulting related to the Ada programming language, overall software systems design and prototyping, DBMS application design, and software systems requirements analysis.
Troubleshot Ada code in weather-forecasting satellite ground station resolving various concurrency issues and other problems.

President and founder of Reed Kotler Music Inc, 1998 – present
Development of hardware and software music products.
Produced TR-1000 and TR-400 Digital music study recorders
Produced LBR-100 lead/bass isolator
Produced Midi Brick synthesizer
Managed employees, did marketing and sales, trade shows, supervised manufacturing, parts procurement. Products had software, electrical and mechanical design components.

President and founder of Torii Records, Inc. 2001 – present
Production and marketing of Jazz music.
Produced 8 CDs
Three CDs were in the top 10 on jazz radio in the USA (one at #2, one at #7, one at #9); two others reached #11 and #19. One was nominated for a Grammy in 2005. All CDs have been played extensively on jazz radio in the USA, on satellite radio, on cable TV radio, and in Europe and Canada.

Composer/Musician – present
Internationally known composer. Most recent CD was #9 in the USA on jazz radio.
Faculty member of Stanford Jazz Summer Residency Program for over 10 years. Teaching composition, transcribing, theory, harmony, improvisation theory.
Plays Piano, Guitar, Saxophone, Bass, Drums, Latin Percussion.
Staff transcriber for many years for Jazz Improv Magazine.


General Transformation Corporation, Berkeley, CA (VP Engineering) 1984-1986
Design and implementation of Ada language compiler for the IBM PC under DOS.
Design and implementation of full LR(1) parser generator and other development tools including a file comparison program (written up in Jerry Pournelle’s BYTE magazine column.)

Lockheed Missiles and Space Co., Sunnyvale, CA (Scientific Programming Analyst) 1981 – 1984
Design and implementation of INGRES-like relational database management system written in Ada.
Participated in (and received commendations for) review work on the Ada language and reference manual.
Chief designer and technical supervisor for generic Communications, Command, Control and Intelligence (C3I) system written in Ada.

Strategic Information Burlington, MA (Computer Scientist) 1980-1981
Development on design and implementation of proprietary language for econometric forecasting.

Intermetrics Inc. Cambridge, MA (Sr. Sys. Analyst/ Programmer) 1975 - 1977
Support of company’s Space Shuttle program, including development of tools for software configuration, management, and quality assurance.
Maintenance of mathematics and character libraries for HAL/S compilers for primary onboard space shuttle computers.
Maintenance and rehosting of XPL language compilers.
Received award from NASA for a technical innovation.

Foreign Languages
TRKI 2 level Russian.


Education

Antioch College (Yellow Springs, Ohio), BS in Mathematics, 1977.

Justin Van Winkle:

When you mentioned the internet-as-fuzztest I thought of something that blew my mind.

Imagine someone attacking your app (which is attempting to validate, index, and carefully analyze the structure of input) every day with very ingenious and carefully crafted bogus input. Now imagine someone doing it with gigabytes of bogus input. Now imagine doing it with hundreds of gigabytes. Now imagine another 200k people doing it, and lots of them are smarter than you, way way smarter, and know way more. Now imagine your app can't crash or serve dangerous content, ever, or you're screwed. And it can't serve innocuous but bogus content, or you fail.

Good luck!

shocky:

Hey guys,
You guys are doing great, you have the knowledge and experience to do this, don't let anyone stop you or let you down. Make sure you also focus on a good revenue plan, a good source of income for the company and all will be great!

Wish you guys the best, and every time you feel that you are under pressure from the larger corporations, just kick some ass!

Rick Blodgett:

Do you need an experienced Quality Assurance Tester? I would love the opportunity to work on this project. Please feel free to contact me!

And...who's your management team and do they need any help at that upper level? Let me know, am looking for interesting ways to make Yahoo & MSN quiver in their boots. If someone displaces them and starts to put the clamps on Google...life will be interesting.

Michael Murdock, CEO
DocMurdock.com

Mark:

Good luck! Hopefully you guys will get at least 1% of the market share! If you need a designer to do some work, even for free (for now) lemme know. =)

Congrats on your recent funding success! Hope all goes well!

Hey, this is a great idea. But can you tell me/us when the beta is starting? I'd like to get my hands on it.

c-ya
n

Eric Kotonya:

With lots of Web 2.0 sites yet to be properly crawled and indexed by older engines, technically there is a gap for improvement, and for business.

I think u guyz can pull it off and build a valuable product with some web analytics and collective intel techniques.

Search works when you know exactly what you want and how the world refers to it. Otherwise, you're SOL!

When presented to the user, search is exceedingly simple. Do the search results fit that user's contextual needs at that moment? Or, like Google, Bing, Wisenut, MSN, Yahoo, and all the other "search engines", did you pepper them with offers for "Nikes" when they were there researching arch designs for flat-footed autistic kids? If they searched on "that info", you and I know they'd have seen everything offered from soup to nuts, so even search specificity has been programmed out of the user's search toolkit by the irrelevance-inspired keywords industry!

The long-tail exists mostly because nobody wants that stuff. Not the first time, and certainly not offers 2 through 7,954 from Overstock.com and Smartbuys.com.

Nope, if you can contextualize the user's visit and consider "now" to be more than a mere regurgitation of "then", perhaps you can then use your signal processing skills applied to streams of "cogently-classified" data points presented as a sequence of internet transactions to predict what that user is looking for right now!

Culturally cogent content classification is the only workable answer, and using semiotics and signal processing is the way I do it now, instead of pretending to do it as before with vector cosines and high-performance computing scientists making a mess of every analytic they need but can't quite figure out why!

TV

So, now that Yahoo! has finally given in to Microsoft.. what does this mean for Blekko? Still moving forward? Any updates?

By the way, I do like the name. Maybe change it to something that sounds powerful like that, but that doesn't sound dumb when using it instead of "google". You can't really "bing" things. "Blekkoing" sounds off too (and I know this is only a stealth name), but you might be on to something...

Nathan:

Hello, I am a college student VERY interested in things like distributed systems and natural language processing. I'd like to help work on this project any way I can, even for free, or at least open some channels of communication. I can script Java, Lisp, and C++.

eli:

I was reading this blog, and am very fond of the idea of a new search engine to replace/challenge Google.

I own the domain name WhySearch.com and would be interested in making it part of this endeavor, if this is still on track.
Eli

Just checked out the beta. Wow, when you look back at this post and see all the work that came together, you guys have done a great job. I wasn't expecting much, but I may actually start using Blekko as my default search.

Awesome work guys, the SEO features are great.


About

This page contains a single entry from the blog posted on May 1, 2008 12:11 PM.
