how search results are stored



adamdaly
August 12th, 2007, 10:37 AM
Hey, I was wondering: when a search completes and the results are returned, how are they stored? If I search with Google and get maybe a million results, where are those million results stored? I'd guess some kind of array, but that's a lot of data, so it would have to be server side. Can anyone enlighten me further?

thanks

adam

blazes
August 12th, 2007, 02:37 PM
In a database.

Charleh
August 12th, 2007, 07:17 PM
Database man DATABASE!!!

Databases are amazing :D

SQL 2005 ... how anything returns 2 squillion results in 0.00234 seconds I'll never know :P

blazes
August 12th, 2007, 07:42 PM
^I'm quite sure it guesstimates, then finishes returning results in the background. Not to mention they use assembly.

kdd
August 12th, 2007, 09:21 PM
Charleh wrote: "how anything returns 2 squillion results in 0.00234 seconds I'll never know"

It doesn't actually return that many results (UNLESS you ask for them). For example, Google shows about 10 results per page, much like using "LIMIT start, 10" in SQL (MySQL syntax: skip "start" rows, then return 10).

For example, searching for "google" on Google returns the first page of results; if you then click Next, this is the URL: /search?q=google&hl=en&client=firefox-a&rls=org.mozilla:en-US:official&hs=y4i&start=10&sa=N

Look at the 2nd from the last parameter: start=10. This tells the query which result to start from, and only 10 results from that point are returned.
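
To make the offset idea concrete, here is a minimal sketch in Python with SQLite. The `results` table, its fake rows, and the helper names (`build_demo_db`, `start_from_url`, `fetch_page`) are all made up for illustration. This is not how Google is actually implemented, only what "run the query again from a given start" could look like against an ordinary SQL database.

```python
import sqlite3
from urllib.parse import urlparse, parse_qs

PAGE_SIZE = 10  # results per page, matching the "about 10" mentioned above


def build_demo_db():
    """Create an in-memory table of 35 fake results (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (pos INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO results (pos, title) VALUES (?, ?)",
        [(i, f"result {i}") for i in range(35)],
    )
    return conn


def start_from_url(url):
    """Read the start= parameter from a search URL, defaulting to 0."""
    params = parse_qs(urlparse(url).query)
    return int(params.get("start", ["0"])[0])


def fetch_page(conn, start):
    """Query one page only: skip `start` rows, return up to PAGE_SIZE rows."""
    rows = conn.execute(
        "SELECT title FROM results ORDER BY pos LIMIT ? OFFSET ?",
        (PAGE_SIZE, start),
    )
    return [title for (title,) in rows]


conn = build_demo_db()
url = "/search?q=google&hl=en&start=10&sa=N"  # shortened version of the URL above
page = fetch_page(conn, start_from_url(url))
print(page[0], "...", page[-1])
```

Nothing is kept between pages here: each click on Next sends a new start value and the database only ever hands back one page of rows.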

And yes, databases are incredibly amazing! :)

Also, code gets converted TO assembly, but I 100% doubt anyone codes directly INTO assembly. For example, PHP code on the server would get converted to assembly and then execute on the server.

Sorry if there is some miscommunication.

adamdaly
August 13th, 2007, 05:25 AM
Don't worry, I know all about databases, well, a little. So it doesn't actually cache all the results from the query for use with the next page of results, but runs the query again with an offset?

adam

mlk
August 13th, 2007, 06:33 AM
Google's and Yahoo's DBs/queries are probably improved beyond human understanding; just like with a normal server, a lot of queries must be cached/stored in fast memory when they are extremely popular (search for 'ipod', for instance). They use a myriad of small HDs/server racks (instead of huge IBM-style supercomputers) and have each index stored in at least 3 different physical locations. They also spread clusters out geographically, which means you are less likely to be slowed down by traffic (i.e. Google is fast overall, not just query-wise).
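
The point about popular queries being kept in fast memory can be sketched roughly like this in Python. Everything here is an assumption for illustration: `run_query_on_backend` is a hypothetical stand-in for the expensive lookup, and `functools.lru_cache` is just the simplest in-process cache to hand, not what any real search engine uses.

```python
from functools import lru_cache

backend_calls = 0  # counts how often the slow path is actually hit


def run_query_on_backend(query, start):
    """Hypothetical stand-in for the expensive index/database lookup."""
    global backend_calls
    backend_calls += 1
    return tuple(f"{query} result {i}" for i in range(start, start + 10))


@lru_cache(maxsize=1024)
def search(query, start=0):
    """Serve repeated (query, start) requests from memory; only misses hit the backend."""
    return run_query_on_backend(query, start)


search("ipod")            # miss: goes to the backend
search("ipod")            # hit: served from the in-memory cache
search("ipod", start=10)  # miss: a different page is a different cache key
print(backend_calls)
```

Only the popular pages stay cached (least recently used entries are evicted once 1024 are stored), so rare queries still go to the backend every time.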

I'm sure if you look around you can find some more interesting facts, but returning search results is more than simply querying your DB. It involves technological/hardware improvements and code ingenuity for sure (like a viable fuzzy search).

A long time ago I read http://xooglers.blogspot.com/ (ex-googlers blog) and if I remember correctly it had insights on how the monster worked, but I can't be bothered to read through it all again.

blazes
August 13th, 2007, 03:11 PM
kdd wrote: "Also, code gets converted TO assembly, but I 100% doubt anyone codes directly INTO assembly. For example, PHP code on the server would get converted to assembly and then execute on the server.

Sorry if there is some miscommunication."

The search is coded directly in assembly. The site itself, I believe, is Python.