Creating a search engine

Posted: Mon Mar 21, 2005 11:03 am
by JF3000
Would someone please point me in the direction of how to create a search engine where people could carry out searches the way they do on google.com and similar sites?

Thank you.

JF3000

Posted: Mon Mar 21, 2005 11:08 am
by feyd
What are you after in particular: creating a search engine for your own site, or creating a search engine that doesn't do any searching itself and instead just returns data from other search engine(s)?

Posted: Mon Mar 21, 2005 11:13 am
by pickle
The simplest search-engine-type functionality I've seen uses MySQL's FULLTEXT index: http://dev.mysql.com/doc/mysql/en/fulltext-search.html

Posted: Mon Mar 21, 2005 11:42 am
by JF3000
Look at google.com. I would like to create something like that, not some search thing that searches a single website. I'm looking at the big picture here, where people submit their sites, and those sites are displayed based on keywords.

Posted: Mon Mar 21, 2005 11:47 am
by feyd
You want to create something like Google? Why? Is that not fulfilled by Google already? Or do you want to concentrate on a niche market? What's your big picture? Because taking over Google isn't a simple task, nor is it cheap.

Posted: Mon Mar 21, 2005 12:35 pm
by m3mn0n
Yeah, I would plan to take over Google too, but I don't have enough cash to buy 20,000 servers. :lol:

Posted: Tue Mar 22, 2005 6:08 pm
by JF3000
How do I go about creating a search engine just like Google? And don't tell me what I can and can't do. All I want to know is how they do it, and what I have to do to perform the same function.

It's not just Google, it's every other search engine out there as well. I would like to know how they are programmed, so I can create my own for people to submit their websites to, and go from there.

And the link above is full of errors, with people complaining about the code etc. I am not interested in sites like that, thank you.

JF3000

Posted: Tue Mar 22, 2005 6:44 pm
by feyd
Read up on spiders and agents: autonomous programs that scour the internet for links, basically. They also need to cache and classify the data on the pages they visit, so you can more easily judge relevance later. Having a good knowledge of the (spoken) language used helps when searching for similar texts.
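
To make the "scour for links" part concrete, here is a minimal sketch in Python of the step a spider repeats on every page: pull the links out of the HTML. It uses only the standard library. The names `LinkExtractor` and `extract_links` are made up for illustration, and a real spider would also have to fetch pages, obey robots.txt and cope with broken markup.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/about" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all absolute link targets found in an HTML string."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

The links it returns are what the spider would queue up to visit next.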

Posted: Tue Mar 22, 2005 7:52 pm
by Ambush Commander
Essentially, you're going to have a very big database of links, keywords, and relevancy rankings. You're going to have to develop an algorithm to determine the importance of all these links, and it has to run very quickly. Then you have to launch multiple daemons to "crawl" the web (which, of course, you will not have the resources to do quickly), and it will take you a long, long time to finish crawling enough to give satisfactory results.
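
As a rough illustration of the "keywords and relevancy rankings" idea, here is a toy inverted index in Python. It maps each keyword to the pages that contain it, and ranks results by how often the query words appear on each page. This is only a sketch under simple assumptions: the function names are made up, the pages live in a plain dict rather than a database, and raw word counts stand in for the much more elaborate ranking that real search engines use.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build keyword -> {url: occurrence count} from a {url: text} dict."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word, count in Counter(tokenize(text)).items():
            index[word][url] = count
    return index


def search(index, query):
    """Return URLs ranked by total occurrences of the query words."""
    scores = Counter()
    for word in tokenize(query):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, _ in ranked]
```

Looking a word up in the index is a single dictionary access, which is why the lookup side can be fast even when building the index is slow.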

Search engines like Google don't rely on people submitting links. They crawl.
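
For what "crawling" means in practice, here is a small Python sketch of the core loop: start from a few seed URLs, visit each page once, and queue every new link you find. To keep it self-contained, the `get_links` argument is a hypothetical callback that returns the links on a page; in a real crawler it would download and parse the page, and the whole thing would need politeness delays, robots.txt handling and persistent storage.

```python
from collections import deque


def crawl(seeds, get_links, max_pages=100):
    """Breadth-first crawl: return URLs in the order they were visited."""
    seen = set()
    queue = deque()
    for url in seeds:
        if url not in seen:
            seen.add(url)
            queue.append(url)

    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            # Only queue pages we have not already seen, to avoid loops.
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

The `max_pages` cap is there because the link graph of the real web is effectively endless, which is the resource problem mentioned above.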