OK
I am looking to build a specialized Internet search site, and I want to use PHP. I need scripts, code, or packages that will crawl (spider) a predetermined list of sites, pull keywords, meta tags, titles, and the like from those sites, and write the parsed data to MySQL (with indexing). Users will then search my site, which covers only the sites I have chosen to crawl.
I have been searching for a package, but I am not finding what I am looking for. The closest I have found is a package called Harvest.
Does anyone have any ideas on the best solution? I do not want a meta search engine, nor do I need a small site search engine.
Any info would be appreciated!
Thanks,
PHP and Internet Search Crawlers/engines
-
jadformosa
- Forum Newbie
- Posts: 1
- Joined: Tue Apr 20, 2004 8:53 am
-
kettle_drum
- DevNet Resident
- Posts: 1150
- Joined: Sun Jul 20, 2003 9:25 pm
- Location: West Yorkshire, England
You could easily make one yourself. Set up a database to hold the URLs to crawl, then have your bot connect to each site; you can do that with fopen(). Then parse the page to get what you want (meta tags, text from the page, etc.) and store these details in the search engine's database.
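A rough sketch of the fetch, parse, and store steps described above, assuming PHP 7 or later with the DOM and PDO extensions. The function names, the `pages` table, and its columns (with a unique key on `url`) are made up for illustration; adjust them to your own schema.

```php
<?php
// Hypothetical helper: pull the title and two common meta tags out of an HTML string.
function extract_page_info(string $html): array
{
    $info = ['title' => '', 'keywords' => '', 'description' => ''];
    if (trim($html) === '') {
        return $info; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is often malformed, so collect libxml warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $info['title'] = trim($titles->item(0)->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $name = strtolower($meta->getAttribute('name'));
        if ($name === 'keywords' || $name === 'description') {
            $info[$name] = trim($meta->getAttribute('content'));
        }
    }
    return $info;
}

// Hypothetical helper: save one crawled page. Assumes a table
// pages(url, title, keywords, description) with a UNIQUE key on url,
// so REPLACE INTO overwrites the row from an earlier crawl.
function store_page_info(PDO $pdo, string $url, array $info): void
{
    $stmt = $pdo->prepare(
        'REPLACE INTO pages (url, title, keywords, description) VALUES (?, ?, ?, ?)'
    );
    $stmt->execute([$url, $info['title'], $info['keywords'], $info['description']]);
}

// Fetching: file_get_contents() (like fopen()) can read a URL
// when allow_url_fopen is enabled in php.ini, for example:
// $html = file_get_contents('http://example.com/');
// store_page_info($pdo, 'http://example.com/', extract_page_info($html));
```

For the search side, a FULLTEXT index on the title, keywords, and description columns would be one way to do the indexing in MySQL, though that depends on your MySQL version and storage engine.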
You can of course then make things as hi-tech as you like: have the bot collect all links from a page so it will traverse the web looking for more links, have it record how many other pages link to a given page, etc.
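The two extras mentioned above (collecting links and counting how many pages link to a page) might look something like this. It is only a sketch: both function names are invented here, path-relative links such as "page2.html" are skipped for brevity, and a real bot should also respect robots.txt and limit how fast it crawls.

```php
<?php
// Hypothetical helper: collect unique absolute http(s) links from a page.
function extract_links(string $html, string $baseUrl): array
{
    if (trim($html) === '') {
        return [];
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate malformed HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $base = parse_url($baseUrl);
    if (!is_array($base) || !isset($base['scheme'], $base['host'])) {
        return []; // cannot resolve relative links without a usable base URL
    }
    $origin = $base['scheme'] . '://' . $base['host']
        . (isset($base['port']) ? ':' . $base['port'] : '');

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href === '' || $href[0] === '#') {
            continue; // empty or same-page anchor
        }
        if (preg_match('#^https?://#i', $href)) {
            $url = $href;                          // already absolute
        } elseif (substr($href, 0, 2) === '//') {
            $url = $base['scheme'] . ':' . $href;  // protocol-relative
        } elseif ($href[0] === '/') {
            $url = $origin . $href;                // root-relative
        } else {
            continue; // mailto:, javascript:, path-relative: skipped in this sketch
        }
        $url = preg_replace('/#.*$/', '', $url);   // drop the fragment
        $links[$url] = true;                       // array keys de-duplicate
    }
    return array_keys($links);
}

// Hypothetical helper: given [page URL => list of URLs it links to],
// count how many distinct other pages link to each URL.
function count_inbound_links(array $outlinksByPage): array
{
    $counts = [];
    foreach ($outlinksByPage as $source => $targets) {
        foreach (array_unique($targets) as $target) {
            if ($target === $source) {
                continue; // ignore self-links
            }
            $counts[$target] = ($counts[$target] ?? 0) + 1;
        }
    }
    return $counts;
}
```

Each new URL returned by extract_links() would go back into the table of URLs to crawl, and the inbound counts could be stored alongside each page as a crude ranking signal.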
- Buddha443556
- Forum Regular
- Posts: 873
- Joined: Fri Mar 19, 2004 1:51 pm