Spider?

Posted: Fri Dec 03, 2004 5:04 am
by Dale
How would you go about creating a spider like Google's? (That's if they use PHP.)

Posted: Fri Dec 03, 2004 7:50 am
by Archy
Hmm, as far as I know, you can't make one in PHP; you would need a programming language such as C#.

Posted: Fri Dec 03, 2004 8:26 am
by kettle_drum
You can make one very simply using file_get_contents(). Start by grabbing one page with this function, parse the result to find all the links in that page, and then repeat on the links you found, and so on. Then add whatever else you want it to do while you're parsing the page: save the title, find keywords, whatever you can think of.
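
The loop described above could be sketched roughly like this. It is only a minimal sketch under some assumptions: extract_links() and crawl() are made-up names, the DOM extension is assumed to be available for finding links, and only absolute http/https links are followed (relative links are simply skipped). The $fetch callback is also an assumption, added so the loop can be tried without touching the network; for real pages you would pass 'file_get_contents', which needs allow_url_fopen enabled.

```php
<?php
// Hypothetical helper: return the href of every <a> tag in an HTML string.
function extract_links(string $html): array
{
    if ($html === '') {
        return [];
    }
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so silence the parser warnings.
    @$doc->loadHTML($html);
    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href !== '') {
            $links[] = $href;
        }
    }
    return array_values(array_unique($links));
}

// Hypothetical breadth-first crawl. $fetch is any callable that maps a URL
// to its HTML (or false on failure). Returns the URLs that were fetched,
// in the order they were visited.
function crawl(string $start, callable $fetch, int $maxPages = 20): array
{
    $queue   = [$start];
    $seen    = [$start => true]; // URLs already queued, so nothing is fetched twice
    $visited = [];
    while ($queue && count($visited) < $maxPages) {
        $url  = array_shift($queue);
        $html = $fetch($url);
        if (!is_string($html)) {
            continue; // skip pages that could not be fetched
        }
        $visited[] = $url;
        foreach (extract_links($html) as $link) {
            // Simplification: follow absolute http(s) links only.
            if (preg_match('#^https?://#i', $link) && !isset($seen[$link])) {
                $seen[$link] = true;
                $queue[] = $link;
            }
        }
    }
    return $visited;
}
```

Something like crawl('http://example.com/', 'file_get_contents', 5) would then fetch at most five pages starting from that (placeholder) address. The point where $visited is updated is where you would save the title, look for keywords and so on.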

Posted: Fri Dec 03, 2004 9:28 am
by EricS
See the following topic.

viewtopic.php?t=28256

I posted the page-fetching class I use for a spider I wrote.

Posted: Fri Dec 03, 2004 11:24 am
by patrikG
Archy wrote: Hmm, as far as I know, you can't make one in PHP; you would need a programming language such as C#.
Why is that? Acquaint yourself with PHP's cURL extension and regular expressions and you can write a spider easily.

Would be very interested in learning about a web-application you can't build in PHP, but only in C# or any other language.
Don't underestimate PHP's capabilities.
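
As a rough illustration of the cURL-plus-regular-expressions approach, something along these lines should work. It is a sketch, not a finished spider: fetch_page() and find_links() are made-up names, the user agent string is invented, the cURL extension is assumed to be installed, and the regular expression is deliberately simple (it only catches quoted href values and will miss or mis-match some unusual markup).

```php
<?php
// Hypothetical helper: fetch one URL with the cURL extension.
// Returns the response body, or null if the request failed.
function fetch_page(string $url, int $timeout = 10): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow HTTP redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_TIMEOUT        => $timeout,
        CURLOPT_USERAGENT      => 'ExampleSpider/0.1', // invented for this sketch
    ]);
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

// Hypothetical helper: pull the quoted href value out of every <a ...> tag.
function find_links(string $html): array
{
    // (["\']) captures the opening quote, \1 requires the same closing quote.
    preg_match_all('/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1/is', $html, $m);
    return array_values(array_unique($m[2]));
}
```

Calling find_links(fetch_page($url) ?? '') on a page gives you the list of links to queue up next.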

Posted: Fri Dec 03, 2004 12:07 pm
by EricS
patrikG wrote:
Archy wrote: Hmm, as far as I know, you can't make one in PHP; you would need a programming language such as C#.
Why is that? Acquaint yourself with PHP's cURL extension and regular expressions and you can write a spider easily.

Would be very interested in learning about a web-application you can't build in PHP, but only in C# or any other language.
Don't underestimate PHP's capabilities.
I haven't been able to get a version of my spider that can spawn multiple copies of itself. I'm only able to spider one site at a time.

Posted: Fri Dec 03, 2004 6:13 pm
by kettle_drum

Posted: Fri Dec 03, 2004 6:29 pm
by rehfeld
Also take a look at exec() to fork processes.
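
One way this might look, assuming a Unix-like shell and a worker script that takes the site to crawl as its first argument: redirect the output and append & so that exec() returns straight away instead of waiting for the worker to finish. The names worker_command() and spawn_workers() and the spider.php script are hypothetical. Strictly speaking exec() starts a new process rather than forking the current one; the pcntl extension's pcntl_fork() does a real fork on Unix command-line builds.

```php
<?php
// Hypothetical helper: build a shell command that runs the worker script
// for one site in the background (Unix shells only).
function worker_command(string $script, string $site): string
{
    // escapeshellarg() keeps odd characters in the arguments away from the shell.
    // Redirecting output and ending with & lets exec() return immediately.
    return 'php ' . escapeshellarg($script) . ' ' . escapeshellarg($site)
         . ' > /dev/null 2>&1 &';
}

// Hypothetical helper: start one background worker per site and
// return the commands that were run.
function spawn_workers(string $script, array $sites): array
{
    $commands = [];
    foreach ($sites as $site) {
        $cmd = worker_command($script, $site);
        exec($cmd);
        $commands[] = $cmd;
    }
    return $commands;
}
```

For example, spawn_workers('spider.php', ['http://example.com/', 'http://example.org/']) would start two copies of the (hypothetical) spider.php, one per site. Nothing here limits how many workers run at once, so a long site list would need some throttling.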