Spider?

Dale
Forum Contributor
Posts: 466
Joined: Fri Jun 21, 2002 5:57 pm
Location: Atherstone, Warks

Spider?

Post by Dale »

How would you go about creating a spider like Google's? (That's if they use PHP.)
Archy
Forum Contributor
Posts: 129
Joined: Fri Jun 18, 2004 2:25 pm
Location: USA

Post by Archy »

Hmm, as far as I know, you can't make one in PHP; you would need a programming language such as C#.
kettle_drum
DevNet Resident
Posts: 1150
Joined: Sun Jul 20, 2003 9:25 pm
Location: West Yorkshire, England

Post by kettle_drum »

You can make one quite simply using file_get_contents(). Start by grabbing one page with this function, parse the result to find all the links in that page, and then repeat on the links you found, and so on. Then add whatever else you want it to do while you're parsing the page: save the title, find keywords, whatever you can think of.
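
The approach described above could be sketched roughly as follows. This is only a minimal sketch, not anyone's actual spider: the function names `extract_links` and `crawl`, the page limit, and the choice of DOMDocument for parsing are assumptions made for illustration, and relative links are simply skipped to keep it short.

```php
<?php
// Hypothetical helper: return the unique absolute http(s) links in an HTML string.
function extract_links(string $html): array
{
    if ($html === '') {
        return []; // DOMDocument::loadHTML() rejects empty input on PHP 8
    }

    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        // Keep only absolute http/https URLs; relative links are ignored here.
        if (preg_match('#^https?://#i', $href)) {
            $links[$href] = true; // array keys de-duplicate
        }
    }
    return array_keys($links);
}

// Hypothetical crawl loop: fetch a page, queue its links, repeat.
// $fetch defaults to file_get_contents(); it can be swapped out for testing.
function crawl(string $startUrl, int $maxPages = 10, ?callable $fetch = null): array
{
    $fetch   = $fetch ?? fn(string $url) => @file_get_contents($url);
    $queue   = [$startUrl];
    $seen    = [$startUrl => true];
    $visited = [];

    while ($queue && count($visited) < $maxPages) {
        $url  = array_shift($queue);
        $html = $fetch($url);
        if ($html === false || $html === '') {
            continue; // fetch failed or page was empty, move on
        }
        $visited[] = $url;
        // This is the place to save the title, find keywords, and so on.
        foreach (extract_links($html) as $link) {
            if (!isset($seen[$link])) {
                $seen[$link] = true;
                $queue[] = $link;
            }
        }
    }
    return $visited;
}
```

Calling `crawl('http://example.com/', 20)` would return the list of pages it managed to fetch. A real spider would also want to resolve relative links, respect robots.txt, and pause between requests.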
EricS
Forum Contributor
Posts: 183
Joined: Thu Jul 11, 2002 12:02 am
Location: Atlanta, Ga

Post by EricS »

See the following topic.

viewtopic.php?t=28256

I posted the page-fetching class I use for a spider I wrote.
patrikG
DevNet Master
Posts: 4235
Joined: Thu Aug 15, 2002 5:53 am
Location: Sussex, UK

Post by patrikG »

Archy wrote: Hmm, as far as I know, you can't make one in PHP; you would need a programming language such as C#.
Why is that? Acquaint yourself with PHP's cURL extension and regular expressions and you can write a spider easily.

Would be very interested in learning about a web-application you can't build in PHP, but only in C# or any other language.
Don't underestimate PHP's capabilities.
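
As a rough illustration of the cURL-plus-regular-expressions route, something like the following might do. It is a sketch under assumptions: the helper names `fetch_page`, `find_hrefs` and `find_title`, the timeout, and the user agent string are all made up for the example, and the cURL extension has to be installed for `fetch_page` to work. Regular expressions are also a fairly blunt tool for HTML, so expect to miss unusual markup.

```php
<?php
// Hypothetical helper: fetch a page with cURL; returns the body, or null on failure.
function fetch_page(string $url, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_USERAGENT      => 'ExampleSpider/0.1', // made-up user agent
    ]);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Hypothetical helper: pull the href values out of <a> tags with a regex.
function find_hrefs(string $html): array
{
    // Matches href="..." or href='...' inside an <a ...> tag, case-insensitively.
    preg_match_all('/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1/is', $html, $matches);
    return array_values(array_unique($matches[2]));
}

// Hypothetical helper: extract the page title, or null if there is none.
function find_title(string $html): ?string
{
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        return trim(html_entity_decode($m[1]));
    }
    return null;
}
```

Typical use would be `$html = fetch_page($url);` followed by `find_title($html)` and `find_hrefs($html)` whenever the fetch did not return null.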
EricS
Forum Contributor
Posts: 183
Joined: Thu Jul 11, 2002 12:02 am
Location: Atlanta, Ga

Post by EricS »

patrikG wrote:
Archy wrote: Hmm, as far as I know, you can't make one in PHP; you would need a programming language such as C#.
Why is that? Acquaint yourself with PHP's cURL extension and regular expressions and you can write a spider easily.

Would be very interested in learning about a web-application you can't build in PHP, but only in C# or any other language.
Don't underestimate PHP's capabilities.
I haven't been able to get a version of my spider that can spawn multiple copies of itself. I'm only able to spider one site at a time.
kettle_drum
DevNet Resident
Posts: 1150
Joined: Sun Jul 20, 2003 9:25 pm
Location: West Yorkshire, England

Post by kettle_drum »

rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

Also take a look at exec(), which lets you launch separate processes.
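
One way the exec() suggestion might look in practice is sketched below. Strictly speaking exec() starts new processes rather than forking the current one. The helper names `build_worker_command` and `spawn_workers`, and the worker script they launch, are hypothetical, and the command assumes a Unix-like shell. The output is redirected and the command ends with `&` so that exec() returns straight away instead of waiting for the worker to finish.

```php
<?php
// Hypothetical helper: build a shell command that runs a worker script in the
// background for one URL. Assumes a Unix-like shell (/dev/null, trailing &).
function build_worker_command(string $workerScript, string $url): string
{
    return sprintf(
        '%s %s %s > /dev/null 2>&1 &',
        escapeshellarg(PHP_BINARY),    // the PHP executable running this script
        escapeshellarg($workerScript), // script that spiders a single site
        escapeshellarg($url)           // passed to the worker as $argv[1]
    );
}

// Hypothetical helper: start one background worker per URL and
// return how many were launched.
function spawn_workers(string $workerScript, array $urls): int
{
    $started = 0;
    foreach ($urls as $url) {
        exec(build_worker_command($workerScript, $url));
        $started++;
    }
    return $started;
}
```

A call such as `spawn_workers('worker.php', ['http://example.com/', 'http://example.org/'])` would then start two workers side by side. With many sites you would want to cap how many run at once.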