Help! Runaway Crawler!!

Posted: Sun Dec 02, 2007 11:37 am
by beserker
I've just been developing a simple web crawler today, but it went "off-piste" and started crawling the entire web (well, it got as far as Google before I pressed 'stop' in my browser).

The thing is I'm worried that the script's still running even though I did press stop - can anyone clarify this?

The setup is this - the script basically starts at a certain URL, gets the links out of that page, then follows them one by one, getting more links, etc. - each time it finds a new link it print()s it to the browser - so I sit there watching the links appear as the script runs. The core component of it is a recursive loop with no stopping condition, so I'm worried that it'll never stop...
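
For anyone reading along, here is a minimal sketch of how a recursive crawl like this could be kept on a leash. It is not the poster's actual script. The function names (extractLinks, crawl), the depth limit, and the same-host check are all assumptions made for illustration. The idea is simply that each recursive call checks a depth counter, a list of already-visited URLs, and the host name before going any further.

```php
<?php
// Hypothetical helper: pulls absolute http(s) links out of an HTML string.
function extractLinks(string $html): array
{
    if ($html === '') {
        return [];
    }
    $doc = new DOMDocument();
    // Suppress the warnings that malformed real-world markup triggers.
    @$doc->loadHTML($html);
    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if (preg_match('#^https?://#i', $href)) {
            $links[] = $href;
        }
    }
    return $links;
}

// Depth-limited, same-host crawl. $fetch is any callable that returns the
// page body for a URL, or false on failure (for example 'file_get_contents').
function crawl(string $url, callable $fetch, string $host, int $maxDepth, array &$seen, int $depth = 0): void
{
    // Stop conditions: too deep, already visited, or off the starting host.
    if ($depth > $maxDepth
        || isset($seen[$url])
        || parse_url($url, PHP_URL_HOST) !== $host) {
        return;
    }
    $seen[$url] = true;
    print $url . "\n";

    $html = $fetch($url);
    if ($html === false) {
        return;
    }
    foreach (extractLinks($html) as $link) {
        crawl($link, $fetch, $host, $maxDepth, $seen, $depth + 1);
    }
}

// Example call (commented out because it makes real HTTP requests):
// $seen = [];
// crawl('http://example.com/', 'file_get_contents', 'example.com', 2, $seen);
```

With a guard like this the crawl ends on its own once every reachable page on the starting host within the depth limit has been visited, and it never follows a link out to Google in the first place.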

Posted: Sun Dec 02, 2007 12:10 pm
by feyd
Check the processes on the machine. If a PHP process is active (and likely eating the processor), chances are the script is still running. Use your operating system's favored process-killing program to kill it.

Posted: Sun Dec 02, 2007 12:26 pm
by beserker
That's it - I can't - it's on a remote host (shared hosting) and the support guy said that he "couldn't stop PHP" and that I had to send a support email in... brilliant.

Posted: Sun Dec 02, 2007 12:35 pm
by feyd
So.. do it. :?

Posted: Sun Dec 02, 2007 1:18 pm
by beserker
I have... no reply yet... but what I was wondering about was whether the script would keep executing after pressing 'stop' in the browser.

Posted: Sun Dec 02, 2007 2:15 pm
by John Cartwright
Possibly, but it has likely timed out by now, depending on your configuration for ignore_user_abort and max_execution_time.

Posted: Sun Dec 02, 2007 3:56 pm
by beserker
Thanks - I suppose the maximum execution time would have cut it off anyway then - I hadn't heard of that before.

What's that ignore user abort thing though? I couldn't see anything like that in php.ini
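
For reference, the two settings in question are ignore_user_abort and max_execution_time. Both are php.ini directives, and both can be set at runtime with ignore_user_abort() and set_time_limit(). The sketch below shows how a script that prints links as it goes might use them. The printLinks function and the 30 second limit are assumptions for illustration only. Note that PHP only notices a dropped connection when it tries to send output, so the check comes after a flush(). Also, on non-Windows systems the time limit counts script execution time only, so time spent waiting on network reads may not count toward it.

```php
<?php
// Hypothetical helper: prints each link and stops early if the browser
// has gone away. Returns the number of links actually printed.
function printLinks(array $links): int
{
    // false is the default: a client disconnect (pressing Stop) aborts the script.
    ignore_user_abort(false);
    // Cap this script at 30 seconds, like max_execution_time = 30 in php.ini.
    set_time_limit(30);

    $printed = 0;
    foreach ($links as $link) {
        print $link . "<br>\n";
        $printed++;
        // Push output to the client so PHP can detect a closed connection.
        // If output buffering is on, ob_flush() would be needed here as well.
        flush();
        if (connection_aborted()) {
            break;
        }
    }
    return $printed;
}
```

If ignore_user_abort is off (the default) and the script keeps sending output, pressing Stop should end it at the next write. If the script goes quiet for long stretches, the execution time limit is the fallback.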