Download images from url list

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

idotcom
Forum Commoner
Posts: 69
Joined: Thu Mar 04, 2004 9:24 pm

Download images from url list

Post by idotcom »

Hi

I am trying to find the best way to go about downloading about 25,000 images from a text file url list.

Example: About 25K of these:
http://www.somewebsite.com/images/download/98348024.jpg

I'm not stealing these images; I just need to download all of them as an affiliate.

All the images are from the same website, and the average image size is probably 30-45kb.

Anyhow, does anyone have any ideas on how to do this automatically and without swamping the site?

I imagine it's going to be about 1gb of data. 8O


Thank you. :D
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

text file url list
If the URLs are each on their own line:

Code: Select all

define('SAVE_DIR', 'C:/images/');
$urls = file('list_of_urls.txt');
for ($i = 0, $j = count($urls); $i < $j; $i++) {
    // Check for empty lines *before* attempting the download,
    // and trim the newline so basename() gives a clean filename.
    $url = trim($urls[$i]);
    if ($url === '') {
        echo $i . '] line empty, skipped<br />';
        continue;
    }
    $img = file_get_contents($url);
    if ($img === false) {
        echo $i . '] download of ' . $url . ' failed<br />';
        continue;
    }
    $saveLocation = SAVE_DIR . basename($url);
    if (!$h = fopen($saveLocation, 'w')) {
        echo $i . '] couldn\'t open ' . $saveLocation . ' for writing<br />';
        continue;
    }
    fwrite($h, $img);
    fclose($h);
    echo $i . '] success!<br />';
}
idotcom
Forum Commoner
Posts: 69
Joined: Thu Mar 04, 2004 9:24 pm

Post by idotcom »

Hi

Thanks for the info. But how do you think this would do without some kind of pause? It looks like it will just run continuously until the list is done. Wouldn't the script time out? And I think that would put a strain on my server and the other site, no?


Thanks :D
aerodromoi
Forum Contributor
Posts: 230
Joined: Sun May 07, 2006 5:21 am

Post by aerodromoi »

idotcom wrote:Hi

Thanks for the info. But how do you think this would do without some kind of pause? It looks like it will just run continuously until the list is done. Wouldn't the script time out? And I think that would put a strain on my server and the other site, no?


Thanks :D
1GB of data always puts some strain on the server ;)

A few options to ease the problem:
a) restrict the number of downloads per call (e.g. download 300 images at a time, deleting the finished ones from the "to-do" list)
b) use cron jobs
c) use a combination of both (between 3 and 5am the load on both servers should be minimal)
d) ask your affiliate to send you a DVD with the images.
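A minimal sketch of option (a), combining it with the earlier script. This is an assumption-laden illustration, not tested against any real site: the function downloads at most a batch of URLs per run, keeps failed URLs for a later attempt, and returns what's still to do, so a cron job can persist the remainder back to the to-do file between runs. The names (`download_batch`, the 300 default, the quarter-second pause) are all illustrative.

```php
<?php
// Sketch of batched downloading: process at most $batchSize URLs per run
// and return the URLs still left to do. A cron job would call this
// repeatedly, writing the returned list back to the to-do file each time.
function download_batch(array $urls, $saveDir, $batchSize = 300)
{
    $batch = array_slice($urls, 0, $batchSize);
    $rest  = array_slice($urls, $batchSize);

    foreach ($batch as $url) {
        $url = trim($url);
        if ($url === '') {
            continue;
        }
        $img = @file_get_contents($url);
        if ($img === false) {
            $rest[] = $url; // keep failed URLs for a later run
            continue;
        }
        file_put_contents($saveDir . basename($url), $img);
        usleep(250000); // quarter-second pause to go easy on the remote host
    }
    return $rest;
}
```

A cron entry along the lines of `*/10 3-5 * * * php batch.php` would then chip away at the list during the low-traffic window aerodromoi mentions.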
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

Yeah, put a flush() at the top of the for loop.

Then run it for as long as it will run. Because it prints out $i, you can see where it stops.
Then update the script to start from where it left off:

Code: Select all

for($i=242...
repeat :)
And I think that would put a strain on my server and the other site, no?
It'll only go as fast as your connection speed allows, and how else are you going to do it?
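A rough sketch of this resume idea, rather than editing the loop's start index by hand each time: lift PHP's execution time limit, begin from a given index, and print each index so you know where to resume. The function shape, directory names, and one-second throttle are illustrative assumptions, not anything from the original posts.

```php
<?php
// Sketch of the manual-resume approach: start from a given index
// (e.g. 242) so a timed-out run can be continued where it stopped.
set_time_limit(0); // lift max_execution_time for long runs

function resume_download(array $urls, $start, $saveDir)
{
    $last = $start - 1;
    for ($i = $start, $j = count($urls); $i < $j; $i++) {
        $url = trim($urls[$i]);
        if ($url === '') {
            continue;
        }
        $img = @file_get_contents($url);
        if ($img !== false) {
            file_put_contents($saveDir . basename($url), $img);
        }
        echo $i . "<br />\n"; // progress marker: note the last printed index
        flush();              // push output out immediately
        $last = $i;
        sleep(1);             // throttle: roughly one request per second
    }
    return $last;
}
```

At one request per second, 25,000 images would take around seven hours, which is another argument for running it overnight.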