Open html file, download images

JKM
Forum Contributor
Posts: 221
Joined: Tue Jun 17, 2008 8:12 pm

Open html file, download images

Post by JKM »

Hi there!

I want to open an HTML file (http://xx.se/album123.html), parse it, find all links that contain "/pix.php?source=", and wget/download the URL that follows source=:
"/pix.php?source=http://xx.se/albumfiles/Jfnb83njHfm2kJ.jpg"

- There might be multiple image links in the html file
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Re: Open html file, download images

Post by s.dot »

Is this legal behavior?

Anyway, you would open the file with file_get_contents() (among other ways).
Parse the file for links using a regular expression with preg_match_all().
Loop through the matched links and download each matched file, again using file_get_contents() or a similar function.

What have you tried?
JKM
Forum Contributor
Posts: 221
Joined: Tue Jun 17, 2008 8:12 pm

Re: Open html file, download images

Post by JKM »

s.dot wrote: Is this legal behavior?

Anyway, you would open the file with file_get_contents() (among other ways).
Parse the file for links using a regular expression with preg_match_all().
Loop through the matched links and download each matched file, again using file_get_contents() or a similar function.

What have you tried?
I see why you think it might be illegal behaviour, but these are my own images that I want to download.

I haven't coded anything for almost two years, and I've always been terrible with RegEx, so I might need some help with that. :p (I just need href="pix.php?source=X")

Thanks :)
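
For the href="pix.php?source=X" case described above, a pattern along these lines might be a starting point. This is only a sketch: the helper name extractSourceLinks() and the sample markup are made up for illustration, and it assumes the href value is quoted and that the image URL runs up to the closing quote.

```php
<?php

// Hypothetical helper: returns every URL that follows "pix.php?source=" inside an href.
function extractSourceLinks(string $html): array
{
    // href, a quote, anything up to pix.php?source=, then capture up to the matching quote (\1)
    $pattern = '/href\s*=\s*(["\'])[^"\']*pix\.php\?source=([^"\']+)\1/i';

    preg_match_all($pattern, $html, $matches);

    // Group 2 holds the captured URLs; HTML may encode & as &amp;, so decode entities
    return array_map('html_entity_decode', $matches[2]);
}

// Sample markup built from the link format in the question
$html = '<a href="/pix.php?source=http://xx.se/albumfiles/Jfnb83njHfm2kJ.jpg">pic</a>'
      . '<a href="/other.php">skip</a>';

print_r(extractSourceLinks($html));
```

Anchoring on href and on the literal pix.php keeps the pattern from picking up other query strings that merely contain "source=".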
Last edited by JKM on Tue Feb 19, 2013 5:40 am, edited 1 time in total.
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Re: Open html file, download images

Post by s.dot »

Well, some pseudocode might go a little bit like this:

Code:

<?php

// HTML file you want to open
$htmlFile = 'http://www.example.com/page.html';

// file_get_contents() returns false on failure, so compare strictly
$htmlFileContents = file_get_contents($htmlFile);

if ($htmlFileContents !== false)
{
    // echo $htmlFileContents; should show the source of the HTML file
    // attempt to match links: capture everything after "pix.php?source=" up to the closing quote
    preg_match_all('/pix\.php\?source=([^"\']+)/i', $htmlFileContents, $matches, PREG_SET_ORDER);

    if (!empty($matches))
    {
        // print_r($matches); see what you have here
        foreach ($matches as $match)
        {
            // with PREG_SET_ORDER, $match[1] holds the captured link
            $imageUrl = $match[1];
            echo $imageUrl, "\n";
            // use header() to download to client, or grab the file content to write to server
        }
    }
}
It would be something like that. That is the basic structure for what you want. The regular expression may be wrong and I don't know how you want to save the files.
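
Since the saving step is left open above, here is one way it could look if the files are written to the server. This is a sketch under assumptions, not a definitive implementation: the helpers localFileName() and saveImage() and the $targetDir parameter are hypothetical, the local file name is taken from the last path segment of the image URL, and fetching http:// URLs with file_get_contents() requires allow_url_fopen to be enabled.

```php
<?php

// Hypothetical helper: derive a local file name from an image URL.
function localFileName(string $imageUrl): ?string
{
    $path = parse_url($imageUrl, PHP_URL_PATH);
    if (!is_string($path)) {
        return null; // malformed URL, or no path component
    }
    // basename() drops any directory part, so the file stays inside the target directory
    $name = basename($path);
    return $name === '' ? null : $name;
}

// Hypothetical helper: fetch one image and write it into $targetDir.
function saveImage(string $imageUrl, string $targetDir): bool
{
    $name = localFileName($imageUrl);
    if ($name === null) {
        return false;
    }
    $data = @file_get_contents($imageUrl); // @ suppresses the warning; failure is reported via false
    if ($data === false) {
        return false;
    }
    return file_put_contents($targetDir . DIRECTORY_SEPARATOR . $name, $data) !== false;
}
```

Inside the foreach loop above, this could be called as saveImage($match[1], __DIR__), checking the boolean result to see which downloads failed.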