checking if files exist

mistwist
Forum Newbie
Posts: 4
Joined: Sat Nov 13, 2004 1:32 pm

Post by mistwist »

I run a link site. People submit their pages (always a single page at a time) for me to post on my website, but before I post them I want to check for broken images on that page. I have no way of knowing the names of the images, though they will always be JPG files. I know it's possible, but I'm having no luck figuring out how to do it.

Any help?
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Post by Weirdan »

Fetch the page into a variable using curl, then use preg_match_all to extract all the <img> tags, and then run curl on each image URL you find to check that every one of them returns an HTTP 200 OK response.
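
A rough sketch of the preg_match_all step, in case it helps. The helper name find_img_srcs() and the sample $html string are made up for illustration, and the regex is only a heuristic for quoted src attributes, not a full HTML parser.

```php
<?php
// Hypothetical helper (not from the thread): collect the src value of
// every <img> tag that uses a quoted src attribute.
function find_img_srcs($html)
{
    // Matches src="..." or src='...' inside an <img> tag, case-insensitively.
    // The \1 backreference requires the same quote character at both ends.
    $pattern = '/<img\b[^>]*?\ssrc\s*=\s*(["\'])(.*?)\1/is';
    preg_match_all($pattern, $html, $matches);

    return $matches[2]; // second capture group: the URL itself
}

// Made-up sample markup with mixed case and both quote styles.
$html = '<p><IMG SRC="pics/one.jpg" alt="1"><img alt="2" src=\'http://example.com/Two.JPG\'></p>';
print_r(find_img_srcs($html));
```

The URLs come back with their original case, which matters because paths on many servers are case-sensitive.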
mistwist
Forum Newbie
Posts: 4
Joined: Sat Nov 13, 2004 1:32 pm

Post by mistwist »

Nope, that won't work. The server does not have the curl extension loaded :(

Got me all excited... lol
rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

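Code:

// Needs allow_url_fopen. The HTTP wrapper fills $http_response_header
// with the response headers; fopen() returns false if the request fails.
$fp = @fopen($url, 'r');
if ($fp) {
    fclose($fp);
}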

print_r($http_response_header);
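
To turn that header dump into a yes/no answer, something like the following might work. This is only a sketch: final_status_code() is a made-up name, and the sample array just stands in for what $http_response_header can look like after a redirect.

```php
<?php
// Hypothetical helper (not from the thread): pull the final HTTP status
// code out of an array shaped like $http_response_header.
function final_status_code($headers)
{
    $code = false;
    foreach ($headers as $line) {
        // Status lines look like "HTTP/1.1 200 OK". After a redirect there
        // is one per response, so the last match is the one that counts.
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }

    return $code; // false if no status line was found
}

// Sample data standing in for $http_response_header after a redirect.
$sample = array(
    'HTTP/1.1 301 Moved Permanently',
    'Location: http://example.com/pic.jpg',
    'HTTP/1.1 200 OK',
    'Content-Type: image/jpeg',
);
var_dump(final_status_code($sample) === 200); // bool(true)
```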
mistwist
Forum Newbie
Posts: 4
Joined: Sat Nov 13, 2004 1:32 pm

Post by mistwist »

rehfeld,
Doesn't that just return the header info of the URL itself? Type, server, 404, 200, etc.

I want to check for broken images on the supplied URL...
rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

mistwist wrote:
rehfeld,
Doesn't that just return the header info of the URL itself? Type, server, 404, 200, etc.

I want to check for broken images on the supplied URL...

Well, you said you didn't have curl, so this is another way to check the headers.

Just check for a 200 OK on each of the image URLs.
mistwist
Forum Newbie
Posts: 4
Joined: Sat Nov 13, 2004 1:32 pm

Post by mistwist »

OK, I see where you're going... But unfortunately it won't work. This is part of a form, and the form does not require the image names to be entered, so it has no way of knowing the image names.
rehfeld
Forum Regular
Posts: 741
Joined: Mon Oct 18, 2004 8:14 pm

Post by rehfeld »

Well, you have to fetch the page, then get all the image tags, and then check each image.

You still need to resolve relative URLs into absolute URLs so you can check them, which isn't very hard. You'll also need to check the server response, but this should get you going:

Code:

<?php

// Return every <img ...> tag found in the document, original case preserved.
function extract_img_tags($document)
{
    $img_tags = array();
    $pointer  = 0;

    // stripos() finds <IMG as well as <img without lowercasing the URLs.
    while (false !== ($open_pos = stripos($document, '<img', $pointer))) {
        $close_pos = strpos($document, '>', $open_pos);
        if (false === $close_pos) {
            break; // unterminated tag: stop instead of looping forever
        }
        $close_pos++;
        $pointer    = $close_pos;
        $img_tags[] = substr($document, $open_pos, $close_pos - $open_pos);
    }

    return $img_tags;
}

// Return the value of the tag's src attribute, or false if there is none.
function extract_img_url($tag)
{
    // Try a double-quoted value first, then a single-quoted one.
    foreach (array('"', "'") as $quote) {
        $start_pos = stripos($tag, 'src=' . $quote);
        if (false === $start_pos) {
            continue;
        }
        $url_begin = $start_pos + 5; // skip past src= and the opening quote
        $url_end   = strpos($tag, $quote, $url_begin);
        if (false === $url_end) {
            continue; // missing closing quote
        }
        return substr($tag, $url_begin, $url_end - $url_begin);
    }

    return false;
}

$page = 'http://cnn.com';

// Needs allow_url_fopen; returns false if the page cannot be fetched.
$document = @file_get_contents($page);
if (false === $document) {
    exit("Could not fetch $page\n");
}

$img_urls = array();
foreach (extract_img_tags($document) as $tag) {
    $url = extract_img_url($tag);
    if (false !== $url) {
        $img_urls[] = $url;
    }
}

print_r($img_urls);

?>
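
For the relative-to-absolute step mentioned above, here is one possible sketch. resolve_url() is a made-up helper, not something from this thread or from PHP itself. It assumes $base is a well-formed absolute http URL, and it deliberately ignores <base href>, query strings containing slashes, and the other corner cases of full RFC 3986 resolution.

```php
<?php
// Hypothetical helper (not from the thread): turn an image src into an
// absolute URL, relative to the page it was found on. Simplified on purpose.
function resolve_url($base, $rel)
{
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $rel)) {
        return $rel;                              // already absolute
    }
    $b    = parse_url($base);
    $root = $b['scheme'] . '://' . $b['host']
          . (isset($b['port']) ? ':' . $b['port'] : '');
    if (substr($rel, 0, 2) === '//') {
        return $b['scheme'] . ':' . $rel;         // protocol-relative
    }
    if (substr($rel, 0, 1) !== '/') {
        // Relative to the directory of the base page.
        $path = isset($b['path']) ? $b['path'] : '/';
        $rel  = substr($path, 0, strrpos($path, '/') + 1) . $rel;
    }
    // Collapse "." and ".." segments.
    $out = array();
    foreach (explode('/', $rel) as $seg) {
        if ($seg === '..') {
            array_pop($out);
        } elseif ($seg !== '.' && $seg !== '') {
            $out[] = $seg;
        }
    }

    return $root . '/' . implode('/', $out);
}

// Made-up page URL and src values, just to show the call.
$page = 'http://example.com/links/page.html';
foreach (array('img/a.jpg', '../a.jpg', '/top/b.jpg') as $src) {
    echo resolve_url($page, $src), "\n";
}
```

Each resolved URL can then be fetched on its own to see whether the server answers with a 200.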