
Getting a URL's publish/change date programmatically

Posted: Tue Feb 10, 2009 2:05 pm
by eastsidedev
I am starting to work with web pages, and I need to programmatically get a page's creation date and/or modification date. Would anyone have a code fragment (or a pointer) that can do that?

Regards,
Joseph

Re: Getting a URL's publish/change date programmatically

Posted: Tue Feb 10, 2009 2:35 pm
by Benjamin

Code: Select all

 
filemtime('/path/to/file');
 

Re: Getting a URL's publish/change date programmatically

Posted: Tue Feb 10, 2009 5:36 pm
by eastsidedev
astions wrote:

Code: Select all

 
filemtime('/path/to/file');
 
This just tells me how to get a file's date/time, not how I can get a web page's creation date/time. Note that all I have are the URLs.

Re: Getting a URL's publish/change date programmatically

Posted: Tue Feb 10, 2009 5:47 pm
by John Cartwright
Can you be specific about what you mean by a webpage? Does "webpage" mean individual files on a domain, or the domain itself?

http://ca.php.net/manual/en/function.fi ... .php#73747 can look up the last-modified time of remote files
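For remote URLs, one way to read the modification time without downloading the page body is to send a HEAD request and look for the Last-Modified header. The following is only a minimal sketch, not the code from the linked manual comment: the helper names makeHeadContext and remoteLastModified are made up for illustration, and it assumes PHP 7.1 or later (for the context argument of get_headers) with allow_url_fopen enabled.

```php
<?php
// Hypothetical helper: build a stream context that makes PHP's http
// wrapper send HEAD instead of GET, so only headers are transferred.
function makeHeadContext(int $timeoutSeconds = 10)
{
    return stream_context_create([
        'http' => [
            'method'  => 'HEAD',           // ask for headers only
            'timeout' => $timeoutSeconds,  // give up on slow servers
        ],
    ]);
}

// Hypothetical helper: return the Last-Modified value as a Unix timestamp,
// or null when the request fails or the server does not send the header.
function remoteLastModified(string $url): ?int
{
    $headers = @get_headers($url, true, makeHeadContext());
    if ($headers === false) {
        return null;
    }
    // Header names may arrive in any letter case, so normalize the keys.
    $headers = array_change_key_case($headers, CASE_LOWER);
    if (!isset($headers['last-modified'])) {
        return null;
    }
    // After redirects the value can be an array; use the final response's.
    $value = $headers['last-modified'];
    if (is_array($value)) {
        $value = end($value);
    }
    $time = strtotime($value);
    return $time === false ? null : $time;
}
```

Calling remoteLastModified() with one of your URLs returns either a timestamp you can pass to date(), or null when the server does not disclose the header. Note that this only ever yields a modification time; HTTP has no standard header for a page's creation date.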

Re: Getting a URL's publish/change date programmatically

Posted: Tue Feb 10, 2009 8:07 pm
by eastsidedev
John Cartwright wrote:Can you be specific about what you mean by a webpage? Does "webpage" mean individual files on a domain, or the domain itself?

http://ca.php.net/manual/en/function.fi ... .php#73747 can look up the last-modified time of remote files
Here's what I mean. I have a bunch of URLs:
http://www.microsoft.com/security/default.mspx
http://www.cisco.com/en/US/products/hw/ ... index.html
etc.....

and I would like to know when these pages were created or modified.

Re: Getting a URL's publish/change date programmatically

Posted: Tue Feb 10, 2009 8:22 pm
by John Cartwright
Did you look at the link I gave you?

Re: Getting a URL's publish/change date programmatically

Posted: Tue Feb 10, 2009 8:31 pm
by eastsidedev
John Cartwright wrote:Did you look at the link I gave you?
Oops. Need new glasses; for some reason I thought this was part of your signature. I will try the link. Thx.

Re: Getting a URL's publish/change date programmatically

Posted: Wed Feb 11, 2009 3:45 pm
by eastsidedev
John Cartwright wrote:Did you look at the link I gave you?
I looked at it and tried it. It does not work, because there is no file to open if I don't know the exact default file name (index.php, index.html, default.html, etc.).

Re: Getting a URL's publish/change date programmatically

Posted: Wed Feb 11, 2009 3:59 pm
by John Cartwright
You do not need to specify an actual file, only an HTTP address. Thinking about it more, though, it is up to the server to disclose the Last-Modified header. In fact, when I tested it on this forum, the server did not report a last-modified date. You can examine the headers to see for yourself.

Code: Select all

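function GetRemoteLastModified( $uri )
{
    // default: 0 means no Last-Modified header was found
    $unixtime = 0;
   
    // for an http:// URI this sends a request; the headers arrive on open
    $fp = fopen( $uri, "r" );
    if( !$fp ) {return $unixtime;}
   
    // wrapper_data holds the raw response header lines
    $MetaData = stream_get_meta_data( $fp );
 
    foreach( $MetaData['wrapper_data'] as $response )
    {
        // case: redirection (assumes the Location value is an absolute URL)
        if( substr( strtolower($response), 0, 10 ) == 'location: ' )
        {
            $newUri = trim( substr( $response, 10 ) );
            fclose( $fp );
            return GetRemoteLastModified( $newUri );
        }
        // case: last-modified
        elseif( substr( strtolower($response), 0, 15 ) == 'last-modified: ' )
        {
            // parse the HTTP date; keep the default if it cannot be parsed
            $parsed = strtotime( substr($response, 15) );
            if( $parsed !== false ) {$unixtime = $parsed;}
            break;
        }
    }
    fclose( $fp );
    return $unixtime;
}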
 
echo GetRemoteLastModified('http://forums.devnetwork.net');
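To check whether a server discloses the header at all, it can help to separate the parsing from the network request, so the logic can be tried on a hand-written list of header lines. This is a rough sketch under that assumption; lastModifiedFromHeaders is a hypothetical name, and the input is expected to look like the $MetaData['wrapper_data'] array used in the function above.

```php
<?php
// Hypothetical helper: scan raw header lines and return the last
// Last-Modified value as a Unix timestamp, or null if none is present.
function lastModifiedFromHeaders(array $headerLines): ?int
{
    $found = null;
    foreach ($headerLines as $line) {
        // Split "Name: value"; status lines like "HTTP/1.1 200 OK" have no colon.
        $parts = explode(':', $line, 2);
        if (count($parts) < 2) {
            continue;
        }
        // Header names are case-insensitive, so compare in lower case.
        if (strtolower(trim($parts[0])) !== 'last-modified') {
            continue;
        }
        $time = strtotime(trim($parts[1]));
        if ($time !== false) {
            // Later lines overwrite earlier ones, so the final response
            // in a redirect chain wins.
            $found = $time;
        }
    }
    return $found;
}
```

Returning null rather than 0 makes "the server did not send the header" distinguishable from a real timestamp. For dynamically generated pages the header is often missing, so a null result here does not necessarily indicate a bug in the calling code.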

Re: Getting a URL's publish/change date programmatically

Posted: Wed Feb 11, 2009 4:40 pm
by eastsidedev
Yes, I kept getting a 0.