Feyd!
I've fixed the file opening code using file().
I get the following output:
Code:
Array (
  [0] => Array ( [0] => href="/imghp?hl=en&tab=wi&ie=UTF-8"> [1] => href="/grphp?hl=en&tab=wg&ie=UTF-8"> [2] => href="/nwshp?hl=en&tab=wn&ie=UTF-8"> [3] => href="/options/index.html" [4] => href=/advanced_search?hl=en> [5] => href=/preferences?hl=en> [6] => href=/language_tools?hl=en> [7] => href="/ads/"> [8] => href=/services/> [9] => href=/intl/en/about.html> [10] => href=http://www.google.co.uk/jobs/> [11] => href=http://www.google.com/ncr> )
  [1] => Array ( [0] => href [1] => href [2] => href [3] => href [4] => href [5] => href [6] => href [7] => href [8] => href [9] => href [10] => href [11] => href )
  [2] => Array ( [0] => " [1] => " [2] => " [3] => " [4] => [5] => [6] => [7] => " [8] => [9] => [10] => [11] => )
  [3] => Array ( [0] => /imghp?hl=en&tab=wi&ie=UTF-8 [1] => /grphp?hl=en&tab=wg&ie=UTF-8 [2] => /nwshp?hl=en&tab=wn&ie=UTF-8 [3] => /options/index.html [4] => /advanced_search?hl=en [5] => /preferences?hl=en [6] => /language_tools?hl=en [7] => /ads/ [8] => /services/ [9] => /intl/en/about.html [10] => http://www.google.co.uk/jobs/ [11] => http://www.google.com/ncr )
)
With the following code:
Code:
<?php
$filename = $_GET['url'];
$contents = implode('', file($filename));
// Retrieve all URLs from the HTML
$urls = array( 'href' ); // resolve these attributes from the text
$urls = implode( '|', $urls );
preg_match_all( '#\s+?(' . $urls . ')\s*?=\s*?([\'"]?)(.*?)\\2[\s\>]#is', $contents, $matches );
print_r($matches);
?>
This looks right! However, the result is a two-dimensional array, and some of the sub-arrays seem to store just 'href', while others store only the quote character (").
Is there any way to get a flat one-dimensional array in which each element holds just the <contents> part of HREF="<contents>"?
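To make the question concrete, here is a minimal sketch of what I think I'm after. It assumes preg_match_all() is left on its default PREG_PATTERN_ORDER behaviour, where $matches[N] collects whatever capture group N matched, so the third group would already be the flat list. The sample $contents string and the $hrefs variable name are made up for illustration and are not part of my real script.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$contents = '<a href="/imghp?hl=en">Images</a> <a href=/preferences?hl=en>Preferences</a>';

// Same pattern as in my code: group 1 = attribute name,
// group 2 = optional opening quote, group 3 = attribute value.
preg_match_all('#\s+?(href)\s*?=\s*?([\'"]?)(.*?)\\2[\s\>]#is', $contents, $matches);

// Assumption: with the default PREG_PATTERN_ORDER flag, $matches[3] holds
// the value captured by group 3 for every match, one element per link.
$hrefs = $matches[3];

print_r($hrefs);
```

If that assumption holds, $hrefs should contain only the two link targets, with no 'href' or quote entries mixed in.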
I'd really appreciate your help on this, as it would let me continue with my project.
Cheers
Mark