Extract image and page number from HTML
Posted: Mon Jul 18, 2011 9:25 am
I need to extract images and their corresponding page numbers from HTML using PHP. The page numbers and images do not come out in the correct order. Please help me fix this.
I only need the page numbers and the image file names.
HTML code
Code: Select all
<HTML>
<HEAD>
<TITLE></TITLE>
</HEAD>
<BODY>
<A name=1></a><IMG src="Dome-Tome1-StephenKing-1_1.jpg"><br>
<IMG src="Dome-Tome1-StephenKing-3_1.jpg">
<hr>
<A name=2></a>© Éditions Albin Michel, 2011<br>
pour la traduction française<br>
ISBN : 978-2-226-22437-8<br>
<hr>
<A name=3></a>Aux Éditions Albin Michel<br>
DÔME, tome 1, 2011<br>
<hr>
<A name=4></a><hr>
<A name=5></a>SEL<br>
<hr>
<A name=6></a>1<br>
Les deux femmes ..... etc
<A name=343></a><IMG src="Dome-Tome1-StephenKing-343_1.jpg"><br>
ressemblait tellement à etc
</BODY>
</HTML>
PHP code
Code: Select all
<?php
$myFile = 'Dome-Tome1-StephenKings.html';
$content = file($myFile);

// Number of lines in the file
$numLines = count($content);
//echo $numLines;

// Process each line
for ($i = 0; $i < $numLines; $i++) {
    // Trim the carriage return and/or line feed at the end of the line
    $line = trim($content[$i]);

    // Print the name attribute of every <a> tag on this line (the page number)
    $re = "<a\s[^>]*name\s*=\s*(['\"]??)([^'\">]*?)\\1[^>]*>(.*)<\/a>";
    preg_match_all("/$re/siU", $line, $matches);
    foreach ($matches[2] as $key => $value1) {
        echo $value1 . "<br>";
    }

    // Print the src attribute of <img> tags on this line
    preg_match_all('/<img .*src=["|\']([^"|\']+)/i', $line, $matches);
    foreach ($matches[1] as $key => $value2) {
        echo $value2 . "<br>";
    }
}
?>
Output
Code: Select all
Dome-Tome1-StephenKing-1_1.jpg
1
Dome-Tome1-StephenKing-3_1
Dome-Tome1-StephenKing-343_1.jpg
343
Needed output
Code: Select all
Dome-Tome1-StephenKing-1_1.jpg
Dome-Tome1-StephenKing-3_1
1
Dome-Tome1-StephenKing-343_1.jpg
343
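One possible approach, offered as a sketch rather than a tested fix: read the document as one string instead of line by line, scan for page anchors and image tags in document order, and attach each image to the most recent anchor. The function name `imagesByPage` and the inline `$html` sample are made up for illustration; in practice `$html` would probably come from `file_get_contents($myFile)`. The sketch assumes page anchors always look like `<A name=N>` with a numeric N and that `src` values are quoted.

```php
<?php
/**
 * Group image sources by the page anchor that precedes them.
 * Hypothetical helper: returns [pageNumber => [src, ...]] and only
 * contains pages that have at least one image.
 */
function imagesByPage(string $html): array
{
    // Match either a page anchor (<A name=N>) or an <IMG src="...">,
    // in the order they appear in the document.
    $pattern = '/<a\s[^>]*\bname\s*=\s*["\']?(\d+)["\']?[^>]*>'
             . '|<img\s[^>]*\bsrc\s*=\s*["\']([^"\']+)["\']/i';
    preg_match_all($pattern, $html, $tokens, PREG_SET_ORDER);

    $pages = [];
    $current = null; // page number of the most recent anchor
    foreach ($tokens as $token) {
        if (($token[2] ?? '') !== '') {
            // Image tag: attach it to the current page, if any.
            if ($current !== null) {
                $pages[$current][] = $token[2];
            }
        } else {
            // Anchor tag: start a new page.
            $current = (int) $token[1];
        }
    }
    return $pages;
}

// Shortened sample based on the HTML in this post (assumption: the
// real input would be the whole file read into one string).
$html = <<<'HTML'
<A name=1></a><IMG src="Dome-Tome1-StephenKing-1_1.jpg"><br>
<IMG src="Dome-Tome1-StephenKing-3_1.jpg">
<hr>
<A name=2></a>text<br>
<hr>
<A name=343></a><IMG src="Dome-Tome1-StephenKing-343_1.jpg"><br>
HTML;

// Print the images of each page first, then the page number.
foreach (imagesByPage($html) as $page => $images) {
    foreach ($images as $src) {
        echo $src, "<br>\n";
    }
    echo $page, "<br>\n";
}
```

Pages without images (page 2 in the sample) are skipped because they never get an entry in the array. This prints the full `src` value, so the second image comes out as `Dome-Tome1-StephenKing-3_1.jpg`.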