I'm really stuck. Please help me.

AlienSD
Forum Newbie
Posts: 3
Joined: Wed May 14, 2003 2:08 pm

Post by AlienSD »

I have the script below, which finds all the links in a string ($new_str) and tries to open each link (in the foreach statement) to get its content.

I'm stuck on getting each URL's content. Please help me.

Code: Select all

<?php

$new_str = 
"
<a href=\"http://www1.folha.uol.com.br/folha/brasil/ult96u49003.shtml\">Link 1</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/mundo/ult94u56767.shtml\">Link 2</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/dinheiro/ult91u67128.shtml\">Link 3</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/cotidiano/ult95u74774.shtml\">Link 4</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/esporte/ult92u59688.shtml\">Link 5</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/ilustrada/ult90u32944.shtml\">Link 6</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/informatica/ult124u12907.shtml\">Link 7</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/ciencia/ult306u9093.shtml\">Link 8</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/educacao/ult305u12840.shtml\">Link 9</a>
";

//Let's say I don't know how many links the string has, so I do this

$abc = preg_match_all("/<a.+?href\s*=\s*([\"']?)(.+?)*?>/is", $new_str, $match_l);

for($aa=0;$aa<=$abc;$aa++) {
foreach($match_l as $value) {


//Here I extract the "junk" to have only the URL

$str_url = eregi_replace("\"","",$value[$aa]);
$str_url = eregi_replace("<a href=","",$str_url);
$str_url = eregi_replace(">","",$str_url);

print "$str_url<br>";


//Here I try to open the URL to get its content

if(!($File=fopen($str_url,"r"))) 
{ 
echo "There was a problem fetching the content requested."; 
exit; 
} 

while(!feof($File)) 
{ 
$Line.=fgets($File,2000055); 
} 
fclose($File); 

$startN="<!--NOTICIA-->"; 
$endN="<!--/NOTICIA-->"; 
$start_positionN=strpos($Line, $startN); 
$end_positionN=strpos($Line, $endN)+strlen($endN); 
$lengthN=$end_positionN-$start_positionN; 

$Line=substr($Line, $start_positionN, $lengthN); 

echo "<br><br>$Line<BR><BR>";

}
}

?>
The problem is that it only gets the first URL's content. When it tries to move on to the next URL, it stops.
patrikG
DevNet Master
Posts: 4235
Joined: Thu Aug 15, 2002 5:53 am
Location: Sussex, UK

Post by patrikG »

/s as a regEx-modifier is broken in PHP. You have to use "\r\n" for that.

So your regEx would look like this

Code: Select all

$abc = preg_match_all("/<a.+?href\s*=\s*([\"']?)(.+?)*?>[\r\n]*/is", $new_str, $match_l);
Also, bear in mind that preg_match_all returns a two-dimensional array.

http://www.php.net/manual/en/function.p ... ch-all.php - the user-notes are very handy.
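
On the s modifier: in PCRE, s (PCRE_DOTALL) only controls whether the dot matches a newline, so its effect can be checked in isolation. A minimal sketch, assuming a made-up anchor tag that spans two lines (the example.com URL and the simplified pattern are illustrative, not from the thread):

```php
<?php
// Hypothetical sample: an anchor tag whose attribute starts on a second line.
$html = "<a\nhref=\"http://example.com/a.shtml\">Link</a>";

// Without s, the dot does not match "\n", so ".+?" cannot cross the line break.
$without = preg_match('/<a.+?href\s*=\s*"(.+?)"/i', $html, $m1);

// With s (PCRE_DOTALL), the dot also matches "\n".
$with = preg_match('/<a.+?href\s*=\s*"(.+?)"/is', $html, $m2);

echo "without s: $without\n";
echo "with s: $with, url: {$m2[1]}\n";
```

If both patterns behave the same on your input, the links probably do not contain line breaks and the modifier is not what is stopping the script.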

Post by AlienSD »

The expression works fine to get all the URLs.

What I can't understand is why $str_url inside the foreach loop, which is inside the for loop, does not receive any values after the first pass. I can't see anything wrong in the code.
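
One way to see what the nested loops actually do is to collect the strings that would be handed to fopen() instead of opening them. The sketch below is only an approximation of the posted script: it assumes a simplified two-group pattern, two example.com links, and str_replace in place of eregi_replace.

```php
<?php
// Two sample links standing in for the nine in the original post.
$new_str = '<a href="http://example.com/1.shtml">Link 1</a><br>' . "\n"
         . '<a href="http://example.com/2.shtml">Link 2</a>';

// Assumed pattern with two groups: group 1 = quote character, group 2 = URL.
$count = preg_match_all('/<a\s+href\s*=\s*(["\']?)(.+?)\1>/is', $new_str, $match_l);

// Default layout (PREG_PATTERN_ORDER):
//   $match_l[0] = all full matches
//   $match_l[1] = all group-1 captures (the quote characters)
//   $match_l[2] = all group-2 captures (the URLs)
$aa = 0;                       // first pass of the outer for loop
$candidates = array();
foreach ($match_l as $value) { // walks the groups, not the links
    // Same clean-up as in the post: strip quotes, "<a href=" and ">".
    $candidates[] = str_replace(array('"', '<a href=', '>'), '', $value[$aa]);
}

print_r($candidates);
```

The second candidate is an empty string, because $match_l[1][0] is just the quote character. In the posted script that empty string would reach fopen(), which fails, and the exit in the error branch ends the whole run before the second link is tried.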
volka
DevNet Evangelist
Posts: 8391
Joined: Tue May 07, 2002 9:48 am
Location: Berlin, ger

Post by volka »

I think patrikG is right, it's because of the structure of the matches array.
Try:

Code: Select all

<?php

$new_str =
"
<a href=\"http://www1.folha.uol.com.br/folha/brasil/ult96u49003.shtml\">Link 1</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/mundo/ult94u56767.shtml\">Link 2</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/dinheiro/ult91u67128.shtml\">Link 3</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/cotidiano/ult95u74774.shtml\">Link 4</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/esporte/ult92u59688.shtml\">Link 5</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/ilustrada/ult90u32944.shtml\">Link 6</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/informatica/ult124u12907.shtml\">Link 7</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/ciencia/ult306u9093.shtml\">Link 8</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/educacao/ult305u12840.shtml\">Link 9</a>
";

//Let's say I don't know how many links the string has, so I do this

$abc = preg_match_all("/<a.+?href\s*=\s*([\"']?)(.+?)*?>/is", $new_str, $match_l);
// $match_l[0] is the array of all full-pattern matches; every other $match_l[N] has the same number of elements
foreach($match_l[0] as $key=>$wholematch)
{
	echo '<fieldset><legend>match #', $key, '</legend>';
	foreach($match_l as $pos=>$field)
		echo 'substring #', $pos, '=', htmlentities($field[$key]), '<br />'; // the same as $match_l[$pos][$key]
	echo '</fieldset>';
}
?>
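
As an alternative to indexing every group array by $key, preg_match_all also accepts the PREG_SET_ORDER flag, which returns one sub-array per match. A small sketch, assuming two example.com links and a simplified pattern with the URL in group 1 and the link text in group 2 (neither is from the thread):

```php
<?php
$new_str = '<a href="http://example.com/1.shtml">Link 1</a><br>' . "\n"
         . '<a href="http://example.com/2.shtml">Link 2</a>';

// PREG_SET_ORDER: $sets[0] holds everything about match 0, $sets[1] about match 1, ...
preg_match_all('/<a\s+href\s*=\s*"([^"]+)"[^>]*>(.*?)<\/a>/is', $new_str, $sets, PREG_SET_ORDER);

$links = array();
foreach ($sets as $set) {
    // $set[0] = full match, $set[1] = URL, $set[2] = link text
    $links[] = array('url' => $set[1], 'text' => $set[2]);
    echo htmlentities($set[1]), ' => ', htmlentities($set[2]), "\n";
}
```

With this layout a single foreach visits each link exactly once, so no outer counter is needed.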

Post by patrikG »

The code below works:

Code: Select all

<?php

$new_str =
"
<a href=\"http://www1.folha.uol.com.br/folha/brasil/ult96u49003.shtml\">Link 1</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/mundo/ult94u56767.shtml\">Link 2</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/dinheiro/ult91u67128.shtml\">Link 3</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/cotidiano/ult95u74774.shtml\">Link 4</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/esporte/ult92u59688.shtml\">Link 5</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/ilustrada/ult90u32944.shtml\">Link 6</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/informatica/ult124u12907.shtml\">Link 7</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/ciencia/ult306u9093.shtml\">Link 8</a><br>
<a href=\"http://www1.folha.uol.com.br/folha/educacao/ult305u12840.shtml\">Link 9</a>
";

preg_match_all("/<a href=\"(.*)\">.*<\/a>[\r\n]*/i", $new_str, $match_l);

// $match_l[1] holds the first captured group of every match, i.e. the URLs
foreach($match_l[1] as $key=>$value)
	{
//Here I try to open the URL to get its content
	
	if(!($File=fopen($value,"r")))
		{
		echo "There was a problem fetching the content requested.";
		exit;
		}
	
	echo "<br><strong>Fetching file #$key: $value</strong><br>";
	$Line=""; // start with an empty buffer so text from the previous URL is not carried over
	while(!feof($File))
		{
		$Line.=fgets($File,2000055);
		}
	fclose($File);
	
	// keep only the part between the two marker comments, markers included
	$startN="<!--NOTICIA-->";
	$endN="<!--/NOTICIA-->";
	$start_positionN=strpos($Line, $startN);
	$end_positionN=strpos($Line, $endN)+strlen($endN);
	$lengthN=$end_positionN-$start_positionN;
	
	$Line=substr($Line, $start_positionN, $lengthN);
	
	echo "<br><br>$Line<BR><BR>";
	}

?>
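
Two things are easy to get wrong in a fetch-and-cut loop like this: carrying the buffer over from one URL to the next, and treating a false result from strpos() as position 0. The sketch below isolates the cutting step in a helper. The function name extract_between and the sample pages are made up for illustration, and the pages are plain strings so the example runs without network access.

```php
<?php
// Hypothetical helper (not from the thread): returns the text from $start
// through $end inclusive, or null when either marker is missing.
function extract_between($html, $start, $end)
{
    $from = strpos($html, $start);
    if ($from === false) {
        return null;
    }
    // Search for the end marker only after the start marker.
    $to = strpos($html, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($html, $from, $to + strlen($end) - $from);
}

// Stand-in pages; in a real script each could come from
// file_get_contents($url), which needs allow_url_fopen to be enabled.
$pages = array(
    'head <!--NOTICIA-->first story<!--/NOTICIA--> foot',
    'head <!--NOTICIA-->second story<!--/NOTICIA--> foot',
    'page without markers',
);

$stories = array();
foreach ($pages as $key => $page) {
    // $page is a fresh string on every pass, so nothing carries over.
    $story = extract_between($page, '<!--NOTICIA-->', '<!--/NOTICIA-->');
    if ($story === null) {
        echo "No story found in page #$key\n";
        continue; // skip this page instead of stopping the whole script
    }
    $stories[] = $story;
    echo "Page #$key: $story\n";
}
```

Using continue rather than exit is a design choice here: one unreachable or unexpected page then no longer prevents the remaining links from being processed.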

I Got it

Post by AlienSD »

Thanks for the help, but I've already solved the problem. Instead of using the foreach, I'm using a while loop, and I removed the "for" loop.

Thanks anyway