Metatag extraction: fopen problems
Posted: Thu Nov 27, 2008 3:13 am
Hi
I'm trying to extract metatags from websites and build a database of these details.
I am new to PHP (I usually work in ASP), so I am finding things a bit confusing and am looking for help.
When the script extracts the metatags, it posts them to a database, then moves to the next URL in the list, extracts the metatags for that one, and so on. When the script cannot extract the metatags for whatever reason, it fails and stops. I want it to realise that it has failed, skip on to the next URL, and keep going rather than stop processing completely.
<?php
$url = $_GET["wurl"];
$lid = $_GET["lid"];
$klid = $lid+1;
$fp = fopen( $url, 'r' );
if ($fp)
{
print "Success $lid";
}
else
{
header("Location: http://www.rsslinkexchange.com/metatags1.asp?lid=$klid");
exit;
}
$content = "";
while( !feof( $fp ) ) {
$buffer = trim( fgets( $fp, 4096 ) );
$content .= $buffer;
}
$start = '<title>';
$end = '<\/title>';
preg_match( "/$start(.*)$end/s", $content, $match );
$title = $match[ 1 ];
$metatagarray = get_meta_tags( $url );
$keywords = $metatagarray[ "keywords" ];
$description = $metatagarray[ "description" ];
?>
<html>
<Head>
</head>
<body onLoad="document.links.submit()">
<form name="links" method="post" action="http://www.rsslinkexchange.com/metatags.asp">
<input type="hidden" name="lid" value="<?=$lid?>"/>
<input type="hidden" name="url" value="<?=$url?>"/>
<input type="hidden" name="title" value="<?=$title?>"/>
<input type="hidden" name="description" value="<?=$description?>"/>
</form>
</body>
</html>
When I try to get it to skip on to the next record, I get these error messages:
Warning: fopen(http://www.homes2uonline.com/) [function.fopen]: failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in D:\hshome\*****\rsslinkexchange.com\grabber3.php on line 6
Warning: Cannot modify header information - headers already sent by (output started at D:\hshome\*****\rsslinkexchange.com\grabber3.php:6) in D:\hshome\*****\rsslinkexchange.com\grabber3.php on line 14
Does anyone have suggestions for code I can use to redirect from this PHP script to a different web page, so that I can pick up the next URL on my list and carry on?
Thanks
Tony
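For reference, the second warning follows from the first: the text of the fopen() warning is itself output, and header() only works before any output has been sent. The following is a minimal sketch of one way to fail quietly and skip ahead, not a definitive fix. The helper names fetch_page and next_record_url are hypothetical; the redirect address is the one used in the script above.

```php
<?php
// Hypothetical helper: returns the page body, or null if the URL cannot be read.
function fetch_page(string $url): ?string
{
    // The @ operator suppresses the warning fopen() would otherwise print,
    // so nothing reaches the browser before header() is called.
    $fp = @fopen($url, 'r');
    if ($fp === false) {
        return null;
    }
    $content = stream_get_contents($fp);
    fclose($fp);
    return $content === false ? null : $content;
}

// Hypothetical helper: builds the address that handles the next record.
function next_record_url(int $lid): string
{
    return 'http://www.rsslinkexchange.com/metatags1.asp?lid=' . ($lid + 1);
}

// Possible use at the top of the script, before any output:
//   $content = fetch_page($_GET['wurl']);
//   if ($content === null) {
//       header('Location: ' . next_record_url((int) $_GET['lid']));
//       exit;
//   }
```

Note that get_meta_tags() fetches the URL a second time and can raise its own warning, so it would need to be called only after the fetch has succeeded.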