cURL & PHP
Posted: Wed Feb 22, 2006 11:36 am
I want to use cURL and PHP to get data from Wikipedia. I am able to get the initial page; however, the returned data contains links, and I want to retrieve the pages those links point to as well.
For example, when the initial page loads and the user clicks a hyperlink in the returned content, I want the script to download the content of the clicked link the same way it retrieved the initial URL, "$sourceurl".
Hopefully, this makes sense... I need major help here...
Thanks,
Code:
<?php
#if ($word == "([.*]\s[.*])") {$word = str_replace("\s","_", $word); }
$pathfromroot = substr( $_SERVER['REQUEST_URI'], 0, strpos( $_SERVER['REQUEST_URI'], "/" ) );
$default_title = 'Main_Page'; # Assumed default; shown when no title is given in the query string.
$title_wiki = isset($_GET['title']) ? $_GET['title'] : '';
# Wikipedia article names use underscores instead of spaces.
$word = preg_replace('/\s+/', '_', trim($title_wiki));
if ($word == "") { $word = $default_title; }
# rawurlencode() keeps special characters in the title from breaking the URL.
$sourceurl = 'http://en.wikipedia.org/wiki/' . rawurlencode($word);
//$request = $_SERVER['REMOTE_ADDR'];
// FIND BOOKS ON PHP AND MYSQL ON AMAZON
$ch = curl_init(); // initialize curl handle
# This URL changes from time to time. If the script stops working
//suddenly, first check whether Wikipedia.Org changed the URL syntax.
curl_setopt($ch, CURLOPT_URL, $sourceurl); // set url to fetch
curl_setopt($ch, CURLOPT_FAILONERROR, 1); // treat HTTP errors (400 and above) as failures
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // allow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return into a variable
curl_setopt($ch, CURLOPT_TIMEOUT, 10); // times out after 10s
//curl_setopt($ch, CURLOPT_POSTFIELDS, "url=fileget.php?title=$word"); // add POST fields
// CURLOPT_POST is not set: fetching an article is a plain GET request
$buffer = curl_exec($ch); // run the whole process
curl_close($ch);
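function callback( $buffer ) {
# Globals must be declared before they are used inside the function
global $title_wiki;
global $word;
global $sourceurl;
# Human-readable title: underscores back to spaces
$nicetitle = str_replace( "_", " ", stripslashes( $title_wiki ) );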
# Separate the article content
$buffer = substr( $buffer, strpos( $buffer, '<!-- start content -->' ) );
$buffer = substr( $buffer, 0, strpos( $buffer, '<div class="printfooter">' ) );
if ( $buffer <> '' ) {
$buffer = '<a name="wiki"></a><table style="width:100%" cellspacing=0 cellpadding=1
bgcolor="#EEEEEE" border=0><tr><td>
Wikipedia</td></tr></table><br>' . $buffer;
} else {
$buffer = 'Sorry! We cannot process your request at this time. Please try again later.
<br><br><br>';
}
return $buffer;
}
# Your page header comes here...
ob_start("callback");
# Everything echoed here passes through callback() before it reaches the browser
echo $buffer;
ob_end_flush();
# Your page footer comes here...
?>
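One possible direction for the link problem described above, as a rough sketch and not a tested solution: rewrite the article links in the returned HTML so that each one points back at this script with the article name in the "title" parameter. A click then runs the same cURL fetch for the new article. The function name rewrite_wiki_links() is hypothetical, the script name fileget.php is only assumed from the commented-out POSTFIELDS line, and the pattern assumes Wikipedia emits article links as href="/wiki/Article_Name".

```php
<?php
// Hypothetical helper (not part of the original script): point every
// Wikipedia article link back at this script so a click triggers a new fetch.
// Assumptions: article links look like href="/wiki/Article_Name", and the
// script is reachable as fileget.php and reads the name from ?title=...
function rewrite_wiki_links($html, $script = 'fileget.php')
{
    return preg_replace_callback(
        '~href="/wiki/([^"#]+)(#[^"]*)?"~',
        function ($m) use ($script) {
            // $m[1] is the article name, left percent-encoded as received
            // $m[2] is an optional "#section" anchor, kept unchanged
            $anchor = isset($m[2]) ? $m[2] : '';
            return 'href="' . $script . '?title=' . $m[1] . $anchor . '"';
        },
        $html
    );
}
```

It could be called inside callback() just before the return statement, for example $buffer = rewrite_wiki_links($buffer);. Links that do not start with /wiki/ (external sites, stylesheets, images) are left untouched by this pattern.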