cURL and XML/XSL
I recently wrote a mini-script to fetch some XML data from a remote site. When I used file_get_contents() and the cURL functions, both returned transformed XHTML (the XML page I'm requesting has an XSL stylesheet associated with it). Right now I'm using a socket to send the request manually and get the raw XML, since I want to use it for data processing and other things. Is there a way to do this with cURL, or does it process the XSL automatically every time?
On a side note, if I view the actual source code of the page (both FF and IE), it's the XML I want. Why would cURL/file_get_contents() return HTML?
http://armory.worldofwarcraft.com/guild ... =Flash+Mob
[offtopic]
Yes, I know, it's WoW. I'm an addict!
[/offtopic]
EDIT: Check out this test page
The code for the above link is exactly:
Code:
<?php
$output = file_get_contents("http://armory.worldofwarcraft.com/guild-info.xml?r=Eredar&n=Flash+Mob");
?>
<textarea rows="20" cols="100">
<?php echo htmlspecialchars($output); ?>
</textarea>
What I'm using now to get the raw XML (this works):
Code:
$config['SERVER'] = "Eredar";
$config['GUILD'] = "Flash Mob";
// snip...
$data = ""; // initialize so the .= in the read loop does not raise a notice
$fs = fsockopen("armory.worldofwarcraft.com", 80, $errno, $errstr, 15);
if (!$fs) {
    echo "($errno) $errstr";
} else {
    // Build the query string and send a plain HTTP/1.0 request with a browser User-Agent
    $td = "r=" . urlencode($config['SERVER']) . "&n=" . urlencode($config['GUILD']);
    $outg = "GET /guild-info.xml?$td HTTP/1.0\r\n";
    $outg .= "Host: armory.worldofwarcraft.com\r\n";
    $outg .= "User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)\r\n";
    $outg .= "Connection: close\r\n\r\n";
    fwrite($fs, $outg);
    // Read the full response; $data ends up holding the HTTP headers followed by the XML body
    while (!feof($fs)) {
        $data .= fgets($fs, 128);
    }
    fclose($fs);
}
Updated the test link to include both the file_get_contents() and cURL methods. I included the header with cURL just to see the result of the request.
Exact code of the test page:
Code:
<?php
$output = file_get_contents("http://armory.worldofwarcraft.com/guild-info.xml?r=Eredar&n=Flash+Mob");
?>
file_get_contents():<br>
<textarea rows="20" cols="100">
<?php echo htmlspecialchars($output); ?>
</textarea>
<?php
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, "http://armory.worldofwarcraft.com/guild-info.xml?r=Eredar&n=Flash+Mob");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_HEADER, 1); // include the response headers in the output
$outcurl = curl_exec($curl);
curl_close($curl);
?>
<br>
cURL:<br>
<textarea rows="20" cols="100">
<?php echo htmlspecialchars($outcurl); ?>
</textarea>
You have to send a User-Agent string; otherwise the WoW web server sends the document already transformed from XML to HTML.
Code:
// The 'user_agent' option takes only the header value; PHP adds the "User-Agent: " prefix itself
$context = stream_context_create(
    array('http' => array(
        'user_agent' => 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
    ))
);
$output = file_get_contents("http://armory.worldofwarcraft.com/guild-info.xml?r=Eredar&n=Flash+Mob", false, $context);
echo $output;
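Since the original question was about cURL, the same idea should carry over by setting CURLOPT_USERAGENT. This is only a minimal sketch: fetch_raw_xml() is a made-up helper name, not something from the posts above, and it assumes the server still chooses its response based on the User-Agent header.

```php
<?php
// Hypothetical helper (not from the thread): fetch a URL with cURL while
// sending a browser-like User-Agent, and return the response body.
function fetch_raw_xml($url, $userAgent)
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($curl, CURLOPT_USERAGENT, $userAgent); // sent as the User-Agent request header
    $body = curl_exec($curl);
    if ($body === false) {
        // Capture the error message before releasing the handle
        $error = curl_error($curl);
        curl_close($curl);
        throw new RuntimeException("cURL request failed: $error");
    }
    curl_close($curl);
    return $body;
}
```

Calling fetch_raw_xml() with the guild-info.xml URL and the same "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" string used in the socket version should then give the raw XML rather than the transformed page. CURLOPT_HEADER is left off here so the result contains only the body.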