cURL and XML/XSL

TheMoose
Forum Contributor
Posts: 351
Joined: Tue May 23, 2006 10:42 am

Post by TheMoose »

I recently wrote a mini-script to fetch some XML data from a remote site. With both file_get_contents() and the cURL functions, I got back transformed XHTML (the XML page I'm requesting has an XSL stylesheet associated with it). Right now I'm using a socket to send the request manually and get the raw XML, since I want to use it for some data processing. Is there a way to do this with cURL, or does it process the XSL automatically every time?

On a side note, if I view the actual source of the page (in both Firefox and IE), it's the XML I want. Why would cURL/file_get_contents() return HTML?
miro_igov
Forum Contributor
Posts: 485
Joined: Fri Mar 31, 2006 5:06 am
Location: Bulgaria

Post by miro_igov »

This is not possible. Show the URL which generates the XML.
TheMoose
Forum Contributor
Posts: 351
Joined: Tue May 23, 2006 10:42 am

Post by TheMoose »

http://armory.worldofwarcraft.com/guild ... =Flash+Mob

EDIT: Check out this test page

The code for the above link is exactly:

Code: Select all

<?php
$output = file_get_contents("http://armory.worldofwarcraft.com/guild-info.xml?r=Eredar&n=Flash+Mob");
?>
<textarea rows=20 cols=100>
<?php echo htmlspecialchars($output); ?>
</textarea>
[offtopic]
Yes, I know, it's WoW :P. I'm an addict!
[/offtopic]
miro_igov
Forum Contributor
Posts: 485
Joined: Fri Mar 31, 2006 5:06 am
Location: Bulgaria

Post by miro_igov »

And can you please post your code? I think it has something to do with the GET parameters and the URL encoding.
TheMoose
Forum Contributor
Posts: 351
Joined: Tue May 23, 2006 10:42 am

Post by TheMoose »

What I'm using now to get the raw XML (this works):

Code: Select all

$config['SERVER'] = "Eredar";
$config['GUILD'] = "Flash Mob";
// snip...
$fs = fsockopen("armory.worldofwarcraft.com", 80, $errno, $errstr, 15);
if (!$fs) {
    echo "($errno) $errstr";
} else {
    // Build the query string; urlencode() turns "Flash Mob" into "Flash+Mob".
    $td = "r=" . urlencode($config['SERVER']) . "&n=" . urlencode($config['GUILD']);
    $outg = "GET /guild-info.xml?$td HTTP/1.0\r\n";
    $outg .= "Host: armory.worldofwarcraft.com\r\n";
    $outg .= "User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)\r\n";
    $outg .= "Connection: close\r\n\r\n";
    fwrite($fs, $outg);
    // Read the whole response (headers and body) into $data.
    $data = '';
    while (!feof($fs)) {
        $data .= fgets($fs, 128);
    }
    fclose($fs);
}
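Note that a raw socket read like the one above returns the HTTP headers as well as the XML. A minimal sketch of separating the two, assuming the response uses the standard blank line (`\r\n\r\n`) between headers and body; the helper name `split_http_response` and the sample response are made up for illustration and are not the real armory output:

```php
<?php
// Hypothetical helper (not from the thread): split a raw HTTP response
// into its header block and its body at the first blank line.
function split_http_response(string $raw): array
{
    // Limit of 2 so that blank lines inside the body are left untouched.
    $parts = explode("\r\n\r\n", $raw, 2);
    return [
        'headers' => $parts[0],
        'body'    => $parts[1] ?? '', // empty body if no separator was found
    ];
}

// Illustrative sample only; in the script above you would pass $data instead.
$sample = "HTTP/1.0 200 OK\r\nContent-Type: text/xml\r\n\r\n<page/>";
$response = split_http_response($sample);
echo $response['body'];
```

The body can then be handed to an XML parser without the status line and headers getting in the way.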
miro_igov
Forum Contributor
Posts: 485
Joined: Fri Mar 31, 2006 5:06 am
Location: Bulgaria

Post by miro_igov »

How about your cURL script, the one that does not return the XML?
TheMoose
Forum Contributor
Posts: 351
Joined: Tue May 23, 2006 10:42 am

Post by TheMoose »

Updated the test link to include both file_get_contents() and cURL methods. I included the header with cURL just to see the result from the request.

Exact code of the test page:

Code: Select all

<?php
$output = file_get_contents("http://armory.worldofwarcraft.com/guild-info.xml?r=Eredar&n=Flash+Mob");
?>
file_get_contents():<br>
<textarea rows=20 cols=100>
<?php echo htmlspecialchars($output); ?>
</textarea>
<?php
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, "http://armory.worldofwarcraft.com/guild-info.xml?r=Eredar&n=Flash+Mob");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_HEADER, 1);
$outcurl = curl_exec($curl);
curl_close($curl);
?>
<br>
cURL:<br>
<textarea rows=20 cols=100>
<?php echo htmlspecialchars($outcurl); ?>
</textarea>
TheMoose
Forum Contributor
Posts: 351
Joined: Tue May 23, 2006 10:42 am

Post by TheMoose »

Any ideas? It's not a pressing issue as I have it working with sockets, just would like to know for future reference.

Thanks for taking a look though, Miro.
miro_igov
Forum Contributor
Posts: 485
Joined: Fri Mar 31, 2006 5:06 am
Location: Bulgaria

Post by miro_igov »

To me it looks fine; it should return what you see in "View Source".
volka
DevNet Evangelist
Posts: 8391
Joined: Tue May 07, 2002 9:48 am
Location: Berlin, ger

Post by volka »

You have to send a User-Agent string; otherwise the WoW web server sends the transformed (XML to HTML) document.

Code: Select all

$context = stream_context_create(
	array('http'=>array(
			// The 'user_agent' option takes only the value; PHP adds the "User-Agent:" header name itself.
			'user_agent'=>'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
	))
);
$output = file_get_contents("http://armory.worldofwarcraft.com/guild-info.xml?r=Eredar&n=Flash+Mob", false, $context);
echo $output;
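Since the original question was about cURL, here is a minimal sketch of the same idea using `CURLOPT_USERAGENT`. It assumes the cURL extension is loaded and that a browser-like User-Agent is all the server checks for (as the stream-context example suggests). The helper names `armory_curl_options` and `fetch_raw_xml` are made up for illustration:

```php
<?php
// Hypothetical helper: the cURL options for a request that identifies
// itself with a browser-like User-Agent string.
function armory_curl_options(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_USERAGENT      => $userAgent, // sent as the User-Agent request header
    ];
}

// Hypothetical helper: performs the request; returns the body, or false on failure.
function fetch_raw_xml(string $url, string $userAgent)
{
    $curl = curl_init();
    curl_setopt_array($curl, armory_curl_options($url, $userAgent));
    // The handle is released when $curl goes out of scope at the end of the function.
    return curl_exec($curl);
}

// Usage (not executed here, since it needs network access):
// $xml = fetch_raw_xml(
//     "http://armory.worldofwarcraft.com/guild-info.xml?r=Eredar&n=Flash+Mob",
//     "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
// );
```

If the result still comes back as HTML, the server may be looking at something other than the User-Agent, so treat this as a starting point rather than a guaranteed fix.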