Encoding problem using CURL to retrieve XML

Posted: Thu Aug 14, 2008 2:21 am
by joebarber
Hi all,

I am working with a web service: I send a request to a URL with a GET parameter containing an XML string. I have been using CURL to get the XML response back from this URL, decoding it into a suitable format and then parsing it for use in my website with the SimpleXML library, as per the code below:

Code:

$url = "http://www.example.com/cgi-bin/d3web_gzip.ssh?XML=$<TCOML>";
$url .= "<ExampleTags></ExampleTags></TCOML>";
 
$c = curl_init($url);
curl_setopt($c, CURLOPT_MUTE, 1);
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
$rawXML = curl_exec($c);
curl_close($c);
 
$fixedupXML = htmlspecialchars($rawXML);
$workableXML = htmlspecialchars_decode($fixedupXML);
$results = simplexml_load_string($workableXML);


This all worked fine until recently, when the provider of the web service informed me that they would have to switch the encoding of the XML they send back from UTF-8 to ISO-8859-1, as they needed their XML to support a wider character set (foreign characters etc.). Unfortunately this seems to have broken my site's ability to use the XML: after decoding the output and echoing it out, it just appears as a string of unreadable characters, as below:

Code:

?/?#k?*??-??X??9& :??bj?<?O*\?d ?S?R?Y{o?%BK????}4?|?n?????_!A?@x\???????`????Q?;????I????h?~R?ub??r?Y??w?L??H?>M?Z?W???Hk+???????e?_?Y?;??(??Q> ?ai(??Pf{??=?;g?,?????!af?]??6?=w?4?n??i.????9?|8?)oY4?m4??~is?pG???~??#??gJ????f??L3??JsA\?i> ??????Vx?x+???Q{}FiW>U~?E?}??

I thought that this would be as simple as using one of the UTF-8 encode methods built into PHP on the response, but no luck... I have tried seemingly every encoding/decoding method I could find, and although each one changes the output, it is still all unreadable characters. It looks like an encoding problem to me; perhaps I am missing something elsewhere on my page to work with this new encoding?

An example of the XML response I am getting from the web service is given below; the only thing that has changed about it is the encoding attribute, which used to be UTF-8.


Code:

<?xml version="1.0" encoding="ISO-8859-1"?>
<TCOML version="3.0" sess="1483650066B.0">
<Availability>
<Line count="1">
<ExampleLine>x</ExampleLine>
<ExampleLine>y</ExampleLine>
<ExampleLine>z</ExampleLine>
</Line>
</Availability>
</TCOML>
 

Many thanks in advance for taking the time to look at this issue and hopefully help me out with finding a solution!

Joe

Re: Encoding problem using CURL to retrieve XML

Posted: Sun Aug 17, 2008 12:01 pm
by ghurtado
I think you have to specify the character set when using http://www.php.net/htmlspecialchars
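
A minimal sketch of that suggestion, in case it helps. The sample string below is a hypothetical stand-in for the CURL response (it is not taken from the real service), and ENT_QUOTES is just one reasonable choice of flags. The third argument to htmlspecialchars() names the character set of the input:

```php
<?php
// Hypothetical stand-in for the CURL response: an ISO-8859-1 document
// containing "café" ("\xE9" is the ISO-8859-1 byte for é).
$rawXML = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"
        . "<TCOML><ExampleLine>caf\xE9</ExampleLine></TCOML>";

// Tell htmlspecialchars() which character set the input uses
// instead of relying on the default.
$fixedupXML  = htmlspecialchars($rawXML, ENT_QUOTES, 'ISO-8859-1');
$workableXML = htmlspecialchars_decode($fixedupXML, ENT_QUOTES);

// SimpleXML reads the encoding declaration and hands back UTF-8 strings.
$results = simplexml_load_string($workableXML);
$line    = (string) $results->ExampleLine;
```

Note that escaping and then immediately decoding with the same flags gives back the original string, so this only changes anything if the default character set was rejecting the ISO-8859-1 bytes in the first place.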