
SimpleXMLElement failure workaround

Posted: Fri Sep 16, 2011 10:39 am
by mwpclark
Hello

I have a script that uses SimpleXMLElement to access an API from an external provider. The same script is used on many websites to provide real-time content.

It works well almost all the time. However, when the external provider has an occasional problem, it prevents my page from loading properly and causes server overload, because multiple httpd processes never finish.

The code in question is

Code:

$pFile = new SimpleXMLElement('http://api.XXX.com/a/XXX-api/xml-v2/ws-'.$number.'/q-'.$terms.'?pshid=XXX&ssty=1&cflg=r', null, true);

For example, last night for about 2 hours the API access string typed into a browser produced the following error:

502 Bad Gateway
nginx

I have tried entering the following snippet into the script but it didn't work:

Code:

libxml_use_internal_errors(true);
if (!$pFile) {
    echo "Failed loading XML\n";
    foreach(libxml_get_errors() as $error) {
        echo "\t", $error->message;
    }
}

Any suggestions will be much appreciated.

Thanks
Mike

Re: SimpleXMLElement failure workaround

Posted: Fri Sep 16, 2011 1:33 pm
by Christopher
You may want to look into passing SimpleXML some flags to suppress/avoid errors:

http://www.php.net/manual/en/libxml.constants.php

If that fails, you might want to try using something like cURL to check if the URL is active first before fetching the XML.
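
As a rough sketch of the flags idea: when the SimpleXMLElement constructor cannot parse its input it throws an exception instead of returning a falsy value, so a check like `if (!$pFile)` is never reached. Wrapping the constructor in try/catch is one way to handle that. The function name `parseXmlOrNull` is hypothetical, and the sample strings only stand in for real API responses.

```php
<?php
// Keep libxml parse errors in an internal buffer instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: returns a SimpleXMLElement, or null if $xml is not well-formed.
function parseXmlOrNull($xml)
{
    try {
        // The constructor throws an Exception when the data cannot be parsed.
        return new SimpleXMLElement($xml, LIBXML_NOERROR | LIBXML_NOWARNING);
    } catch (Exception $e) {
        libxml_clear_errors(); // discard any buffered parser errors
        return null;
    }
}

// Sample inputs (assumptions): a well-formed document and a truncated error page.
$good = parseXmlOrNull('<jobs><job>PHP developer</job></jobs>');
$bad  = parseXmlOrNull('<html><body>502 Bad Gateway');

echo ($good !== null) ? "parsed\n" : "failed\n";
echo ($bad !== null) ? "parsed\n" : "failed\n";
```

Comparing against null rather than using `!$pFile` is deliberate here, since an element with no children can itself evaluate as false.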

Re: SimpleXMLElement failure workaround

Posted: Sun Sep 18, 2011 9:15 pm
by mwpclark
Thanks Christopher

Looks like cURL might work. I won't really know unless/until the API fails again, but tests are positive.

Using the no body option seems to avoid double-loading:

Code:

curl_setopt($curl, CURLOPT_NOBODY, true);

Cheers
Mike

Re: SimpleXMLElement failure workaround

Posted: Sun Sep 18, 2011 9:46 pm
by Weirdan
mwpclark wrote: Using the no body option seems to avoid double-loading:
Actually you shouldn't be requesting the API twice (because of possible side effects, but also because it may still fail the second time you request it even if it didn't fail the first time).

So instead you need to request data with curl and pass the data to simplexml_load_string(), after checking that it's actually not a fail response.
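
A minimal sketch of that flow, kept separate from the HTTP call so it can be tried without the API. Unlike the constructor, simplexml_load_string() returns false on malformed input, so the status check and the parse check can both fall through to the same fallback. The helper name `xmlFromResponse` and the sample bodies are assumptions for illustration.

```php
<?php
// Keep libxml parse errors in an internal buffer instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: $body and $statusCode would come from curl_exec() and
// curl_getinfo($curl, CURLINFO_HTTP_CODE). Returns a SimpleXMLElement or false.
function xmlFromResponse($body, $statusCode)
{
    // Treat transport failures and non-200 responses (e.g. 502) as "no data".
    if ($body === false || $statusCode != 200) {
        return false;
    }

    // simplexml_load_string() returns false when the body is not well-formed XML.
    $xml = simplexml_load_string($body);
    if ($xml === false) {
        libxml_clear_errors(); // discard buffered parser errors
    }
    return $xml;
}

// Simulated failure: an error page delivered with a 502 status.
$xml = xmlFromResponse('<html><body>502 Bad Gateway', 502);
echo ($xml === false) ? "no jobs returned, please try later\n" : "ok\n";
```

Both checks use strict comparison with false because a valid but empty element can also evaluate as false in a loose test.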

Re: SimpleXMLElement failure workaround

Posted: Thu Sep 22, 2011 10:26 am
by mwpclark
Thanks Weirdan

I understand the concept. I have submitted a request to the provider's tech guys, and am awaiting their reply. Regarding second-time failure, the real problem is systemic failure of their server which results in hundreds of my server's httpd processes not finishing.

Here is the code I am using in one test situation. As you say, it has the disadvantage of calling the API twice. It is in place on one site and seems to work. It *should* return the default negative if the API site is broken. The only real test I have run has been to alter the API url to simulate the error condition. It performs correctly when the API is working, and also when the site cannot be reached. Remains to be seen what happens if/when the source breaks again...

Code:

function remoteFileExists($url) {
    $curl = curl_init($url);

    //don't fetch the actual page, you only want to check the connection is ok
    curl_setopt($curl, CURLOPT_NOBODY, true);

    //do request
    $result = curl_exec($curl);

    $ret = false;

    //if request did not fail
    if ($result !== false) {
        //if request was ok, check response code
        $statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);

        if ($statusCode == 200) {
            $ret = true;
        }
    }

    curl_close($curl);

    return $ret;
}
$exists = remoteFileExists('http://api.XXX.com/a/XXX-api/xml-v2/ws-'.$number.'/q-'.$terms.'?pshid=XXX&ssty=1&cflg=r');

if (!$exists) {
    echo 'no jobs returned, please try later';
} else {
    
    $c = 0;
    do {
     $pFile = new SimpleXMLElement('http://api.XXX.com/a/XXX-api/xml-v2/ws-'.$number.'/q-'.$terms.'?pshid=XXX&ssty=1&cflg=r', null, true); 

Re: SimpleXMLElement failure workaround

Posted: Thu Sep 22, 2011 5:56 pm
by Weirdan
But why would you do that when you could do this?:

Code:

function getRemoteFile($url) {
    $curl = curl_init($url);

    //fetch the actual page
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

    //do request
    $result = curl_exec($curl);
    if ((false !== $result) && 200 != curl_getinfo($curl, CURLINFO_HTTP_CODE)) {
         $result = false;
    }
    curl_close($curl);
    return $result;
}
$response = getRemoteFile('http://api.XXX.com/a/XXX-api/xml-v2/ws-'.$number.'/q-'.$terms.'?pshid=XXX&ssty=1&cflg=r');

if (false === $response) {
    echo 'no jobs returned, please try later';
} else {   
     $pFile = new SimpleXMLElement($response); 
}

Re: SimpleXMLElement failure workaround

Posted: Thu Sep 22, 2011 7:01 pm
by mwpclark
Thanks for the suggestion, it works fine when the api is working.

However, when I change the url to a non-existent location to approximate a fail condition, the simplexml script does not return the default error message, and also seems to break page load and stop every include coming after it.
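
One possible explanation, which is an assumption since the thread shows no error log: the changed URL still returned a 200 response with a non-XML body, so the SimpleXMLElement constructor threw an uncaught exception, which ends the script and every include after it. Below is a hedged sketch that catches the exception and also sets cURL timeouts so that a stalled provider cannot hold an httpd process open indefinitely. The function name `fetchXmlOrNull` and the timeout values are illustrative, not taken from the thread.

```php
<?php
// Keep libxml parse errors in an internal buffer instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: returns a SimpleXMLElement, or null on any failure.
function fetchXmlOrNull($url, $connectTimeout = 3, $totalTimeout = 5)
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);            // return the body as a string
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, $connectTimeout); // seconds allowed to connect
    curl_setopt($curl, CURLOPT_TIMEOUT, $totalTimeout);          // seconds allowed for the whole request

    $body   = curl_exec($curl);
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);

    // Transport failure, timeout, or an error status such as 502.
    if ($body === false || $status != 200) {
        return null;
    }

    try {
        // Throws an Exception if the body is not well-formed XML.
        return new SimpleXMLElement($body, LIBXML_NOERROR | LIBXML_NOWARNING);
    } catch (Exception $e) {
        libxml_clear_errors();
        return null;
    }
}
```

A caller would then test the result with `if ($pFile === null)` and print the 'no jobs returned, please try later' message in that branch, so the rest of the page keeps loading.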