parsing content from html

SidewinderX
Forum Contributor
Posts: 407
Joined: Fri Jul 16, 2004 9:04 pm
Location: NY

parsing content from html

Post by SidewinderX »

I am trying to parse the contents of a URL. I first use cURL to fetch the URL and store the value returned by curl_exec in a variable ($content). Then, using sscanf, I try to parse a number out of a section of the HTML. My current code is below:

Code: Select all

<?php
$ch = curl_init();
 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_URL, $url); //$url is defined elsewhere
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
 
$content = curl_exec($ch);
curl_close($ch);
 
sscanf($content, '<span id="Stats_lbl16">%s</span>', $xp);
 
echo $xp;
 
?>
The current code displays nothing. I know the sscanf statement is correct, since the following code works:

Code: Select all

$content = '<span id="Stats_lbl16">3244893</span>'; //This line is in the source code of the website I am trying to parse.
sscanf($content, '<span id="Stats_lbl16">%s</span>', $xp);
echo $xp;
Moreover, I am pretty confident the cURL part works fine too, since the page is displayed when I simply echo $content.

I believe the problem is with $content, something along the lines of > being converted to &gt;. I'm not exactly sure. Does anyone have any insight on this matter?

Thank you
youscript
Forum Newbie
Posts: 10
Joined: Thu Sep 13, 2007 3:22 am

Re: parsing content from html

Post by youscript »

You can try adding

Code: Select all

$content = htmlspecialchars_decode($content);
after

Code: Select all

$content = curl_exec($ch);
curl_close($ch);
SidewinderX
Forum Contributor
Posts: 407
Joined: Fri Jul 16, 2004 9:04 pm
Location: NY

Re: parsing content from html

Post by SidewinderX »

Nope, that doesn't appear to work.
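
A likely reason decoding makes no difference: sscanf() starts matching at the very beginning of the string, so a format that opens with a literal <span ...> only lines up when the input itself begins with that tag. A full page begins with something else, so the scan stops before reaching the number. The sketch below illustrates this under some assumptions: scanXp() is a hypothetical helper, the sample page is a made-up stand-in for the real curl_exec() output, and %d is used in place of %s because %s reads up to the next whitespace and would pull in the closing tag as well.

```php
<?php
// Hypothetical helper: returns the number if the format matches from the
// first character of $html, or null otherwise.
function scanXp(string $html): ?int {
    $xp = null;
    // sscanf() compares the format against $html starting at offset 0;
    // it does not search for the pattern further into the string.
    $assigned = sscanf($html, '<span id="Stats_lbl16">%d</span>', $xp);
    return $assigned === 1 ? $xp : null;
}

// The bare span from the earlier test: the format lines up with the input.
$fragment = '<span id="Stats_lbl16">3244893</span>';

// Made-up stand-in for a fetched page: the span sits after other markup.
$page = "<html><body>\n" . $fragment . "\n</body></html>";

var_dump(scanXp($fragment));
var_dump(scanXp($page));
```

With the fragment alone the helper should return the number. With the surrounding markup it should return null, which would match the empty output seen with the real page.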
SidewinderX
Forum Contributor
Posts: 407
Joined: Fri Jul 16, 2004 9:04 pm
Location: NY

Re: parsing content from html

Post by SidewinderX »

Solution:

Code: Select all

preg_match('/<span id="Stats_lbl16">\d+<\/span>/', $content, $xp);
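
One note on the pattern above: it has no capture group, so $xp[0] holds the whole <span>...</span> match rather than just the number. A minimal sketch, assuming only the digits are wanted; extractXp() and the sample $content are hypothetical stand-ins, not part of the original code.

```php
<?php
// Hypothetical helper: same pattern as above, with (\d+) added as a capture group.
function extractXp(string $html): ?string {
    if (preg_match('/<span id="Stats_lbl16">(\d+)<\/span>/', $html, $m) === 1) {
        // $m[0] is the full matched span, $m[1] is just the captured digits.
        return $m[1];
    }
    return null; // span not present in the input
}

// Made-up stand-in for the string returned by curl_exec().
$content = "<html><body>\n<span id=\"Stats_lbl16\">3244893</span>\n</body></html>";

$xp = extractXp($content);
echo $xp ?? 'not found', "\n";
```

Unlike sscanf(), preg_match() searches the whole string for the pattern, so the span can appear anywhere in the page. The match still depends on the exact attribute text and spacing inside the tag.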