decode &#x

cataIin
Forum Newbie
Posts: 8
Joined: Fri Sep 13, 2013 11:06 am
Location: localhost

decode &#x

Post by cataIin »

I use file_get_contents to fetch the contents of an external address, then some strstr, substr and strpos calls to keep only the part I want from the retrieved text. I also use strip_tags and $final = preg_replace('/\s+/', ' ', $strip_tags);. All good, but I get:

Code: Select all

Bill Clinton, George W. Bush, and Tony Blair. The setting was elegant&#x2014;the.
Now I need to decode characters like

Code: Select all

 
,

Code: Select all

&#x2014;
and so on. So I tried with:

Code: Select all

$decode = htmlspecialchars_decode($final);
Same result. Where is the problem? :?
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: decode &#x

Post by requinix »

htmlspecialchars_decode() only reverses what htmlspecialchars() could do: ampersand, less-than, greater-than, and quotes. You want html_entity_decode().
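To illustrate the difference, here is a minimal sketch. The sample string in $input is made up for this example (it is not from the page being scraped); it mixes the special characters with one numeric entity.

```php
<?php
// Hypothetical sample: the special characters plus one numeric entity.
$input = '&lt;b&gt;Tom &amp; Jerry&lt;/b&gt; said &quot;hi&quot; &#x2014; bye';

// Converts only &amp;, &lt;, &gt; and the quote entities.
// The numeric entity &#x2014; is left untouched.
$special = htmlspecialchars_decode($input, ENT_QUOTES);

// Converts named and numeric entities too, as long as the
// target charset can represent the resulting characters.
$all = html_entity_decode($input, ENT_QUOTES, 'UTF-8');

var_dump($special);
var_dump($all);
```

The first dump should still contain the literal &#x2014;, while the second should contain the actual dash character.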
cataIin
Forum Newbie
Posts: 8
Joined: Fri Sep 13, 2013 11:06 am
Location: localhost

Re: decode &#x

Post by cataIin »

requinix wrote: htmlspecialchars_decode() only reverses what htmlspecialchars() could do: ampersand, less-than, greater-than, and quotes. You want html_entity_decode().
Thank you for the reply. Unfortunately, that did not solve my problem.
I use:

Code: Select all

$url = @file_get_contents('http://some.address.com');
$start = strstr($url, '<body>');
$end = substr($start, 0, strpos($start, '</body>'));
$remove_tags = strip_tags($end);
$remove_spaces = preg_replace('/\s+/', ' ', $remove_tags);
$text = html_entity_decode($remove_spaces);
var_dump($text);
and what I get (just an example):

Code: Select all

Clinton&#x2019;s presidency Clinton&#x2019;s . &#x201C;The
Where am I going wrong?
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: decode &#x

Post by requinix »

You need to specify an encoding that can support the characters you're trying to decode. Like UTF-8.

Code: Select all

$text = html_entity_decode($remove_spaces, ENT_QUOTES, "UTF-8");
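As a rough sketch of why the encoding matters: before PHP 5.4 the default charset for this function was ISO-8859-1, which is probably what was in effect here. The sample string below is shortened from the output quoted above, and both charsets are passed explicitly (an assumption made for the demo) so the two behaviours can be reproduced on any PHP version.

```php
<?php
// Shortened sample based on the output quoted in the thread.
$raw = 'Clinton&#x2019;s presidency . &#x201C;The';

// ISO-8859-1 has no slot for U+2019 or U+201C, so the
// entities cannot be converted and stay in the string.
$latin1 = html_entity_decode($raw, ENT_QUOTES, 'ISO-8859-1');

// UTF-8 can represent both characters, so they are decoded.
$utf8 = html_entity_decode($raw, ENT_QUOTES, 'UTF-8');

var_dump($latin1 === $raw);           // are the entities untouched?
var_dump(strpos($utf8, '&#x') === false); // are the entities gone?
```

If the decoded text is echoed into a page, the page itself also has to be served as UTF-8, otherwise the quotes will show up as garbage characters.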
cataIin
Forum Newbie
Posts: 8
Joined: Fri Sep 13, 2013 11:06 am
Location: localhost

Re: decode &#x

Post by cataIin »

requinix wrote:You need to specify an encoding that can support the characters you're trying to decode. Like UTF-8.

Code: Select all

$text = html_entity_decode($remove_spaces, ENT_QUOTES, "UTF-8");
Many thanks! :)