Percent encoding/decoding help

Martoon
Forum Newbie
Posts: 2
Joined: Tue Dec 08, 2009 5:09 pm

Percent encoding/decoding help

Post by Martoon

I'm a little confused about percent-encoding and decoding on web pages, and I'm not getting the results I expect.

For example, take this Wikipedia article. If you follow that link, both the URL shown in the browser and the title of the article read "Guiding Light (1960–1969)". However, if I run the following script:

Code:

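<?php
// %E2%80%93 is the percent-encoded UTF-8 byte sequence for the en dash
$url = "http://en.wikipedia.org/wiki/Guiding_Light_(1960%E2%80%931969)";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, false);         // body only, no response headers
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the page instead of printing it
$content = curl_exec($ch);
curl_close($ch);
if ($content === false) {
    die('Request failed');
}
// Extract the text between <title> and </title>
$loc = strpos($content, "<title>") + strlen("<title>");
$locEnd = strpos($content, "</title>", $loc);
$title = substr($content, $loc, $locEnd - $loc);
echo urldecode($url).'<br>';
echo $title.'<br>';
?>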
 
The output I get looks like this:

Code:

http://en.wikipedia.org/wiki/Guiding_Light_(1960â€“1969)
Guiding Light (1960â€“1969) - Wikipedia, the free encyclopedia

Both the urldecode() of the URL and the text grabbed directly from the title tag in the retrieved HTML give me some very different characters in place of the long hyphen (en dash) I see in the web browser when I go to the URL.

Can someone explain this to me? For example, how would I modify the script above so that it echoes the proper long hyphens?
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: Percent encoding/decoding help

Post by requinix

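The bytes you're getting are fine: the decoded URL and the page you fetched are both UTF-8. Your output just doesn't say so, so the browser is reading it with a different charset. You need to declare the encoding of your HTML as UTF-8.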
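
As a rough illustration of why the declared charset matters, the sketch below shows that urldecode() hands back the three raw UTF-8 bytes of the en dash, and that reading those same bytes as Windows-1252 turns them into three unrelated characters. It assumes the iconv extension is available, and it assumes Windows-1252 is the charset the browser falls back to, which may differ in your setup.

```php
<?php
// %E2%80%93 is the percent-encoded UTF-8 form of U+2013 (en dash).
$decoded = urldecode('%E2%80%93');

// urldecode() only turns each %XX into a raw byte; it knows nothing about charsets.
$hex = bin2hex($decoded);      // "e28093"
$length = strlen($decoded);    // 3 bytes, but a single character in UTF-8

// Simulate a viewer that wrongly treats those bytes as Windows-1252
// (assumed fallback charset): each byte becomes its own character.
$misread = iconv('Windows-1252', 'UTF-8', $decoded);

echo $hex, "\n";
echo $misread, "\n";
```

The second line printed contains three separate characters where the single dash should be, which is the kind of garbling you get when UTF-8 output is displayed without a UTF-8 charset declaration.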

Code:

header("Content-Type: text/html; charset=utf-8");

Code:

<meta http-equiv="content-type" content="text/html; charset=utf-8">

Either or both of those should work. Note that header() has to be called before any output is sent.
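
Putting that together, here is a minimal sketch of how the script could look with the charset declared up front. To keep it self-contained it uses a hard-coded HTML string as a stand-in for the curl_exec() result, and extractTitle() is a hypothetical helper introduced only for this example, not something from the original script.

```php
<?php
// Declare the charset before anything is echoed; header() must precede output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

/**
 * Hypothetical helper: return the text inside <title>...</title>,
 * or null when the markup has no complete title element.
 */
function extractTitle(string $html): ?string
{
    $start = strpos($html, '<title>');
    if ($start === false) {
        return null;
    }
    $start += strlen('<title>');
    $end = strpos($html, '</title>', $start);
    if ($end === false) {
        return null;
    }
    return substr($html, $start, $end - $start);
}

// Stand-in for the body returned by curl_exec(); "\xE2\x80\x93" is the UTF-8 en dash.
$content = "<html><head><title>Guiding Light (1960\xE2\x80\x931969) - Wikipedia, the free encyclopedia</title></head></html>";
$url = 'http://en.wikipedia.org/wiki/Guiding_Light_(1960%E2%80%931969)';

$decodedUrl = urldecode($url);
$title = extractTitle($content);

// Escape the fetched text for HTML output, treating it explicitly as UTF-8
// so the dash bytes pass through unchanged.
echo htmlspecialchars($decodedUrl, ENT_QUOTES, 'UTF-8'), '<br>';
echo htmlspecialchars((string) $title, ENT_QUOTES, 'UTF-8'), '<br>';
```

The bytes echoed are the same as in the original script. The only thing that changes is that the response now tells the browser to decode them as UTF-8.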
Martoon
Forum Newbie
Posts: 2
Joined: Tue Dec 08, 2009 5:09 pm

Re: Percent encoding/decoding help

Post by Martoon

Thank you! :D