Getting strange text results back such as \u25cf

Posted: Sun Aug 21, 2011 6:28 pm
by alzika
In some text I'm getting back from Apple's API (lookup.php, for getting info about apps), I'm seeing escaped characters like \u25cf. I believe that corresponds to the bullet character. This will all be displayed on a webpage within HTML, so how do I convert these escape sequences to what they are actually supposed to be, i.e. convert that to the actual bullet character?

It's doing this for everything, even single quote characters, I believe.

Re: Getting strange text results back such as \u25cf

Posted: Sun Aug 21, 2011 7:03 pm
by Christopher
That's the way C/C++ (Objective-C in this case, probably) escapes Unicode characters. That is a bullet. You might just want to use str_replace() to convert all of these to HTML entities.

Code: Select all

$text = str_replace(array('\u25cf'), array('&#9679;'), $text);

Re: Getting strange text results back such as \u25cf

Posted: Sun Aug 21, 2011 9:13 pm
by alzika
That was just ONE example out of like 100 different examples of this. Is there a function that takes in text and automatically fixes all of these? I'm seeing these everywhere, not just bullets, but with other characters as well.

Re: Getting strange text results back such as \u25cf

Posted: Sun Aug 21, 2011 9:21 pm
by alzika
A few more examples just looking through real quick over 30 seconds: \u2019 \u201C \u201D \u2013

You can see these yourself in the description in Apple's JSON results for the lookup: http://itunes.apple.com/lookup?id=403961173

They are all within the description. I'm using json_decode to turn it into an associative array, but these odd escaped characters are still there. I can't account for ALL of the possible characters that could occur. I need to figure out how to fix this.

Re: Getting strange text results back such as \u25cf

Posted: Mon Aug 22, 2011 1:00 am
by Christopher
You could use a regular expression to find them and convert them. The conversion could be algorithmic. If you look at the codes, \u25cf becomes the HTML entity &#9679;, so if you search for '/\\\\u([0-9a-fA-F]{4})/' (the backslash has to be escaped for PCRE, otherwise \u is rejected as an unsupported escape) you will find these values. Converting 25cf from hex to decimal gives 9679, so "\uxxxx" can be translated to "&#dddd;".
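
As a rough sketch of that arithmetic for a single escape: the helper name escapeToEntity is made up for illustration, and it assumes a four-hex-digit code, so characters outside the Basic Multilingual Plane are not handled.

```php
<?php
// Hypothetical helper (not from the thread): turn the four hex digits
// of a \uXXXX escape into a decimal HTML numeric entity.
function escapeToEntity($hex)
{
    // hexdec() converts a hexadecimal string to its decimal value.
    return '&#' . hexdec($hex) . ';';
}

echo escapeToEntity('25cf'), "\n"; // &#9679;
echo escapeToEntity('2019'), "\n"; // &#8217;
```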

Re: Getting strange text results back such as \u25cf

Posted: Mon Aug 22, 2011 10:04 am
by alzika
spam on my thread, lovely.

Re: Getting strange text results back such as \u25cf

Posted: Mon Aug 22, 2011 12:10 pm
by Christopher
We try to remove spam as quickly as possible, however we are a no-advertising, all volunteer PHP board ... so reporting spam posts with the [!] button is much more helpful.

Re: Getting strange text results back such as \u25cf

Posted: Mon Aug 22, 2011 12:15 pm
by alzika
Will do, thanks. I'd rather take the regex approach to this problem, but which regex function(s) allow me to find all that match the pattern, send that to a custom function which converts the hex to decimal and returns the result and then replaces the matched regex with the returned result?

If you have a solution, let me know. I really appreciate the help.

Re: Getting strange text results back such as \u25cf

Posted: Mon Aug 22, 2011 2:19 pm
by alzika
I really don't know if my above post makes sense. I'm really just looking for the regex function I should use to help me convert all instances of the escaped characters to the HTML version (hex to decimal).

Re: Getting strange text results back such as \u25cf

Posted: Mon Aug 22, 2011 2:23 pm
by Christopher
I looked at the manual and it seems like preg_replace_callback() is what you want:

http://us.php.net/manual/en/ref.pcre.php
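
A minimal sketch of how preg_replace_callback() could be wired up for this, assuming the text really does contain literal backslash-u sequences. The function name unicodeEscapesToEntities and the sample string are invented for illustration, and the anonymous function needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper: replace every literal \uXXXX escape in $text
// with the matching decimal HTML entity (e.g. \u25cf -> &#9679;).
function unicodeEscapesToEntities($text)
{
    return preg_replace_callback(
        // Single-quoted PHP string: the four backslashes become two in the
        // pattern, which PCRE reads as one literal backslash.
        '/\\\\u([0-9a-fA-F]{4})/',
        function ($m) {
            // $m[1] holds the four hex digits captured by the group.
            return '&#' . hexdec($m[1]) . ';';
        },
        $text
    );
}

// Single quotes keep \u2019 and \u25cf as literal six-character sequences.
$raw = 'Birds\u2019 eggs \u25cf 270 levels';
echo unicodeEscapesToEntities($raw), "\n";
// Birds&#8217; eggs &#9679; 270 levels
```

This treats each escape on its own, so a character encoded as a surrogate pair (two consecutive \u escapes) would not come out as a valid entity.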

Re: Getting strange text results back such as \u25cf

Posted: Mon Aug 22, 2011 3:58 pm
by Weirdan
json_decode() decodes those encoded characters, at least in my version:

Code: Select all

$ext = new ReflectionExtension('json');
var_dump(phpversion(), $ext->getVersion(), json_decode(file_get_contents("http://itunes.apple.com/lookup?id=403961173")));

Code: Select all

string(7) "5.3.7-1"
string(5) "1.2.1"
object(stdClass)#1 (2) {
  ["resultCount"]=>
  int(1)
  ["results"]=>
  array(1) {
    [0]=>
    object(stdClass)#2 (33) {
      ["kind"]=>
      string(12) "mac-software"
      ["artistId"]=>
      int(298910979)
      ["artistName"]=>
      string(17) "Rovio Mobile Ltd."
      ["price"]=>
      float(4.99)
      ["version"]=>
      string(5) "1.6.2"
      ["description"]=>
      string(1807) "Mining and Dining with the burrowing piggies continues! With a bottomless appetite, the bad piggies have burrowed deep in underground caverns to hide the eggs they stole from you. Use the landscape and geology to your advantage to chase the pigs out of their hiding holes, gather rare gems, and retrieve the eggs!

The survival of the Angry Birds is at stake. Dish out revenge on the green pigs who stole the Birds’ eggs. Use the unique destructive powers of the Angry Birds to lay waste to the pigs’ fortified castles. Angry Birds features hours of gameplay, challenging physics-based castle demolition, and lots of replay value. Each of the 270 levels requires logic, skill, and brute force to crush the enemy.

#1 IPHONE PAID APP in US, UK, Canada, Italy, Germany, Russia, Sweden, Denmark, Finland, Singapore, Poland, France, Netherlands, Malta, Greece, Austria, Australia, Turkey, UAE, Saudi Arabia, Israel, Belgium, Norway, Hungary, Malaysia, Luxembourg, Portugal, Czech Republic, Spain, Ireland, Romania, New Zealand, Latvia, Lithuania, Estonia, Nicaragua, Kazakhstan, Argentina, Bulgaria, Slovakia, Slovenia, Mauritius, Chile, Hong Kong, Pakistan, Taiwan, Colombia, Indonesia, Thailand, India, Kenya, Macedonia, Croatia, Macau, Paraguay, Peru, Armenia, Philippines, Vietnam, Jordan, Kuwait and Malta.

#1 IPHONE PAID GAME in more countries than we can count!

AVERAGE REVIEW SCORE for version 1.4.0 = 4.78 / 5.

“Lemme tell ya, these ain’t no ordinary finches we’re talkin’ about. These here are the Angry Birds, the ones that’s gonna kick you in the ‘nads. And they’re the ones on your side. They must be from Galapadapados, or sumptin’.” – Col. Angus, Bird Expert.

Protect wildlife or play Angry Birds!

Features 270 levels, leaderboards, achievements, Facebook and Twitter"
      ["genreIds"]=>
      array(3) {
        [0]=>
        string(5) "12006"
        [1]=>
        string(5) "12201"
        [2]=>
        string(5) "12210"
      }
      ["releaseDate"]=>
      string(20) "2011-01-06T08:00:00Z"
      ["sellerName"]=>
      string(12) "Rovio Mobile"
      ["currency"]=>
      string(3) "USD"
      ["genres"]=>
      array(3) {
        [0]=>
        string(5) "Games"
        [1]=>
        string(6) "Action"
        [2]=>
        string(4) "Kids"
      }
      ["trackId"]=>
      int(403961173)
      ["trackName"]=>
      string(11) "Angry Birds"
      ["releaseNotes"]=>
      string(126) "Major update to v1.6.2:

- 2 new episodes, Ham 'Em High and Mine & Dine!
- A whopping 60 new levels!
- Technical improvements!"
      ["primaryGenreName"]=>
      string(5) "Games"
      ["primaryGenreId"]=>
      int(12006)
      ["wrapperType"]=>
      string(8) "software"
      ["artworkUrl60"]=>
      string(77) "http://a5.mzstatic.com/us/r1000/079/Purple/dd/47/4c/mzi.jbfkhysq.60x60-50.png"
      ["artworkUrl100"]=>
      string(79) "http://a1.mzstatic.com/us/r1000/079/Purple/dd/47/4c/mzi.jbfkhysq.512x512-75.png"
      ["artistViewUrl"]=>
      string(74) "http://itunes.apple.com/us/artist/rovio-mobile-ltd./id298910979?mt=12&uo=4"
      ["contentAdvisoryRating"]=>
      string(2) "4+"
      ["trackCensoredName"]=>
      string(11) "Angry Birds"
      ["trackViewUrl"]=>
      string(65) "http://itunes.apple.com/us/app/angry-birds/id403961173?mt=12&uo=4"
      ["languageCodesISO2A"]=>
      array(1) {
        [0]=>
        string(2) "EN"
      }
      ["fileSizeBytes"]=>
      string(8) "50877419"
      ["screenshotUrls"]=>
      array(5) {
        [0]=>
        string(79) "http://a2.mzstatic.com/us/r1000/080/Purple/a0/eb/f7/mzl.sjosvvup.800x500-75.jpg"
        [1]=>
        string(79) "http://a2.mzstatic.com/us/r1000/062/Purple/ce/c3/6c/mzl.rkgmxfqg.800x500-75.jpg"
        [2]=>
        string(79) "http://a4.mzstatic.com/us/r1000/091/Purple/ef/dc/8e/mzl.gwejdpui.800x500-75.jpg"
        [3]=>
        string(79) "http://a4.mzstatic.com/us/r1000/107/Purple/a9/d3/98/mzl.wfoqpshz.800x500-75.jpg"
        [4]=>
        string(79) "http://a4.mzstatic.com/us/r1000/110/Purple/c7/72/80/mzl.jdcvokag.800x500-75.jpg"
      }
      ["sellerUrl"]=>
      string(20) "http://www.rovio.com"
      ["averageUserRatingForCurrentVersion"]=>
      float(4)
      ["userRatingCountForCurrentVersion"]=>
      int(43)
      ["artworkUrl512"]=>
      string(79) "http://a1.mzstatic.com/us/r1000/079/Purple/dd/47/4c/mzi.jbfkhysq.512x512-75.png"
      ["trackContentRating"]=>
      string(2) "4+"
      ["averageUserRating"]=>
      float(4)
      ["userRatingCount"]=>
      int(2937)
    }
  }
}
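
The same behaviour can be checked without hitting the network. This is only a sketch with a made-up one-field JSON string, not the real lookup response:

```php
<?php
// A JSON document containing the six-character escape \u25cf.
// Single quotes keep the backslash literal on the PHP side.
$json = '{"description":"\u25cf 270 levels"}';

$data = json_decode($json, true);

// json_decode() returns strings as UTF-8, so the escape is now the
// three-byte UTF-8 encoding of U+25CF (E2 97 8F).
echo bin2hex(substr($data['description'], 0, 3)), "\n"; // e2978f

// No literal backslash-u sequence is left in the decoded value.
var_dump(strpos($data['description'], '\u')); // bool(false)
```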

Re: Getting strange text results back such as \u25cf

Posted: Mon Aug 22, 2011 4:38 pm
by AbraCadaver
Mine too on the command line. Most likely you need to set your page to UTF-8.
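
For a web page, one way to do that is roughly the following. It is a sketch that assumes the decoded text is already UTF-8 (as json_decode() returns it) and uses a stand-in string rather than the real description:

```php
<?php
// Tell the browser the page is UTF-8 before any output is sent.
header('Content-Type: text/html; charset=utf-8');

// Stand-in for a decoded description: the UTF-8 bytes of U+25CF (bullet).
$description = "\xE2\x97\x8F 270 levels & more";

// Escape HTML special characters while leaving the UTF-8 text intact.
echo htmlspecialchars($description, ENT_QUOTES, 'UTF-8');
```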

Re: Getting strange text results back such as \u25cf

Posted: Mon Aug 22, 2011 5:05 pm
by alzika
Really odd. I'm running mine from the command line too, since that's how this program will work, and json_decode is keeping those characters.

I'm using file_get_contents just as in the example code above. How do I set the page to UTF-8, and why isn't it already defaulting to that?

Re: Getting strange text results back such as \u25cf

Posted: Mon Aug 22, 2011 6:09 pm
by Weirdan
I'd suggest you post your code (or a simplified version of it that exhibits the issue) and details on how you're testing it.

Re: Getting strange text results back such as \u25cf

Posted: Tue Aug 23, 2011 5:05 am
by alzika
I'm basically doing this and I see those characters:

Code: Select all


$data = json_decode(file_get_contents("http://itunes.apple.com/lookup?id=403961173"), TRUE);

print_r($data);