Weird special character behavior

Posted: Wed Feb 24, 2010 7:23 am
by nbarten
Hi all,

When I create a string in PHP with a special character in it, like this:

Code: Select all

 
$myText = "Hi my name is Nicö";
 
and I use the 'echo' command to show it in the browser, the output looks like this:

Hi my name is Nicš

This is weird! How do I display it properly?

Re: Weird special character behavior

Posted: Wed Feb 24, 2010 7:38 am
by aravona
Try using the HTML special character entities: http://www.utexas.edu/learn/html/spchar.html. Your ö would be &ouml; :)

Re: Weird special character behavior

Posted: Wed Feb 24, 2010 8:11 am
by nbarten
thanks!

The problem, however, is that the PHP code communicates with an iPhone application I'm developing, and the iPhone function does the encoding in a different way... hmmm, how should I say it.

For example, the iPhone app replaces spaces ( ' ' ) with '%20'.

Re: Weird special character behavior

Posted: Wed Feb 24, 2010 8:22 am
by aravona
Well, that's the same as URL encoding: if you have a page called 'my page.php', it will show up in your URL as 'my%20page.php'.

Do you know the actual name of the encoding? If it's URL / percent-encoding, then try http://www.w3schools.com/TAGS/ref_urlencode.asp

It's a long list though :)
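
The %20 behaviour described above can be reproduced in PHP. This is only a small sketch using the standard rawurlencode() and urlencode() functions; the variable names are made up for illustration. Note that the two functions treat spaces differently, which matters when comparing PHP output with another client.

```php
<?php
// Percent-encode a file name the way it would appear in a URL.
$name = 'my page.php';

$raw  = rawurlencode($name); // RFC 3986 style: space becomes %20
$form = urlencode($name);    // form style: space becomes +

echo $raw, "\n";  // my%20page.php
echo $form, "\n"; // my+page.php

// Each result decodes back to the original with its matching decoder.
var_dump(rawurldecode($raw) === $name, urldecode($form) === $name);
```

If the other side sends '%20' for spaces, rawurlencode() is probably the closer match.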

Re: Weird special character behavior

Posted: Wed Feb 24, 2010 9:35 am
by nbarten
Hmmm.

But isn't it then the case that:

iPhone: Hi my name is Nicö
PHP: Hi my name is Nic&ouml;

Isn't it the case that the iPhone app replaces the ö directly with a code (%....), but that PHP would replace every character of &ouml; with codes? (So the iPhone replaces one character, and PHP replaces 6 characters (&ouml;).)

Sorry, I can't try it out yet, this is just my logic. What a mess between two programming languages. :drunk:
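
The difference between the two encodings can be checked with a short sketch like the one below. It assumes the ö is stored as UTF-8 (written as byte escapes so the editor's encoding does not matter); the variable names are illustrative. An HTML entity is meant for display in a page, while percent-encoding is meant for URLs, so they are not interchangeable.

```php
<?php
// "ö" as UTF-8 bytes, written with escapes so the encoding of this
// source file itself does not influence the result.
$o = "\xC3\xB6";

$entity  = htmlentities($o, ENT_QUOTES, 'UTF-8'); // for HTML output
$percent = rawurlencode($o);                      // for URLs

echo $entity, ' (', strlen($entity), " characters)\n"; // &ouml; (6 characters)
echo $percent, "\n";                                   // %C3%B6

// Percent-encoding the entity encodes its six ASCII characters,
// not the original letter.
echo rawurlencode($entity), "\n"; // %26ouml%3B
```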

Re: Weird special character behavior

Posted: Wed Feb 24, 2010 10:33 am
by Apollo
You mention you have this in your PHP file:
nbarten wrote: $myText = "Hi my name is Nicö";
But this is just how it looks in your editor, and doesn't say anything about what byte(s) are actually written in your .php file to represent that ö character.
PHP files do not have any particular encoding, so it depends on whatever encoding your editor happens to use (and the latter will vary from one editor to another, and may depend on whether you're working on an English, Russian, or Chinese Windows or Mac system, etc).

Furthermore, by echoing that string without explicitly specifying a content type (a charset in the Content-Type header or meta tag) in your HTML output, you are echoing meaningless data. It's like saying "here is a binary dump of my image file" and then a random sequence of bytes, without specifying if it's a GIF, JPG, TIF, BMP, RAW, PSD, or whatever kind of file.
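
A minimal sketch of both points: the same letter is a different byte sequence in each encoding, and the output should declare which one is used. The Mac Roman explanation for the 'š' in the first post is an assumption that fits the symptom, not something confirmed in this thread.

```php
<?php
// Declare the charset of the output before sending any of it.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Byte sequences for "ö" in three encodings, written as escapes so the
// result does not depend on how this file is saved.
$oUmlaut = [
    'UTF-8'        => "\xC3\xB6",
    'Windows-1252' => "\xF6",
    'Mac Roman'    => "\x9A",
];

foreach ($oUmlaut as $name => $bytes) {
    echo $name, ': ', strtoupper(bin2hex($bytes)), "\n";
}

// Assumption: the file was saved as Mac Roman and the browser read it
// as Windows-1252. Byte 0x9A is then shown as "š".
$misread = iconv('windows-1252', 'utf-8', "\x9A");
echo strtoupper(bin2hex($misread)), "\n"; // C5A1, the UTF-8 bytes of "š"
```

Checking bin2hex() on the actual string is a quick way to see which bytes the editor really wrote.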

Re: Weird special character behavior

Posted: Fri Feb 26, 2010 3:56 am
by nbarten
Now the problem is that my PHP code generates a different URL encoding than my iPhone's:

For example, with the character 'é':

PHP: %8E
iPhone: %C3%A9

Quote from http://blogs.sun.com/shankar/entry/how_to_handle_utf_8:
Example: Western browsers send 'é' as '%E9' by default (URL encoding). But when the page is in UTF-8, the browser will first lookup the Unicode multi byte encoding of 'é'. In this case, it are 2 bytes because 'é' lies in UTF code point range 128-256. Those two bytes correspond to Ã and ©, and will result in '%C3%A9' (URL encoding) in the eventual query string. <form method="post" enctype="application/x-www-form-urlencoded"> is the same as <form method="POST"> and uses the same general principle as GET.
But how can I get the PHP URL encoding to produce '%C3%A9' as well, instead of '%8E'?
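
The two bytes mentioned in the quote can be worked out by hand. The sketch below uses a hypothetical helper, utf8TwoByte(), written only for illustration; it covers the full two-byte range of UTF-8, which is U+0080 to U+07FF.

```php
<?php
// Encode a code point in the range U+0080..U+07FF as two UTF-8 bytes.
function utf8TwoByte(int $cp): string
{
    if ($cp < 0x80 || $cp > 0x7FF) {
        throw new InvalidArgumentException('code point outside two-byte range');
    }
    $lead = 0xC0 | ($cp >> 6);   // 110xxxxx: the top five bits
    $cont = 0x80 | ($cp & 0x3F); // 10xxxxxx: the low six bits
    return chr($lead) . chr($cont);
}

$bytes = utf8TwoByte(0xE9);   // U+00E9 is é
echo urlencode($bytes), "\n"; // %C3%A9
```

Read as Windows-1252, those same two bytes display as 'Ã' and '©', which is where the familiar 'Ã©' garbage comes from.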

Re: Weird special character behavior

Posted: Fri Feb 26, 2010 4:17 am
by nbarten
Correction: on the iPhone, my encoding turns the character 'é' into %E9, which is good.

Now my PHP encoding gives: %C2%8E

Does anyone know how to get it to %E9?

Re: Weird special character behavior

Posted: Sat Feb 27, 2010 5:57 am
by Apollo
nbarten wrote:Now my PHP encoding gives: %C2%8E
You sure? How do you get this result? Because %C2%8E is the UTF-8 encoding of U+008E, a control character (byte 0x8E is 'Ž' in Windows-1252), not 'é'.

But you can use iconv to convert either way:

Code: Select all

$url = '%E9';
 
$url2 = urlencode(iconv('windows-1252','utf-8',urldecode($url))); // $url2 is now '%C3%A9'
 
$url3 = urlencode(iconv('utf-8','windows-1252',urldecode($url2))); // $url3 is now '%E9'
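
For reference, here is a sketch of how each percent-code seen in this thread can arise for 'é'. The byte values are standard, but the idea that the %8E and %C2%8E results came from a Mac Roman source file is an assumption, not something confirmed here.

```php
<?php
// Possible byte sequences behind each percent-code for "é".
$cases = [
    'Mac Roman byte'             => "\x8E",
    'Windows-1252 byte'          => "\xE9",
    'UTF-8 bytes'                => "\xC3\xA9",
    // 0x8E wrongly treated as ISO-8859-1 and then converted to UTF-8:
    '0x8E misread as ISO-8859-1' => iconv('ISO-8859-1', 'UTF-8', "\x8E"),
];

$encoded = [];
foreach ($cases as $label => $bytes) {
    $encoded[$label] = urlencode($bytes);
    echo str_pad($label, 28), $encoded[$label], "\n";
}
```

So a conversion only gives the expected result when the 'from' encoding passed to iconv matches the bytes that are really in the string.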

Re: Weird special character behavior

Posted: Mon Mar 01, 2010 7:51 am
by nbarten
I put the 'é' character in a file called 'test.txt', then opened and read the file with PHP and encoded it. This time it worked :D :mrgreen:

I guess it failed before because the character was typed directly in the code... because of the editor used, or Mac OS X, I don't know...
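
The file-based approach can be sketched as follows. This is only an illustration: it uses a temporary file instead of 'test.txt', and it assumes the file is saved as UTF-8, which the thread does not state.

```php
<?php
// Write "é" as known UTF-8 bytes to a temporary file, then read it back.
$path = tempnam(sys_get_temp_dir(), 'enc');
file_put_contents($path, "\xC3\xA9");

$text = file_get_contents($path);
unlink($path);

// A pattern with the /u modifier only matches valid UTF-8 input.
$isUtf8 = preg_match('//u', $text) === 1;
echo $isUtf8 ? "valid UTF-8\n" : "not valid UTF-8\n";

$asUtf8   = urlencode($text);                                  // %C3%A9
$asCp1252 = urlencode(iconv('utf-8', 'windows-1252', $text));  // %E9

echo $asUtf8, "\n", $asCp1252, "\n";
```

Checking the bytes first makes it clear which encoding the data is in before converting it for the other side.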