Foreign Language Characters

Posted: Thu Jun 26, 2003 11:48 pm
by ILoveJackDaniels
I need to pass various variables in a URL, and something seems to be happening to them along the way. For example, passing "Français" in the URL, then simply echoing it straight away gives "Fran%C3%A7ais". I tried urldecode, and that printed out "FranÃ§ais".

Does anyone know why these characters are converted in this way, and does anyone know a method for returning them to their proper value once passed through a URL?

Posted: Fri Jun 27, 2003 12:52 am
by qartis
Have you tried urlencode()ing the string before putting it into the url, and then urldecode()ing it back again?
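Something like the following is a minimal sketch of that round trip. The page name `page.php` and the parameter name `lang` are made up for illustration, and the string is written with explicit byte escapes so it is UTF-8 regardless of how the script file itself is saved. Note that PHP normally decodes `$_GET` values for you, so the manual urldecode() only matters if you are working with the raw query string.

```php
<?php
// "Français" spelled out as UTF-8 bytes (ç = 0xC3 0xA7).
$value = "Fran\xC3\xA7ais";

// urlencode() percent-encodes every byte that is not safe in a URL.
$encoded = urlencode($value);

// Hypothetical page and parameter name, for illustration only.
$url = "page.php?lang=" . $encoded;

// urldecode() turns the %XX sequences back into the original bytes.
$decoded = urldecode($encoded);

echo $url, "\n";      // page.php?lang=Fran%C3%A7ais
echo $decoded, "\n";  // the original bytes; shows as "Français" only if the output is read as UTF-8
```

If the decoded text still looks wrong in the browser, the bytes are most likely fine and the page is just being displayed in a different character set.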

Posted: Fri Jun 27, 2003 4:33 am
by ILoveJackDaniels
I have now :)

Thank you. Still don't understand what happened to the characters in the first place, but lots of urlencoding and decoding seems to have done the trick.

Posted: Sat Jun 28, 2003 3:51 am
by qartis
It has to do with the UTF-8 encoding of the character. UTF-8 (a Unicode Transformation Format) is a text encoding in which every character of every writing system on the planet can be represented, but as such, anything outside plain ASCII takes more than one byte to store (up to 4 bytes per character). A "ç" is stored as two bytes, C3 and A7, and since neither is allowed raw in a URL, they get percent-encoded as "%C3%A7". Unicode isn't a de facto standard yet, because a lot of applications still assume a single-byte encoding (such as ISO-8859-1 or Windows-1252) in which every byte is one character, so the two bytes of a "ç" show up as "Ã§" when you're editing French Unix text files in Windows. It's a flaw inherent when any two standards meet - which one is better, and as such, which one prevails?
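To see the effect for yourself, here is a rough sketch that looks at the bytes behind "ç" and simulates a program misreading them as ISO-8859-1. It assumes the mbstring extension is available for mb_convert_encoding(); the variable names are just for illustration.

```php
<?php
// "ç" as UTF-8: two bytes, 0xC3 followed by 0xA7.
$c = "\xC3\xA7";

$length  = strlen($c);        // strlen() counts bytes, not characters
$hex     = bin2hex($c);       // hex dump of the raw bytes
$percent = rawurlencode($c);  // the form the bytes take inside a URL

// Simulate a program that assumes ISO-8859-1: each byte is treated as
// its own character (Ã and §) and re-encoded as UTF-8 for display.
$misread = mb_convert_encoding($c, "UTF-8", "ISO-8859-1");

// Reversing the conversion recovers the original two bytes.
$repaired = mb_convert_encoding($misread, "ISO-8859-1", "UTF-8");

echo $length, "\n";   // 2
echo $hex, "\n";      // c3a7
echo $percent, "\n";  // %C3%A7
echo $misread, "\n";  // shows as "Ã§" on a UTF-8 terminal
```

So nothing is actually lost along the way: the same two bytes are there the whole time, and only the interpretation changes.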