replacing umlauts (special chars) in strings

maha_x
Forum Newbie
Posts: 2
Joined: Mon Aug 07, 2006 1:08 pm

Post by maha_x »

Hey boys and girls, I need a hand please!

I'm working on my cousin's web pages in my spare time, and have mostly been successful. The pages are almost done, except for one minor bug: the Scandinavian umlauts (ä and ö) turn out corrupted. I believe the browser is responsible, because when I look at the HTML source (in Notepad) the umlauts show up just fine. So I thought the safest way would be to replace the umlauts with their HTML codes, like &auml; and &ouml;. So I lifted some code from php.net:

Code:

$trans = array('ä' => '&auml;', 'Ä' => '&Auml;', 'ö' => '&ouml;', 'Ö' => '&Ouml;');
$ctmp = strtr($_POST['comment'], $trans);
But this doesn't appear to do anything, the letters still appear in their original form. I also tried a variation:

Code:

$trans = array("ä" => '&auml;', "Ä" => '&Auml;', "ö" => '&ouml;', "Ö" => '&Ouml;');
$ctmp = str_replace(array_keys($trans), $trans, $_POST['comment']);
Without success. Maybe the problem is obvious... Maybe not? I came over from writing C and never really tried to use Finnish with my programs before...

Oh, and just for some background: I pick up the data from a form and simply write the entries into a txt file. By looking directly at this file I can verify that the umlauts were not changed (I also checked that the browser doesn't translate HTML codes when viewing txt files).

Help much appreciated!
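
For reference, PHP's built-in htmlentities() can do this kind of replacement without a hand-written table. This is only a sketch, assuming the input is UTF-8. The sample text is written as explicit byte escapes so the result does not depend on how the script file itself is saved, and the variable names are illustrative.

```php
<?php
// "päivää" spelled out as UTF-8 bytes (ä is 0xC3 0xA4 in UTF-8).
$comment = "p\xC3\xA4iv\xC3\xA4\xC3\xA4";

// The third argument tells htmlentities() how to interpret the input bytes.
$encoded = htmlentities($comment, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n"; // prints p&auml;iv&auml;&auml;
```

Note that htmlentities() also converts characters such as <, >, & and quotes, which is usually what you want before printing form input into a page.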
MarK (CZ)
Forum Contributor
Posts: 239
Joined: Tue Apr 13, 2004 12:51 am
Location: Prague (CZ) / Vienna (A)

Post by MarK (CZ) »

I would suggest using Unicode (UTF-8). It makes working with "non-standard" languages easier.
volka
DevNet Evangelist
Posts: 8391
Joined: Tue May 07, 2002 9:48 am
Location: Berlin, ger

Post by volka »

I second that.

And it might already be the "problem". If the form data is sent as UTF-8 but the script is saved as e.g. ISO 8859-1, the 'ä' in the script will not match an ä in the form data. But then there's no need to replace the characters with Latin-1 entities anyway.
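
The mismatch described above can be reproduced with explicit bytes. A minimal sketch, assuming the form sends UTF-8 and the script was saved as ISO 8859-1. The byte escapes stand in for what the editor would have written to disk, and the variable names are made up for illustration.

```php
<?php
// ä as the browser sends it when the form is submitted as UTF-8: two bytes.
$utf8A = "\xC3\xA4";
// ä as it sits in a script file saved as ISO 8859-1: one byte.
$latin1A = "\xE4";

$formData = "p" . $utf8A . "iv" . $utf8A;

// Lookup table keyed by the Latin-1 byte: it never occurs in the UTF-8 input,
// so strtr() returns the string unchanged.
$unchanged = strtr($formData, array($latin1A => '&auml;'));

// Lookup table keyed by the UTF-8 byte pair: now the replacement happens.
$replaced = strtr($formData, array($utf8A => '&auml;'));

echo bin2hex($utf8A), " vs ", bin2hex($latin1A), "\n"; // c3a4 vs e4
echo $replaced, "\n";                                   // p&auml;iv&auml;
```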
maha_x
Forum Newbie
Posts: 2
Joined: Mon Aug 07, 2006 1:08 pm

Post by maha_x »

Now googling "html unicode" produces something useful, like this:

Code:

<meta http-equiv="content-type" content="text/html; charset=utf-8">
I knew there had to be a way to change the encoding, I just couldn't google it up... Thanks guys!
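
Since the pages are generated by PHP, the same declaration can also be sent as a real HTTP header, which browsers generally give priority over the meta tag. A rough sketch: contentTypeHeader() is a hypothetical helper name, not a PHP built-in, and the guard simply skips the header() call when run from the command line or after output has started.

```php
<?php
// Hypothetical helper: builds the header line for an HTML page in a given charset.
function contentTypeHeader(string $charset = 'utf-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// header() must be called before any output is sent to the browser.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header(contentTypeHeader());
}

// The equivalent meta tag, for pages that are also opened straight from disk.
$metaTag = '<meta http-equiv="content-type" content="text/html; charset=utf-8">';
```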
CoderGoblin
DevNet Resident
Posts: 1425
Joined: Tue Mar 16, 2004 10:03 am
Location: Aachen, Germany

Post by CoderGoblin »

UTF-8 can cause problems with forms as well ($_POST/$_GET variables). You may also need to use utf8_decode()/utf8_encode().

Another solution is to use ISO-8859-1 (ISO-8859-15 adds the € symbol).

Regards
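
A sketch of converting between the encodings mentioned above, assuming the mbstring extension is available. toCharset() is a hypothetical wrapper name. mb_convert_encoding() is used here instead of utf8_decode() because the latter only handles ISO-8859-1 and is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical wrapper: converts UTF-8 text to a target single-byte charset.
function toCharset(string $utf8Text, string $target): string
{
    return mb_convert_encoding($utf8Text, $target, 'UTF-8');
}

// ä: two bytes in UTF-8, one byte (0xE4) in ISO-8859-1.
$latin1A = toCharset("\xC3\xA4", 'ISO-8859-1');

// €: three bytes in UTF-8, one byte (0xA4) in ISO-8859-15.
// ISO-8859-1 has no code point for it at all.
$latin9Euro = toCharset("\xE2\x82\xAC", 'ISO-8859-15');

echo bin2hex($latin1A), " ", bin2hex($latin9Euro), "\n"; // e4 a4
```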