UTF + French characters

grudz
Forum Commoner
Posts: 68
Joined: Thu Dec 04, 2003 12:52 pm

Post by grudz »

I am in a rut and I can't get out of it. I've spent all weekend (and most of my Football Sunday) trying to figure out this problem. I have done a lot of research, but still nothing.

If you check out this link: http://www.mtl-baseline.com/fr/accueil.php

you can see that the French characters are not being displayed properly. I have this in my <head>:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

The reason I need to use utf-8 and not iso-8859-1 (which would display the characters) is that iso-8859-1 shows weird characters before the song title of the TOP 40 radio on my site (which is an include from a radio streaming site). So it's either utf-8 with no French characters, or iso-8859-1 with the weird characters:

utf-8 --> http://www.mtl-baseline.com/fr/accueil.php
iso-8859-1 --> http://www.mtl-baseline.com/fr/accueil2.php

So please, does anybody have any ideas as to how to fix this?

Thank you

P.S. I won't get back in front of my computer until late tonight, so I won't be able to reply right away.

Thank you to those who will read this
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

In 'Français', ç is output as a raw character. It is outside the range of "normal" ASCII, so a UTF-8 page does not read it as a character on its own but as part of a multi-byte sequence. The easy way around it is to convert ç to the correct HTML entity: &ccedil;

Using htmlentities() on the content of these "weird character" boxes should convert them to usable forms.
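
A minimal sketch of the explanation above, assuming the page text is stored as ISO-8859-1 (the thread does not confirm the storage encoding). It shows that the single ISO-8859-1 byte for ç is not valid UTF-8 on its own, and that htmlentities() turns it into a plain-ASCII entity that displays the same under either charset. The variable names are illustrative only.

```php
<?php
// Assumption: the text is ISO-8859-1, where "ç" is the single byte 0xE7.
$latin1 = "Fran\xE7ais";

// preg_match() with the /u modifier fails on invalid UTF-8 input,
// which makes it a simple validity check for the raw bytes.
$isValidUtf8 = preg_match('//u', $latin1) === 1;

// htmlentities() replaces the byte with a pure-ASCII entity.
// The third argument names the encoding of the INPUT string.
$safe = htmlentities($latin1, ENT_QUOTES, 'ISO-8859-1');

var_dump($isValidUtf8); // bool(false)
echo $safe, "\n";       // Fran&ccedil;ais
```

Passing the encoding explicitly avoids depending on the server's default charset, which differs between PHP versions.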

Post by grudz »

OK... but now if I ever write <br> in my MySQL record (to make a line break), it'll show <br> as text and not actually make a line break.

...but still, you have NO idea how much this helps.....

Post by feyd »

as I said, the content..

Code:

<?php

	// Callback for preg_replace_callback(): encode the text, leave the tags untouched.
	function specialEntityEncode($data)
	{
		if(sizeof($data) == 3)
		{
			// Text before the first tag.
			return htmlentities($data[1]);
		}
		else
		{
			// $data[3] is the tag, $data[7] is the text that follows it.
			return $data[3] . htmlentities($data[7]);
		}
	}

	$text = 'Français<test />Français<a href="blah > asdf" test =
	quickie jump>Français< table>';

	echo preg_replace_callback('#(^([^<]*)?|(<\s*?[a-z]+\s*?([a-z]+?\s*(=\s*(["\']?).*?\\6)?\s*)*?\s*/?\s*>)([^<]*))#is','specialEntityEncode', $text);

?>

Output:

Fran&ccedil;ais<test />Fran&ccedil;ais<a href="blah > asdf" test =
        quickie jump>Fran&ccedil;ais< table>

Post by grudz »

The content... do you mean in the MySQL database? Meaning I have to put &eacute; instead of é in the database?

Post by feyd »

No, you don't have to preprocess the content, although that can save time if the page is accessed often.

The snippet I wrote can be used to preprocess the content or to process it just in time, on demand. Just be careful if you already have entities (instead of the raw character) in your content.
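
To illustrate the caution about content that already contains entities: a small sketch, assuming UTF-8 input, of how htmlentities() encodes the & of an existing entity a second time by default, and how its fourth parameter (double_encode) can switch that off. The sample string is made up for the example.

```php
<?php
// Hypothetical record that mixes an existing entity with a raw character.
$stored = 'Fran&ccedil;ais et été';

// Default behaviour: the "&" of the existing entity is encoded again.
$double = htmlentities($stored, ENT_QUOTES, 'UTF-8');

// With double_encode set to false, existing entities are left alone.
$single = htmlentities($stored, ENT_QUOTES, 'UTF-8', false);

echo $double, "\n"; // Fran&amp;ccedil;ais et &eacute;t&eacute;
echo $single, "\n"; // Fran&ccedil;ais et &eacute;t&eacute;
```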

Post by grudz »

OK... now I'm getting another problem. I modified your code to suit my site:

Code:

<?php
    function specialEntityEncode($data)
    {
        if(sizeof($data) == 3)
        {
            return htmlentities($data[1]);
        }
        else
        {
            return $data[3] . htmlentities($data[7]);
        }
    }

    $text = substr($row_nouvelles['text'], 0, 275) . '...';
    echo preg_replace_callback('#(^([^<]*)?|(<\s*?[a-z]+\s*?([a-z]+?\s*(=\s*(["\']?).*?\\6)?\s*)*?\s*/?\s*>)([^<]*))#is','specialEntityEncode', $text);
?>
(only $text was changed)

That works great, but when I use it again in another file it gives me this error:

Fatal error: Cannot redeclare specialentityencode() (previously declared in /home/mtlbase/public_html/fr/accueil.php:228) in /home/mtlbase/public_html/fr/nouvelles_index.php on line 20

which is pretty straightforward... I can't put your snippet of code twice on the same page. So how do I make $text have multiple values?

Post by feyd »

You can have it run multiple times; just don't place the function declaration twice.

(The $text assignment and the preg_replace_callback() call are the running parts of the snippet.)
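
One possible way to arrange this, sketched under some assumptions: the declaration lives in a single shared file (a hypothetical entity_helpers.php, loaded with require_once), and a function_exists() guard serves as an extra safety net against redeclaration. The wrapper name encodeOutsideTags() is invented here, and the explicit 'UTF-8' argument is an assumption about the input; neither comes from the thread.

```php
<?php
// Hypothetical shared file, e.g. entity_helpers.php.
// Declared once; the guard skips the declaration if it already exists.
if (!function_exists('specialEntityEncode')) {
    function specialEntityEncode($data)
    {
        if (count($data) == 3) {
            // Text before the first tag.
            return htmlentities($data[1], ENT_QUOTES, 'UTF-8');
        }
        // $data[3] is the tag, $data[7] is the text that follows it.
        return $data[3] . htmlentities($data[7], ENT_QUOTES, 'UTF-8');
    }
}

// Hypothetical wrapper so each page only passes in its own text.
if (!function_exists('encodeOutsideTags')) {
    function encodeOutsideTags($text)
    {
        $pattern = '#(^([^<]*)?|(<\s*?[a-z]+\s*?([a-z]+?\s*(=\s*(["\']?).*?\\6)?\s*)*?\s*/?\s*>)([^<]*))#is';
        return preg_replace_callback($pattern, 'specialEntityEncode', $text);
    }
}

// The "running parts" can now be repeated as often as needed.
echo encodeOutsideTags('Français<br>été'), "\n";
echo encodeOutsideTags('Été<br />hiver'), "\n";
```

Each page would then load the file with require_once and call the wrapper once per piece of text, so the declaration is only ever parsed one time per request.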

Post by grudz »

You know what... forget that. I'm just going to use htmlentities() and make sure I don't have any <br> in my records (the index page is the only one that is utf-8; the others are iso, so they will show the French characters).

Post by feyd »

You could change the <br>'s to \n's, then run nl2br() after htmlentities() ;)
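
A short sketch of that suggestion, assuming UTF-8 text and records that use the exact string <br> for line breaks. The sample record is invented for the example.

```php
<?php
// Hypothetical record as it might be stored today, with a literal <br>.
$record = 'Première ligne<br>Deuxième ligne';

// Step 1: store or convert line breaks as plain newlines instead of tags.
$withNewlines = str_replace('<br>', "\n", $record);

// Step 2: encode the special characters, then turn newlines back into tags.
// The order matters: running nl2br() first would let htmlentities()
// escape the tags it just inserted.
$html = nl2br(htmlentities($withNewlines, ENT_QUOTES, 'UTF-8'));

echo $html, "\n";
```

By default nl2br() inserts the XHTML form of the tag, <br />, and keeps the original newline after it.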

Post by grudz »

thank you feyd....it seems like you have an answer for everything....