
UTF + French characters

Posted: Sun Oct 17, 2004 7:33 pm
by grudz
I am in a rut and I can't get out of it. I've spent all weekend (and most of my Football Sunday) trying to figure out this problem. I have done a lot of research, but still nothing.

If you check out this link: http://www.mtl-baseline.com/fr/accueil.php

You can see that the French characters are not being displayed properly. I have this in my <head>:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

The reason I need to use utf-8 and not iso-8859-1 (which would display the characters) is that iso-8859-1 shows weird characters before the song title of the TOP 40 radio on my site (which is an include from a radio streaming site). So it's either utf-8 without the French characters, or iso-8859-1 with the weird characters.

i.e. utf-8 --> http://www.mtl-baseline.com/fr/accueil.php
iso-8859-1 --> http://www.mtl-baseline.com/fr/accueil2.php

So please, does anybody have any idea how to fix this?

Thank you

P.S. I won't get back in front of my computer until late tonight, so I won't be able to reply right away.

Thank you to those who will read this

Posted: Sun Oct 17, 2004 8:37 pm
by feyd
In 'Français', ç is output as a raw character. It is outside the range of "normal" ASCII, so a browser reading the page as UTF-8 treats its byte as the start of a multi-byte sequence and cannot display it. The easy way around it is to convert ç to the correct HTML entity: &ccedil;

Using htmlentities() on the content of these "weird character" boxes should convert them to usable forms.
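A minimal sketch of that suggestion, assuming the text is stored as ISO-8859-1 bytes (the working accueil2.php page hints at this, but the thread does not confirm it). The variable name $title and the sample string are made up for illustration. The charset is passed explicitly because newer PHP versions default to UTF-8.

```php
<?php
// Hypothetical sample: "Français" as ISO-8859-1 bytes (\xE7 is ç).
$title = "Fran\xE7ais";

// Convert characters that have named entities into those entities.
// The charset argument is an assumption about how the data is stored.
$encoded = htmlentities($title, ENT_QUOTES, 'ISO-8859-1');

echo $encoded, "\n"; // Fran&ccedil;ais
```

Because the result is plain ASCII, it displays the same whether the page is declared as utf-8 or iso-8859-1.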

Posted: Mon Oct 18, 2004 12:48 am
by grudz
OK, but now whenever I write <br> in my MySQL record (to make a line break), it shows <br> as text and doesn't actually make a line break.


...but still you have NO idea how this helps.....

Posted: Mon Oct 18, 2004 1:32 am
by feyd
as I said, the content..

Code:

<?php

	function specialEntityEncode($data)
	{
		// Only 3 entries are captured for the text before the first tag.
		if(sizeof($data) == 3)
		{
			return htmlentities($data[1]);
		}
		else
		{
			// $data[3] is the tag (left untouched), $data[7] the text after it.
			return $data[3] . htmlentities($data[7]);
		}
	}

	$text = 'Français<test />Français<a href="blah > asdf" test =
	quickie jump>Français< table>';

	echo preg_replace_callback('#(^([^<]*)?|(<\s*?[a-z]+\s*?([a-z]+?\s*(=\s*(["\']?).*?\\6)?\s*)*?\s*/?\s*>)([^<]*))#is','specialEntityEncode', $text);

?>

Output:

Fran&ccedil;ais<test />Fran&ccedil;ais<a href="blah > asdf" test =
        quickie jump>Fran&ccedil;ais< table>

Posted: Mon Oct 18, 2004 8:53 am
by grudz
The content... do you mean in the MySQL database? Meaning I have to put &eacute; instead of é in the database?

Posted: Mon Oct 18, 2004 11:23 am
by feyd
no, you don't have to preprocess the content, although that can save time if the page is accessed quite often.

The snippet I wrote can be used to preprocess the content, or to process it just in time on demand. Just be careful if your content already has entities (instead of the raw characters), since htmlentities() will encode their & again.
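The caution about content that already holds entities can be illustrated with a short sketch. The sample string and variable names are hypothetical, and the ISO-8859-1 charset is an assumption about the stored data. The fourth argument of htmlentities() (double_encode) is one way to leave existing entities alone.

```php
<?php
// Hypothetical content that mixes an existing entity with a raw
// ISO-8859-1 byte (\xE7 is ç).
$content = "caf&eacute; Fran\xE7ais";

// Default behaviour: the & of the existing entity is encoded again.
$twice = htmlentities($content, ENT_QUOTES, 'ISO-8859-1');

// Fourth argument false: existing entities are left as they are.
$once = htmlentities($content, ENT_QUOTES, 'ISO-8859-1', false);

echo $twice, "\n"; // caf&amp;eacute; Fran&ccedil;ais
echo $once, "\n";  // caf&eacute; Fran&ccedil;ais
```

If the records are preprocessed once and stored encoded, the second form avoids damaging them when the same text passes through the function again.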

Posted: Mon Oct 18, 2004 12:29 pm
by grudz
OK, now I'm getting another problem. I modified your code to suit my site:

Code:

<?php
    function specialEntityEncode($data)
    {
        if (sizeof($data) == 3) {
            return htmlentities($data[1]);
        } else {
            return $data[3] . htmlentities($data[7]);
        }
    }

    $text = substr($row_nouvelles['text'], 0, 275) . '...';
    echo preg_replace_callback('#(^([^<]*)?|(<\s*?[a-z]+\s*?([a-z]+?\s*(=\s*(["\']?).*?\\6)?\s*)*?\s*/?\s*>)([^<]*))#is', 'specialEntityEncode', $text);
?>
(just the $text was changed)

That works great, but when I use it again in another file it gives me this error:

Fatal error: Cannot redeclare specialentityencode() (previously declared in /home/mtlbase/public_html/fr/accueil.php:228) in /home/mtlbase/public_html/fr/nouvelles_index.php on line 20

which is pretty straightforward... I can't put your snippet of code twice on the same page. So how do I make the $text have multiple values?

Posted: Mon Oct 18, 2004 12:35 pm
by feyd
you can have it run multiple times, just don't place the function declaration twice.

(the $text set and preg_replace_callback are the running parts of the snippet)
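One way to follow this advice, sketched under assumptions: the declaration is wrapped in a function_exists() guard so that including it twice does not redeclare it, and the running part is wrapped in a helper that can be called once per text. The helper name encodeOutsideTags() and the sample strings are hypothetical, and the ISO-8859-1 charset argument is an assumption about the stored data.

```php
<?php
// Declare the callback only once, even if this code is included twice.
if (!function_exists('specialEntityEncode')) {
    function specialEntityEncode($data)
    {
        if (count($data) == 3) {
            // Text before the first tag.
            return htmlentities($data[1], ENT_QUOTES, 'ISO-8859-1');
        }
        // A tag (kept as is) followed by text (encoded).
        return $data[3] . htmlentities($data[7], ENT_QUOTES, 'ISO-8859-1');
    }
}

// Hypothetical helper: the "running part", callable as often as needed.
if (!function_exists('encodeOutsideTags')) {
    function encodeOutsideTags($text)
    {
        return preg_replace_callback(
            '#(^([^<]*)?|(<\s*?[a-z]+\s*?([a-z]+?\s*(=\s*(["\']?).*?\\6)?\s*)*?\s*/?\s*>)([^<]*))#is',
            'specialEntityEncode',
            $text
        );
    }
}

// Two different texts, one declaration (\xE7 is ç, \xE9 is é, \xE0 is à).
$first  = encodeOutsideTags("Fran\xE7ais<br>caf\xE9");
$second = encodeOutsideTags("d\xE9j\xE0 vu");

echo $first, "\n";
echo $second, "\n";
```

Keeping the two functions in a single file and loading it with require_once from each page would have the same effect as the guards.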

Posted: Mon Oct 18, 2004 12:39 pm
by grudz
You know what, forget that. I'm just going to use htmlentities() and make sure I don't have any <br> in my records (the index page is the only one that is utf-8; the others are iso, so they will show the French characters).

Posted: Mon Oct 18, 2004 12:43 pm
by feyd
you could change the <br>'s to \n's.. then run nl2br() after htmlentities().. ;)
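That suggestion could look roughly like this. The sample record and variable names are hypothetical, and the ISO-8859-1 charset is an assumption about how the records are stored.

```php
<?php
// Hypothetical record text in ISO-8859-1 with a literal <br> in it
// (\xE7 is ç, \xE9 is é).
$record = "Fran\xE7ais<br>caf\xE9";

// 1. Turn the stored <br> markers into plain newlines.
$plain = str_ireplace('<br>', "\n", $record);

// 2. Encode the special characters (charset is an assumption).
$encoded = htmlentities($plain, ENT_QUOTES, 'ISO-8859-1');

// 3. Turn the newlines back into HTML line breaks.
$html = nl2br($encoded);

echo $html, "\n";
```

The order matters: running nl2br() before htmlentities() would have the inserted line breaks escaped as text again.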

Posted: Mon Oct 18, 2004 12:59 pm
by grudz
thank you feyd....it seems like you have an answer for everything....