Help badly needed with utf8!

Perfidus
Forum Contributor
Posts: 114
Joined: Sun Nov 02, 2003 9:54 pm

Help badly needed with utf8!

Post by Perfidus »

I have a big problem trying to fwrite() text to a file: I want it to be in UTF-8, but if I use utf8_encode() I get piles of garbage.
For example, the Euro symbol € is lost, and all the " are suddenly escaped as \".
I don't know very much about UTF-8, but if I write anything in Notepad and save it in UTF-8 format, nothing changes and no characters are lost.
The PHP manual has lots of ideas for encoding Chinese as UTF-8, but not Spanish.
Any hints?
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Re: Help badly needed with utf8!

Post by Roja »

Perfidus wrote: For example, the Euro symbol € is lost,
Lost? Does it get converted to another symbol, or is it not saved at all?
Perfidus wrote: and all the " are suddenly escaped as \".
That could be a few things, although magic_quotes_gpc is the most likely. Do you have it enabled?
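If magic_quotes_gpc does turn out to be the cause, the added backslashes can be removed with stripslashes() before the text is written. The following is only a sketch: the helper name undo_magic_quotes() is made up for illustration, and it assumes the slashes were added exactly once.

```php
<?php
// Hypothetical helper (not from the thread): removes one level of
// backslash escaping, as added by magic_quotes_gpc on old PHP versions.
function undo_magic_quotes(string $value): string
{
    return stripslashes($value);
}

// Simulated request value as it would arrive with magic quotes enabled.
// Inside single quotes, \" stays a literal backslash followed by a quote.
$escaped = '<p align=\"center\">Hombre</p>';
$clean = undo_magic_quotes($escaped);

echo $clean, "\n";
```

Calling stripslashes() on text that was never escaped would eat legitimate backslashes, so this should only run when magic quotes are known to be on. The setting was removed in PHP 5.4, so on newer versions the escaping must come from somewhere else.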
Perfidus wrote: I don't know very much about UTF-8, but if I write anything in Notepad and save it in UTF-8 format, nothing changes and no characters are lost.
Notepad is *not* UTF-8 friendly. It will bork UTF-8 characters on save; you simply haven't used multi-byte characters (like Chinese) yet.

For example, paste this text into Notepad: "浪达公司是一间在塑料加工行业享有国际声"
You will notice that it does not paste correctly. It does not save correctly. It does not support UTF-8.

So, let's reset back to basics: show some before-and-after text that has been screwed up, maybe show the code producing the error, and we can work on figuring out why it's happening.

Also, what you are trying to implement is Unicode support. PLEASE read (and yes, it's a long article) the entire article by Joel Spolsky (Joel on Software):

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
Perfidus
Forum Contributor
Posts: 114
Joined: Sun Nov 02, 2003 9:54 pm

Post by Perfidus »

This article is the BIBLE!
But I will try...
Perfidus
Forum Contributor
Posts: 114
Joined: Sun Nov 02, 2003 9:54 pm

Post by Perfidus »

By the way, this is what I want to utf8_encode().
I get the problems I described before:
The euro is replaced by a small square.
ñ seems to work...
" becomes \"

Code: Select all

$FileHandle = fopen("allthetext/mytext.txt", 'w') or die("can't open file");
// Inner double quotes are escaped so the string literal parses.
$stringData = "reference=88596&names=<p align=\"center\">Hombre<br>Araña<br></p>&currency=€";
// I have tried both methods to encode, same result!!
//$texto = utf8_encode($stringData);
$texto = iconv('iso-8859-1', 'utf-8', $stringData);

fwrite($FileHandle, $texto);
fclose($FileHandle);
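
One possible explanation for the small square, offered as an assumption since the thread does not say how the source text is encoded: utf8_encode() and iconv('iso-8859-1', ...) both treat the input as ISO-8859-1, which has no Euro sign. If the text is actually Windows-1252, where € is the single byte 0x80, naming that charset as the source should keep the symbol. A sketch, assuming the iconv extension is loaded:

```php
<?php
// Assumption: the input is Windows-1252 text, where the Euro sign is byte 0x80.
$cp1252 = "currency=\x80";

// Treating the bytes as ISO-8859-1 maps 0x80 to the control character U+0080.
$wrong = iconv('ISO-8859-1', 'UTF-8', $cp1252);

// Naming the actual source charset yields the UTF-8 Euro sign (E2 82 AC).
$right = iconv('CP1252', 'UTF-8', $cp1252);

echo bin2hex(substr($wrong, -2)), "\n"; // c280
echo bin2hex(substr($right, -3)), "\n"; // e282ac
```

This would also fit the observation that ñ seems to work: ñ is byte 0xF1 in both ISO-8859-1 and Windows-1252, so either source charset converts it correctly.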
timvw
DevNet Master
Posts: 4897
Joined: Mon Jan 19, 2004 11:11 pm
Location: Leuven, Belgium

Post by timvw »

I have loaded the http://www.php.net/mbstring extension on WinXP Pro/Apache2/PHP5 and on Debian/Apache1/PHP4

Code: Select all

<?php
$fp = fopen('test.txt', 'w');

$str = "reference=88596&names=<p align='center'>Hombre<br>Araña<br></p>&currency=€";
fwrite($fp, $str);
fclose($fp);
header('content-type: text/html; charset=UTF-8');
echo $str;

Code: Select all

reference=88596&names=<p align='center'>Hombre<br>Araña<br></p>&currency=€
I always thought this was a nice article on unicode ;) http://www.cs.tut.fi/~jkorpela/chars.html
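
To tell whether a file such as test.txt really contains UTF-8, rather than relying on how an editor or browser displays it, the bytes can be read back and checked. A minimal sketch, assuming the mbstring extension is loaded; the temporary file stands in for test.txt, and the non-ASCII characters are written as byte escapes so the example does not depend on how the script file itself is saved:

```php
<?php
// "Araña&currency=€" spelled out as UTF-8 bytes (ñ = C3 B1, € = E2 82 AC).
$str = "Ara\xC3\xB1a&currency=\xE2\x82\xAC";

// Write to a temporary file (stand-in for test.txt) and read the bytes back.
$path = tempnam(sys_get_temp_dir(), 'utf8');
file_put_contents($path, $str);
$bytes = file_get_contents($path);
unlink($path);

// mb_check_encoding() reports whether the bytes form valid UTF-8.
$isUtf8 = mb_check_encoding($bytes, 'UTF-8');

// In UTF-8 the Euro sign is the three-byte sequence E2 82 AC.
$hasEuro = strpos($bytes, "\xE2\x82\xAC") !== false;

var_dump($isUtf8, $hasEuro);
```

fwrite() and file_put_contents() write bytes unchanged, so if both checks pass on the file but the text still looks wrong on screen, the viewer's charset setting is the more likely culprit.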