[56k Warn]modify existing news system to UTF-8 compatibility

derkarsten
Forum Newbie
Posts: 2
Joined: Thu Jun 30, 2005 7:21 am

Post by derkarsten »

Jcart | If included images please include [56k Warn] in title.

Hi there,

Some time ago I built a site with content in English and German. At that time the existing "newswriter" script (http://www.newswriter.info) seemed to be a suitable solution for the new news system.

Now the website is to be extended with two more languages: Chinese and Japanese. I converted all content to UTF-8 and send the correct headers, and everything works very well, except for the news system (which I didn't write).

If I log in to the system and write a new article (by simply copying and pasting some content from a Chinese website directly out of a browser window), everything seems to be fine and looks like this:

Image

I have already saved admin.php (the main file) and all included template parts as UTF-8 and added the accept-charset="utf-8" attribute to the HTML forms. In addition, admin.php sends a UTF-8 header via PHP, and the included header.php contains the correct http-equiv declaration.

But if I click ">> go on" a few more times (which displays some more menus, all within the above-mentioned admin.php file) to finally get to the article preview, I get this:
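
A minimal sketch of the setup described above, for comparison. The function name `render_form` and the field name `article` are hypothetical and not taken from newswriter. It assumes that each step re-embeds the posted text in the next form, which is where the charset argument to `htmlspecialchars()` starts to matter.

```php
<?php
// Send the charset in the HTTP header before any output is produced.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Hypothetical helper: builds a form that carries the article text
// from one step of the wizard to the next.
function render_form($text)
{
    // Name the charset explicitly so multibyte input is escaped as UTF-8.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    return '<form method="post" action="admin.php" accept-charset="utf-8">'
         . '<textarea name="article">' . $safe . '</textarea>'
         . '<input type="submit" value="&gt;&gt; go on">'
         . '</form>';
}

// Re-display whatever was posted in the previous step (empty on first load).
echo render_form(isset($_POST['article']) ? $_POST['article'] : '');
```

If every step of the form goes through a function like this, the Chinese text should reach the preview byte for byte as it was pasted.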

Image

But this isn't UTF-8, is it? I don't know what it is or why the content is displayed this way...

Does anybody have an idea?
I would be so grateful!

Greetings,
Karsten
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Re: modify existing news system to UTF-8 compatibility

Post by Roja »

derkarsten wrote: But this isn't UTF-8, is it? I don't know what it is or why the content is displayed this way...
Can't really be sure. You've essentially described each step as having the correct items you need, and obviously, somewhere along the way something is going wrong.

If you have links we could look at, perhaps at least an output page, then we might be able to figure more out. Without the code, links to look at, or anything more, it sounds like you've described everything that is needed to get utf-8 right.
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Post by onion2k »

Are you storing the news in MySQL? Coz you'll need to be using MySQL 4.1 or later, and you'll need to set the tables and their collation to store UTF-8. You'll possibly need to tell MySQL to return UTF-8 data too, which is done by calling mysql_query("SET NAMES 'utf8'"); on the connection before any query that should return UTF-8 data.

Also note that a lot of PHP string functions need to be replaced with their mb_ multibyte equivalents if you're actually doing anything other than just echoing the data.
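
To illustrate the mb_ point, here is a small sketch comparing byte-oriented and multibyte string functions on a UTF-8 string. It assumes the mbstring extension is loaded and PHP 7 or later for the `\u{...}` escapes; the sample text is made up for the example.

```php
<?php
// Four Chinese characters, written with escapes so the example does not
// depend on how this file itself is saved. Each one is 3 bytes in UTF-8.
$text = "\u{4E2D}\u{6587}\u{65B0}\u{95FB}";

$byteCount = strlen($text);             // counts bytes
$charCount = mb_strlen($text, 'UTF-8'); // counts characters

// substr() cuts at a byte offset and can split a character in half.
$cutByBytes = substr($text, 0, 4);
// mb_substr() cuts at a character offset.
$cutByChars = mb_substr($text, 0, 2, 'UTF-8');

// A string cut in the middle of a character is no longer valid UTF-8.
$bytesCutValid = mb_check_encoding($cutByBytes, 'UTF-8');
$charsCutValid = mb_check_encoding($cutByChars, 'UTF-8');

printf("bytes: %d, characters: %d\n", $byteCount, $charCount);
printf("substr result valid UTF-8: %s\n", $bytesCutValid ? 'yes' : 'no');
printf("mb_substr result valid UTF-8: %s\n", $charsCutValid ? 'yes' : 'no');
```

Any script that truncates, wraps, or measures article text with the byte-oriented functions can corrupt multibyte characters in this way.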
derkarsten
Forum Newbie
Posts: 2
Joined: Thu Jun 30, 2005 7:21 am

Post by derkarsten »

The script stores all article information in simple text files. But I think how the information is stored "physically" is only the second part of the problem.

At the moment everything mentioned above takes place in a single file named "admin.php". I paste some Chinese characters into a form (with the accept-charset="utf-8" attribute, see first screenshot) and click "go on" four times to get to the article preview (second screenshot).

Up to this point, the data seems to be stored only in the $_POST[] array. Only when I then click "Publish this article" is the article data written to a .txt file.

But the point is: what could turn well-formed Chinese Unicode text, copied and pasted directly into my UTF-8-accepting HTML form, into that kind of output (second screenshot)? Double encoding? Some PHP text-wrap function? I don't know :(
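
The double-encoding suspicion can be checked in isolation. The sketch below is not taken from newswriter. It only shows two ways UTF-8 text could plausibly get mangled between form steps: being converted to UTF-8 a second time as if it were ISO-8859-1, or being passed through `htmlentities()` with an ISO-8859-1 charset (the default in PHP versions before 5.4). It assumes mbstring is available and PHP 7 or later for the `\u{...}` escapes.

```php
<?php
// Two Chinese characters as UTF-8 (3 bytes each).
$original = "\u{4E2D}\u{6587}";

// Case 1: the UTF-8 bytes are mistaken for ISO-8859-1 and converted again.
$doubleEncoded = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Double encoding is reversible: converting back restores the text.
$repaired = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');

// Case 2: htmlentities() treats every byte as a Latin-1 character and
// turns the bytes that have named entities into entities.
$entityMangled = htmlentities($original, ENT_QUOTES, 'ISO-8859-1');

printf("original bytes:       %s\n", bin2hex($original));
printf("double-encoded bytes: %s\n", bin2hex($doubleEncoded));
printf("repaired == original: %s\n", $repaired === $original ? 'yes' : 'no');
printf("entity-mangled:       %s\n", bin2hex($entityMangled));
```

Comparing `bin2hex($_POST['article'])` at each "go on" step with patterns like these would show at which step the bytes change.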

Do you (or anybody else) have an idea?
Thanks a lot for your time!