CHARACTER SET Problem

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
agriz
Forum Contributor
Posts: 106
Joined: Sun Nov 23, 2008 9:29 pm

CHARACTER SET Problem

Post by agriz »

HI,

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

with this meta tag i posted at least 100 rows of data in my mysql table.

But i should have posted them with the UTF-8 charset.

So how should i migrate my table for the UTF-8 charset?

Actually i want to migrate my hole database.

Any help is appreciated...

Thanks in advance...
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: CHARACTER SET Problem

Post by Eran »

This thread should be of help - viewtopic.php?f=14&t=90836
agriz
Forum Contributor
Posts: 106
Joined: Sun Nov 23, 2008 9:29 pm

Re: CHARACTER SET Problem

Post by agriz »

That thread really helped me a lot.

Here i will tell you what i am going to do.

Tell me whether i am right or wrong.

1) after backing up my database, i am going to drop my database (database name for example: user_db) .
2) creating two database. one is old database(user_db_old). another is new one(user_db).
3) New database collation is utf8_unicode_ci.
4)a script to copy all the tables and rows from old to new by converting the result values.
5)results are first converted using mb_convert_encoding($result,'UTF-8','ISO-8859-1') then inserted into new tables.
6)droping the old database.

(I am going to use phpmyadmin for creating and deleting database, tables)

Please tell me whether i am right or wrong?

Thanks
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: CHARACTER SET Problem

Post by Eran »

Personally, I never found mb_convert_encoding to be much help though it might be for you.
Last time I imported between encodings, I followed the following steps which seemed to do the trick:
1. Set the connection encoding and collation to that of the old database using SET CHARACTER SET and SET NAMES
2. Retrieve an entire table into an array
3. Set the connection encoding and collation to that of the new database
4. Insert all the table data into the new database, carefully escaping strings using mysql_real_escape_string (it is aware of the database connection collation).
5. Rinse, repeat for all tables

Sometimes I had to do perform table conversions in batches (rows 1-10000, 10000-20000 etc.) since the script execution time or memory limit would be reached (though if you have control of both, you can avoid this).

You should probably start by converting several rows and see if it works before running it over the entire table.
agriz
Forum Contributor
Posts: 106
Joined: Sun Nov 23, 2008 9:29 pm

Re: CHARACTER SET Problem

Post by agriz »

It is not working to me.

I created one db (name:testing)
then created a table (name:test)

everything is latin1_swedish_ci

create test.php with header iso-88591 charset.

Posted single "Tamil" Unicode sentence. result was stored in table like &#2949;&#2378;....etc...

Now i converted testing(db),test(table) and its field to utf8_unicode_ci.

in a new file (test1.php) i wrote one query like this.

Code: Select all

mysql_connect("localhost","root","");
mysql_select_db("testing");
 
 
$val = mysql_fetch_array(mysql_query("SELECT * FROM test"));
 
$s = iconv("ISO-8859-1","UTF-8",$val['field']);
 
mysql_query("update test set field='".$s."'");
 
But the result is not converted into UTF-8. IT's still in ISO-8859-1

What am i doing wrongly?

(i am using one software to type in TAMIL LANGUAGE which produces results in unicode. I am copying that result and posting into my web page.)
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: CHARACTER SET Problem

Post by Eran »

Did you read my last post..?
You need to change the connection collation and character set depending on the database you are reading / writing to
agriz
Forum Contributor
Posts: 106
Joined: Sun Nov 23, 2008 9:29 pm

Re: CHARACTER SET Problem

Post by agriz »

Sorry to disturb you for long time...

Can you give a very very simple example with one table, one field, one row conversion from latin1 to utf8.

Because i am blindly doing much work by wasting many hours. But i did not solve the problem.

Your simple example will be much helpful.
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: CHARACTER SET Problem

Post by Eran »

Code: Select all

 
function selectCharset($charset) { 
     $query = "SET CHARACTER SET " . $charset;
     mysql_query($query);
     $query = "SET NAMES ". $charset;
     mysql_query($query);
}
 
mysql_connect("localhost","root","");
 
selectCharset("latin1"); //Select the charset of the old database
mysql_select_db("iso_db"); //Connect to the old database
 
$query = "SELECT * FROM sometable";
$result = mysql_query($query);
$rows = array();
 
while($row = mysql_fetch_assoc($result) ) { //Loading table data into an array
     $rows[] = $row;
}
 
selectCharset("utf8"); //Switching charset to utf8
mysql_select_db("utf_db"); //Connect to the new utf8 database
 
foreach($rows as $row) { //Inserting old table data in new database
    if(!isset($cols)) {
        $cols = '(' . implode(',',array_keys($row)) . ')';
    }
    $query = "INSERT INTO sometable " . $cols . " VALUES ('" 
        . implode("','", array_map('mysql_real_escape_string', $row)) 
        . "')";
    mysql_query($query);
}
 
Post Reply