charset problem when loading data

Posted: Tue Mar 06, 2007 2:36 am
by orlandinho
Hi,

I have the following problem:

I have a database that uses the utf8_general_ci collation, and I have to fill it with data from another database.

In the other database I have tables with data containing characters like Ñ, and they are stored correctly.

When I run SELECT * INTO OUTFILE ......, it creates a file that looks OK.

When I try to load this file into my utf8 database, I get the message: "Data too long for column col1"

On the other hand, I have another PC with Ubuntu and MySQL. When I run SELECT * INTO OUTFILE there, it creates a file similar to the first one,
but when I run "LOAD DATA INFILE ...." with that file, it loads the data correctly with no problem.

I opened both files and they look similar. How can I resolve this? :O

Re: charset problem when loading data

Posted: Tue Mar 06, 2007 3:07 am
by volka
orlandinho wrote: I have a database that uses the utf8_general_ci collation, and I have to fill it with data from another database.

In the other database I have tables with data containing characters like Ñ, and they are stored correctly.
And what character encoding does the data in this "other" database have?
orlandinho wrote: When I run SELECT * INTO OUTFILE ......, it creates a file that looks OK.
Instead, run

SHOW FULL COLUMNS FROM tablename
What does it say about the collation?

Posted: Tue Mar 06, 2007 10:34 am
by orlandinho
The other database is in utf8 as well.

collation: utf8

Even if they were in different charsets, the data looks OK when I export it to a file,
but when I load it, I don't know why it doesn't recognize the Ñ.

When I manually insert a Ñ into a row, that works too.

Posted: Tue Mar 06, 2007 10:55 am
by volka
When you open the file in a text editor, do you see the character Ñ ?
What text editor do you use? Is it utf8-aware?

Posted: Tue Mar 06, 2007 12:00 pm
by orlandinho
I use Notepad.

I also had the data in MS SQL and exported it to a text file; the Ñ characters were fine in that file,
but when I load the data, the Ñ is not recognized.

Posted: Tue Mar 06, 2007 10:27 pm
by volka
Notepad is, afaik, not UTF-8 aware. So if you see Ñ and not Ã‘, the data is not utf8 encoded.

Posted: Tue Mar 06, 2007 11:19 pm
by Kieran Huggins
EditPlus supports UTF-8 - check it out.

Posted: Tue Mar 06, 2007 11:53 pm
by volka
Urgs, notepad is utf-8 aware :-S
Windows XP Professional Product Documentation wrote: Notepad allows you to create and open documents in several different formats: ANSI, Unicode, big-endian Unicode, or UTF-8.
So this tells us nothing. Ok, back to square one.

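<?php
$datafile = '...'; // <- enter outfile here
$c = file_get_contents($datafile);

// strpos() returns 0 for a match at the very start of the file,
// so compare against false instead of relying on truthiness.
// Ñ is the single byte 209 in iso-8859-1 and the byte pair 195, 145 in utf-8.
echo 'iso-8859-1: ', strpos($c, chr(209)) !== false ? 'yes':'no', "<br />\n";
echo 'utf-8: ', strpos($c, chr(195).chr(145)) !== false ? 'yes':'no', "<br />\n";
?>
Please run this script on your SQL outfile (containing at least one Ñ). What does it say about iso/utf?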
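
If the file contains accented characters other than Ñ, a more general check may be easier than searching for one specific byte sequence. The following is only a rough sketch: classify_bytes() is a made-up helper name, and it relies on PCRE rejecting subjects that are not well-formed UTF-8 when the /u modifier is used. It cannot tell which single-byte encoding a non-UTF-8 file uses, only that it is not UTF-8.

```php
<?php
// Hypothetical helper: classify raw bytes as 'ascii', 'utf-8' or 'not-utf-8'.
function classify_bytes(string $bytes): string
{
    // No byte above 0x7F: plain ASCII, which is identical in iso-8859-1 and utf-8.
    if (preg_match('/[\x80-\xFF]/', $bytes) !== 1) {
        return 'ascii';
    }
    // With the /u modifier PCRE refuses subjects that are not valid UTF-8,
    // so the empty pattern only matches well-formed input.
    return preg_match('//u', $bytes) === 1 ? 'utf-8' : 'not-utf-8';
}

// Usage sketch; '...' stands for the outfile path, as in the script above.
// echo classify_bytes(file_get_contents('...')), "\n";
```

A result of 'not-utf-8' for a file that shows Ñ correctly in an editor would suggest the file was written in a single-byte encoding such as iso-8859-1.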

Posted: Wed Mar 07, 2007 12:01 am
by orlandinho
I used EditPlus to open the file and saved it as UTF-8.
It worked.

thanks
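
For anyone who wants to script the same fix instead of re-saving the file in an editor, here is a minimal sketch in PHP. It assumes the outfile was written in ISO-8859-1 (that is a guess about the file, not something the code detects), that the iconv extension is available, and to_utf8() is a made-up helper name.

```php
<?php
// Hypothetical helper: return $bytes as UTF-8, converting only when needed.
// $assumedSource is an assumption about the file, not something detected.
function to_utf8(string $bytes, string $assumedSource = 'ISO-8859-1'): string
{
    // Already well-formed UTF-8 (or plain ASCII): leave the data untouched.
    if (preg_match('//u', $bytes) === 1) {
        return $bytes;
    }
    $converted = iconv($assumedSource, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("cannot convert from $assumedSource");
    }
    return $converted;
}

// Usage sketch; both paths are placeholders.
// file_put_contents('outfile.utf8.txt', to_utf8(file_get_contents('...')));
```

The validity check comes first so that a file which is already UTF-8 is not encoded a second time. Depending on the MySQL version, LOAD DATA INFILE may also accept a CHARACTER SET clause that tells the server how to interpret the file, which would avoid converting it at all.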