Searching Within a Column and UTF-8

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
Superman859
Forum Commoner
Posts: 47
Joined: Sun Oct 29, 2006 10:22 am

Searching Within a Column and UTF-8

Post by Superman859 »

I'm adding a Korean-english / english-Korean dictionary to a database but am having one final problem.

I have the English words all in a column, and Korean meanings in another column.

If you type in an English word, the Korean column may hold multiple words that are all the same meaning, all separated by commas. Currently, they are all listed in the same column.

I can do a search on the database WHERE english='word', and the correct row is returned - I can print out any column and everything works just fine.

However, I cannot do a search on the database WHERE korean='korean-word' and get the results to return properly. I'm not sure why either.

When the PHP processes the query, is it no longer in the utf-8 encoding, thus resulting in a wrong answer?

I thought perhaps it was because there were multiple meanings in the same row at first. However, if I search for the full value stored in that Korean column (copy and pasted exactly), and run the search WHERE korean='value', it still does not find the row.

So I have two questions - anyone have any idea why results are never found when searching using Korean text (in utf-8 format)?

And can you search for a value, and if it is found ANYWHERE in a given column (even if it is not the only word in that column), and get a value (maybe return multiple rows?).
User avatar
volka
DevNet Evangelist
Posts: 8391
Joined: Tue May 07, 2002 9:48 am
Location: Berlin, ger

Post by volka »

please put

Code: Select all

$query = "SHOW Variables LIKE 'character%'";
$result=mysql_query($query) or die(mysql_error());
while( $row=mysql_fetch_array($result) ) {
	echo join(': ', $row), "<br />\n";
}
die();
right before <our actual mysql_query, run the script and post the output.
Please post also you original sql query.
Superman859
Forum Commoner
Posts: 47
Joined: Sun Oct 29, 2006 10:22 am

Post by Superman859 »

Done. Here are the results.

Code: Select all

character_set: character_set: latin1: latin1
character_sets: character_sets: latin1 big5 czech euc_kr gb2312 gbk latin1_de sjis tis620 ujis dec8 dos german1 hp8 koi8_ru latin2 swe7 usa7 cp1251 danish hebrew win1251 estonia hungarian koi8_ukr win1251ukr greek win1250 croat cp1257 latin5: latin1 big5 czech euc_kr gb2312 gbk latin1_de sjis tis620 ujis dec8 dos german1 hp8 koi8_ru latin2 swe7 usa7 cp1251 danish hebrew win1251 estonia hungarian koi8_ukr win1251ukr greek win1250 croat cp1257 latin5
While I do not know what this means, I assume it is saying my character set defaults to latin1 for MySQL? I do know that when I open phpMyAdmin, the first screen has an option called Language. It is set to en-utf-8. When I view the tables in phpMyAdmin, the Korean shows up properly. However, before I changed language from the latin iso to en-utf-8, they did not.

I know that if I use the 'insert' tab for phpMyAdmin and type Korean in, it inserts it using unicode format, &#nnnn; It also shows up in that format in the table, regardless of the language setting.

When I added all the words to the database by reading in a file and adding words line by line (using regular expressions), they were added, but they didn't show up in utf-8 format. They were just jibberish characters, similar to what is seen with euc-kr format. This was before changing language to en-utf-8. On en-utf-8, they show up properly (but text put in the database using the input tab still shows the &#nnnn; unicode for text).

Here is the code I added it to (just to make sure I did it right):

Code: Select all

<?php
header('Content-type: text/html; charset=utf-8');
include("../files/config.php");
echo "test<br />";

$query = "SELECT * FROM dictionary WHERE english='A-frame'";
$result = mysql_query($query);
$row = mysql_fetch_row($result);

echo "$row[0] <br /> $row[1] <br /> $row[3]<br />";

$query = "SELECT * FROM dictionary WHERE english='abandon'";
$result = mysql_query($query);
$row = mysql_fetch_row($result);

echo "$row[0] <br /> $row[1] <br /> $row[2] <br /> $row[3]<br />";

$query = "SELECT * FROM dictionary WHERE word_id='3'";
$result = mysql_query($query);
$row = mysql_fetch_row($result);

echo "$row[0] <br /> $row[1] <br /> $row[2] <br /> $row[3]<br />";

$query = "SHOW Variables LIKE 'character%'";
$result=mysql_query($query) or die(mysql_error());
while( $row=mysql_fetch_array($result) ) {
        echo join(': ', $row), "<br />\n";
}
die();

$query = "SELECT * FROM dictionary WHERE korean='a &#54616;&#45208;&#51032;, &#50612;&#45712; &#54616;&#45208;&#51032;, &#50612;&#46500;(a certain), &#44057;&#51008;'";
$result = mysql_query($query);
$row = mysql_fetch_row($result);

echo "$row[0] <br /> $row[1] <br /> $row[2] <br /> $row[3]";
?>
Post Reply