Page 2 of 4

Re: Special characters Search Problem

Posted: Wed Mar 25, 2009 6:58 am
by Apollo
Then your SQL server's LOWER function does not compare foreign characters with general_ci collation.

Try this query once:

Code: Select all

SET NAMES 'utf8' COLLATE 'utf8_general_ci'
before executing the search queries. Does that make a difference?

If it now works, you can remove the LOWER(..) statements. Because with general_ci collation, SQL considers 'a', 'A', 'À', 'à' and even other variants like '?' and '?' all the same.

Re: Special characters Search Problem

Posted: Thu Mar 26, 2009 3:04 am
by HarPaddy
Thanks for your reply Apollo.

I have executed this query before executing search queries.

"SET NAMES 'utf8' COLLATE 'utf8_general_ci'"

But no records are showing for any search. Even for normal characters also.

I am struggling for long time. Can u please help me soon..

Thnaks,
Paddy

Re: Special characters Search Problem

Posted: Thu Mar 26, 2009 3:34 am
by Benjamin
Can you please post the create table query (you can export it from phpMyAdmin) as well as a dump of some records please.

Re: Special characters Search Problem

Posted: Thu Mar 26, 2009 4:13 am
by HarPaddy
I have dump my table restarant_details and with some records,
please have a look.

CREATE DATABASE `guiacat` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
USE `guiacat`;

CREATE TABLE `restarant_details` (
`restaurant_id` bigint(20) NOT NULL auto_increment,
`rest_names` varchar(100) default NULL,
PRIMARY KEY (`restaurant_id`),
FULLTEXT KEY `rest_name` (`rest_names`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;

--
-- Dumping data for table `restarant_details`
--

INSERT INTO `restarant_details` (`restaurant_id`, `rest_names`) VALUES
(1, 'Rústic'),
(2, 'Celler d''en Miquel'),
(3, 'Els Àngels restaurant'),
(4, 'Casa Balmes');

Re: Special characters Search Problem

Posted: Thu Mar 26, 2009 5:59 am
by Benjamin
Odd, the queries work fine from phpMyAdmin but not when I send them to the server through a terminal. I don't know anything about character encoding so I'm out. Sorry :(

Code: Select all

 
SELECT * FROM restarant_details WHERE rest_names LIKE '%Ã%'
 
In phpMyAdmin that query returns records 1, 3 and 4.

Re: Special characters Search Problem

Posted: Thu Mar 26, 2009 11:02 am
by Apollo
The problem with the casing is no longer encoding related (cause everything is using utf-8 now, the iconv line fixes that). What remains is your connection should use a unicode-aware case insensitive collation, such as utf8_general_ci or utf8_unicode_ci.

If you do this:

Code: Select all

SELECT * FROM restaurant_table WHERE 'A'='a'
You will probably get an empty result?

Do you have phpMyAdmin? If you try the above query there, what do you get, everything or empty? If you look in the main server tab, what is 'MySQL connection collation' set to? When you change it between utf8_bin and utf8_general_ci and try the above query again, do you get different results? (I assume nothing with utf8_bin and everything with utf8_general_ci)

Re: Special characters Search Problem

Posted: Mon Mar 30, 2009 7:36 am
by HarPaddy
Hi Apollo,

Thanks for reply..

I have executed this query with 'utf8_general_ci' collation
SELECT * FROM restaurant_table WHERE 'A'='a'
and getting all the records.

executed same query with 'utf8-bin' collation we got empty result set.

your assumption is correct we got all results with 'utf8_general_ci' and empty set with 'utf8_bin'.

-Harpaddy

Re: Special characters Search Problem

Posted: Mon Mar 30, 2009 8:52 am
by Apollo
Normally the "SET NAMES utf8 COLLATE utf8_unicode_ci" query I mentioned earlier should set the correct collation for your connection.

Can you try this:

Code: Select all

SET NAMES 'utf8'; 
SET collation_connection = 'utf8_unicode_ci'; 
SELECT * FROM restaurant_table WHERE 'A'='a';
Does that result in everything or nothing?

If everything, you should do the first two queries after connecting, and then your search will work as intended.

If nothing, can you replace the last line (the SELECT query) with the following, and post the results here:

Code: Select all

SHOW VARIABLES LIKE 'c%';

Re: Special characters Search Problem

Posted: Mon Mar 30, 2009 9:32 am
by HarPaddy
Hi Apollo,

thanks for reply.

I have executed the show variables '%char%' and changed the character_set_server as 'utf8'

and problem is solved almost.
if i execute this query its giving the result set with case insensitive only
ex:
mysql> SELECT * FROM restarant_details WHERE rest_names like '%àngels%';
+---------------+------------+
| restaurant_id | rest_names |
+---------------+------------+
| 1484 | Àngels |
| 1485 | àngels |
+---------------+------------+

but its giving problem when i take the search word through form submit value
see this:

$query = "SELECT * FROM restarant_details WHERE rest_names like '%àngels%'";
this is giving exact result.

but if search word through form value like
$query = "SELECT * FROM restarant_details WHERE rest_names like '%".$_REQUEST['searchword']."%'";
empty result set is coming for this. I am using character set as ' utf-8' for form page.

Please help in this.

Thanks,
Harpaddy

Re: Special characters Search Problem

Posted: Mon Mar 30, 2009 1:21 pm
by Apollo
HarPaddy wrote:but if search word through form value like
$query = "SELECT * FROM restarant_details WHERE rest_names like '%".$_REQUEST['searchword']."%'";
empty result set is coming for this. I am using character set as ' utf-8' for form page.
Ok, first of all you should NEVER include strings from $_POST, $_GET, $_REQUEST, $_COOKIE, etc. directly in an SQL query (read about SQL injections).
Escape them with mysql_real_escape_string().
Besides, it's better to use $_POST or $_GET explicitly, according to what method you specify, rather than $_REQUEST.

Then, despite of the form using utf-8 encoding, apparently something is still messed up (could be even your webserver, strangely enough, I've seen that before). So include the iconv conversion anyway (which only converts if necessary, and does nothing if the string is already proper utf-8).

So use this: (after using the set names / collation etc from above, which has to be done only once per script, before other queries)

Code: Select all

$s = $_POST['searchword'];
if ($s != iconv('utf-8', 'utf-8', $s)) $s = iconv('windows-1252', 'utf-8', $s);
$query = "SELECT * FROM restarant_details WHERE rest_names LIKE '%".mysql_real_escape_string($s)."%'";
It should now work, no matter whether you type in 'angels' or 'ANGELS' or 'ÅnGeLs', etc in the search form.

Re: Special characters Search Problem

Posted: Tue Mar 31, 2009 1:15 am
by HarPaddy
Thanks for reply..

I have tried for the code..

Empty result set is showing for 'Àngels' and also 'àngels'..
But results are showing for 'angels' and 'Angels'..

I am not getting which makes like this. Please help me
I have tried with form character set 'iso-8859-1' and also 'utf-8'.

Thanks,
HarPaddy

Re: Special characters Search Problem

Posted: Tue Mar 31, 2009 2:30 am
by Apollo
Earlier you said:
HarPaddy wrote:$query = "SELECT * FROM restarant_details WHERE rest_names like '%àngels%'";

this is giving exact result.
Can you get this exact query to work again (with whichever set names / character set you used), and then right after that try the same with $_POST['searchword'] instead of àngels ?

If the first works, but the 2nd doesn't, can you try this: (after submitting àngels in the form)

Code: Select all

$s1 = "àngels";
$s2 = $_POST['searchword'];
$s3 = $s2;
if ($s3 != iconv('utf-8', 'utf-8', $s3)) $s3 = iconv('windows-1252', 'utf-8', $s3);
print("$s1 = ".bin2hex($s1)."<br>$s2 = ".bin2hex($s2)."<br>$s3 = ".bin2hex($s3));
and post the result?

Re: Special characters Search Problem

Posted: Tue Mar 31, 2009 4:16 am
by HarPaddy
Thanks for reply..

I have tried the search with $_POST['searchword'] and also 'angels'

Its not giving any results when i take directly from $_POST['searchword'].
if i give search word as static 'angels' its giving results.

So i have executed the hexadecimal values of those two.

Code:
$s1 = "àngels";
$s2 = $_POST['searchword'];
$s3 = $s2;
if ($s3 != iconv('utf-8', 'utf-8', $s3)) $s3 = iconv('windows-1252', 'utf-8', $s3);
print("$s1 = ".bin2hex($s1)."<br>$s2 = ".bin2hex($s2)."<br>$s3 = ".bin2hex($s3));

Output of this is:
àngels = e06e67656c73
àngels = c3a06e67656c73
àngels = c3a06e67656c73

I think hexadecimal values of static word and post variable value are not same.
what is causing this error?

Thanks,
HarPaddy

Re: Special characters Search Problem

Posted: Tue Mar 31, 2009 5:26 am
by Apollo
Are you sure you did the same set names or set character set query as before, to enforce the connection using utf8_unicode_ci collation?
If not, can you insert that again and retry?

Because looking at the hex values, the first string (the direct 'àngel' string in your php code) is ansi, and the other two are utf-8.

If you're sure you perform the query to set the collation, and you're still not getting results, then somehow your server is messed up and enforcing ansi (probably iso-8859-1 or windows-1252) encoding for some obscure reason. In that case, let's try and reverse it, and do this instead: (assuming 'ansi' in your case means western ansi encoding, which is likely the case)

Code: Select all

// No set names or character set after connection, let's use the default
$s = $_POST['searchword'];
if ($s == iconv('utf-8', 'utf-8', $s)) $s = iconv('utf-8', 'windows-1252', $s); // convert from utf-8 to ansi if necessary
$query = "SELECT * FROM restarant_details WHERE rest_names LIKE '%".mysql_real_escape_string($s)."%'";

Re: Special characters Search Problem

Posted: Tue Mar 31, 2009 8:55 am
by HarPaddy
Thanks for reply..


I have executed query like that.

And i got result set with all possibility search words like ' àngels ' , ' Àngels ' ,' Angels ' and also ' angels '.
Thank you so much.
So , i have reached my requirement .

Thanks a lot.

-HarPaddy