Mysql case insensitive and sensitive column
Moderator: General Moderators
-
keenlearner
- Forum Commoner
- Posts: 50
- Joined: Sun Dec 03, 2006 7:19 am
Mysql case insensitive and sensitive column
Hello, I want to build a dictionary table, with 3 columns, word_id, word, definition. Most of the words are case insensitive but some need to be case sensitive. E.g the word "hobby", if I SELECT the word from the database, it's ok to be either "HoBBy", but there are words that is very important to have case sensitivity such as "mm" means millimeter, "Mm" means megameter, so if my word column is case insensitive, and if I SELECT with the value Mm, the database can return either Mm or mm which is so much difference. So I was thinking of making extra one column, cased_word, just to put those word that is case sensitive. so cased_word and word column are mutually exclusive, both can't have at the same time. But is there a better way than this ? Note that, it's important to have the case sensitive and insensitive word sharing the same word_id, because I will be using their word_id in other tables. Thanks.
- iknownothing
- Forum Contributor
- Posts: 337
- Joined: Sun Dec 17, 2006 11:53 pm
- Location: Sunshine Coast, Australia
you need to set your column to a case sesitive collation.
Eg:
Add
at the end of your SELECT statement,
Eg:
Add
Code: Select all
COLLATE latin1_general_cs-
keenlearner
- Forum Commoner
- Posts: 50
- Joined: Sun Dec 03, 2006 7:19 am
Thanks, but the problem is when I am SELECTing a word, I don't know whether the word is case sensitive or insensitive. such as if user want to lookup a definition by writing the word "hoBBy", this word should be case insensitive, but if I SELECT using the collation COLLATE latin1_general_cs
it will not return the definition because it's stored as "hobby" in the database. The case sensitivity can be determined on the database only.
it will not return the definition because it's stored as "hobby" in the database. The case sensitivity can be determined on the database only.
- iknownothing
- Forum Contributor
- Posts: 337
- Joined: Sun Dec 17, 2006 11:53 pm
- Location: Sunshine Coast, Australia
-
keenlearner
- Forum Commoner
- Posts: 50
- Joined: Sun Dec 03, 2006 7:19 am
I know what you mean, I can even use the binary column to have case sensitive column. But that is not my problem, please get my semantic meaning.
case sensitive word list
WHO
mm
Mm
K
Pa
Hgmm
s
case insensitive word list
who
me
you
I
they
see the first word for both case sensitive and insensitive, the "WHO" means World Health Organization but "who" is a pronoun for asking about a person. Same word but different capital letters gives different meanings. So if user type all lower case "who" I only check the case insensitive column, but if user type "WHo" I must check the case sensitive letter first and if not found then check the case insensitive column. So I want to know is there a way to mix case sensitive and insensitive in one column (i think it's not possible) So what are the alternatives ? Because I want them to share the same primary unique ID. Thanks.
case sensitive word list
WHO
mm
Mm
K
Pa
Hgmm
s
case insensitive word list
who
me
you
I
they
see the first word for both case sensitive and insensitive, the "WHO" means World Health Organization but "who" is a pronoun for asking about a person. Same word but different capital letters gives different meanings. So if user type all lower case "who" I only check the case insensitive column, but if user type "WHo" I must check the case sensitive letter first and if not found then check the case insensitive column. So I want to know is there a way to mix case sensitive and insensitive in one column (i think it's not possible) So what are the alternatives ? Because I want them to share the same primary unique ID. Thanks.
-
keenlearner
- Forum Commoner
- Posts: 50
- Joined: Sun Dec 03, 2006 7:19 am
Thanks, you have indirectly lighten me up. So I would have only one column word which is case insensitive, all case insensitive words will be in lower case, case sensitive words have at least one letter in upper case, so if there is a upper case letter in user lookup word such as "kilOMeter", I will SELECT with the case sensitive collation first and if not found then select again with case insensitive collation. But is there a better way than this ? Thanks
- superdezign
- DevNet Master
- Posts: 4135
- Joined: Sat Jan 20, 2007 11:06 pm
.. A simpler method would be
As long as your search is case-insensitive by default, you'll either get a exact case match or a case-insensitive match.
Code: Select all
SELECT `word` FROM `dictionary` WHERE BINARY `word` = 'foo' OR `word` = 'foo' LIMIT 1;Hm, nice work superdezign. Just one notice - I think you need to specify the ORDER BY caluse in order to be sure which word (case sensitive or case insensitive) is listed first. I.e:
Code: Select all
SELECT `word` FROM `dictionary` WHERE BINARY `word` = 'foo' OR `word` = 'foo' ORDER BY `word` LIMIT 1;There are 10 types of people in this world, those who understand binary and those who don't
- superdezign
- DevNet Master
- Posts: 4135
- Joined: Sat Jan 20, 2007 11:06 pm
As far as I understand, you claim that this SQL query:superdezign wrote:I figured the precedence of the case-sensitivity in the WHERE clause would handle that.
Code: Select all
select id from table where id>10 or id<10Edit: I'm pretty sure you don't mean that but the second part of your topic still confuses me.
superdezign wrote:Ordering it with ORDER BY `word` would seem to me to skew the results, as uppercase characters have a lower ASCII and UNICODE value than lower cased characters.
It is exactly what keenlearner needs.keenlearner wrote:So if user type all lower case "who" I only check the case insensitive column, but if user type "WHo" I must check the case sensitive letter first and if not found then check the case insensitive column.
In second thought, my last query is wrong. It must be:
Code: Select all
SELECT `word` FROM `dictionary` WHERE BINARY `word` = 'foo' OR `word` = 'foo' ORDER BY binary `word` [asc|desc] LIMIT 1;In fact, my query and yours will have the same result.
There are 10 types of people in this world, those who understand binary and those who don't
- superdezign
- DevNet Master
- Posts: 4135
- Joined: Sat Jan 20, 2007 11:06 pm
-
keenlearner
- Forum Commoner
- Posts: 50
- Joined: Sun Dec 03, 2006 7:19 am
Thanks for the solutions, but if it return one row only, how can I differentiate if the word is a case insensitive or case sensitive ? I think I would create the Dictionary table with two word columns, case_sensitive and case_insensitive.
Dictonaray table
Because the problem with your query will be if user looking for the word "mM" (if this should be a case sensitive word) , but this word is not in my database, but your query will still return one row for "mm", which might give wrong definition to the user.
So I would have to search two column for a word, the in_sensitive and sensitive column have the case insensitive and case sensitive collation respectively. So my query will have
so this will not return any result, so not wrong definition given to the user. Or is there a better way than this ? thanks.
Dictionaray table
Dictonaray table
Code: Select all
<pre>
word_id insensitive_word
1 megameter
2 millimeter
3 mm
4 Mm
</pre>So I would have to search two column for a word, the in_sensitive and sensitive column have the case insensitive and case sensitive collation respectively. So my query will have
Code: Select all
SELECT `in_sensitive`, `sensitive` FROM `dictionary` WHERE `sensitive` = 'mM' OR `in_sensitive` = 'mM' LIMIT 1;Dictionaray table
Code: Select all
<pre>
word_id in_sensitive sensitive definition
1 megameter NULL ......
2 millimeter NULL ......
3 NULL Mm SI unit for megameter
4 NULL mm SI unit for millimeter
</pre>
Last edited by keenlearner on Tue Aug 21, 2007 7:05 am, edited 1 time in total.
Hm, now I see that we gave you not the exact solution you want. Maybe this is closer to it:
It's the first row containing the true result. Using your second post word lists you will have:
WHO: WHO, 1
you: you, 2
So 1 is for case sensitive, while 2 is for case insensitive match.
Code: Select all
SELECT `word`, 1 FROM `dictionary` WHERE binary `word` = 'cM'
UNION
SELECT `word`, 2 FROM `dictionary` WHERE LOWER(`word`) = LOWER('cM');
WHO: WHO, 1
you: you, 2
So 1 is for case sensitive, while 2 is for case insensitive match.
There are 10 types of people in this world, those who understand binary and those who don't
-
keenlearner
- Forum Commoner
- Posts: 50
- Joined: Sun Dec 03, 2006 7:19 am
Code: Select all
SELECT IF (`sensitive` != '', `sensisitve`, `in_sensitive`) FROM `dictionary` WHERE `sensitive` = 'mM' OR `in_sensitive` = 'mM';There are 10 types of people in this world, those who understand binary and those who don't