Stop words with fulltext search
Posted: Wed Jan 31, 2007 8:17 am
Hi guys,
Yesterday I asked about a fulltext search problem. Fortunately, some people helped me solve it, but now I have run into another problem. (I mentioned it in my previous post, and although I thought I would be able to find a solution myself, that was not the case.) I am talking about stop words. I was told yesterday not to use them, but I am developing a news portal where users submit their own posts. After reading the list of stop words that MySQL provides, I got very frustrated.
The problem is that if a user submits the title "One Minister", it contains "One", which is a stop word, so the search will not return very accurate results. This led me to the idea of creating a second table that holds word equivalents and is used to replace, for example, "One" with a substitute. There are a few options for the substitute: a deliberately altered spelling, a numeric representation, or a cryptic value.
This is what I mean:
Real table:
id|news|user_id
1|One Minister|3
Other table:
real|unreal
One|EnO or 15,14,5 or cryptic value
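To make the idea concrete, here is a minimal Python sketch of the replacement step. It is only an illustration under assumptions: `STOP_WORDS` is a tiny hypothetical subset of the MySQL list, the `swx` prefix is an arbitrary choice of substitute, and `encode_stop_words` is a made-up helper name, not anything MySQL provides.

```python
import re

# Hypothetical subset of the stop word list; the real MySQL list is much longer.
STOP_WORDS = {"one", "the", "about"}

# Arbitrary marker put in front of a stop word so it no longer matches the list.
PREFIX = "swx"


def encode_stop_words(text):
    """Replace every stop word in text with a prefixed substitute."""
    def replace(match):
        word = match.group(0)
        if word.lower() in STOP_WORDS:
            return PREFIX + word.lower()
        return word

    # Visit each run of letters/apostrophes; everything else is left untouched.
    return re.sub(r"[A-Za-z']+", replace, text)


print(encode_stop_words("One Minister"))  # swxone Minister
```

The encoded text would go into a separate indexed column, and the same function would have to be applied to the user's search phrase before running the query. Note that MySQL also skips words shorter than its minimum word length (`ft_min_word_len`, 4 by default), so a three-letter substitute such as "EnO" would probably not be indexed either.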
I am unsure whether this is the best solution, which is why I am writing here. Why MySQL introduced these stop words in the first place is another thing that puzzles me.