Nearest Words
Posted: Fri Dec 31, 2010 2:32 am
Hey everyone,
I need to make a "nearest words" function for my site. I am working with PHP and MySQL, and dealing with 100 000+ records. I have a WordID (auto increment) and Word field.
I'll explain what I have so far, which is FAR from ideal:
I have a words table, with a word id, organized in ascending/alphabetical order.
It's quite easy to find the nearest words this way, as the WordID +/- 10 can provide me with the 20 closest words to whatever query.
The problem lies with inserting new data. How can I insert in alphabetical order? Or update my table so that WordIDs stay sequential & Words remain in alphabetical order?
My temporary fix (because I can't come up with anything else, except maybe temp tables..) is to alter the table and order by word ascending, then drop the WordID column, and then re-add it. This meets my requirements but its slow, and it can't be a good idea to be dropping primary keys on the fly like that.
I tried using cursors, but it's also slow.
Any suggestions, ideas, or other ways to approach this problem? I'd love to have something as clean and efficient as the nearest word list urbandictionary has, for example.
Thanks
I need to make a "nearest words" function for my site. I am working with PHP and MySQL, and dealing with 100 000+ records. I have a WordID (auto increment) and Word field.
I'll explain what I have so far, which is FAR from ideal:
I have a words table, with a word id, organized in ascending/alphabetical order.
It's quite easy to find the nearest words this way, as the WordID +/- 10 can provide me with the 20 closest words to whatever query.
The problem lies with inserting new data. How can I insert in alphabetical order? Or update my table so that WordIDs stay sequential & Words remain in alphabetical order?
My temporary fix (because I can't come up with anything else, except maybe temp tables..) is to alter the table and order by word ascending, then drop the WordID column, and then re-add it. This meets my requirements but its slow, and it can't be a good idea to be dropping primary keys on the fly like that.
I tried using cursors, but it's also slow.
Any suggestions, ideas, or other ways to approach this problem? I'd love to have something as clean and efficient as the nearest word list urbandictionary has, for example.
Thanks