Is there a similar function - DictionaryBasedBreakIterator
Posted: Fri Sep 05, 2008 2:18 am
by tunyawat
In Java, the ICU4J library has a class called DictionaryBasedBreakIterator. Its main purpose is to break text into words based on a dictionary file.
http://icu-project.org/apiref/icu4j/com ... rator.html
For example, it can break the word "DictionaryBasedBreakIterator" into "dictionary", "based", "break" and "iterator".
Is there a similar function in PHP?
Thanks
Re: Is there a similar function - DictionaryBasedBreakIterator
Posted: Fri Sep 05, 2008 2:49 am
by onion2k
No. It should be fairly easy to write your own version by the sounds of it though.
Re: Is there a similar function - DictionaryBasedBreakIterator
Posted: Fri Sep 05, 2008 3:08 am
by starram
You may check for the uppercase words and explode or break the string using that.
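As a rough illustration of the uppercase idea, here is a minimal sketch in Python (the thread has no code of its own, and `split_camel_case` is a made-up helper name, not a library function). It assumes the input really is written in camel case, so it does nothing useful for text without capital letters.

```python
import re

def split_camel_case(name):
    """Split a camel-case identifier at its uppercase letters (hypothetical helper)."""
    # First alternative: a run of capitals not followed by a lowercase letter (e.g. "HTML").
    # Second alternative: an optional capital followed by lowercase letters (e.g. "Based").
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)
    return [part.lower() for part in parts]

print(split_camel_case("DictionaryBasedBreakIterator"))
# A string with no capitals comes back as a single piece:
print(split_camel_case("thisisateststring"))
```

The same regular expression should carry over to PHP's preg_match_all, but that port is untested here.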
Re: Is there a similar function - DictionaryBasedBreakIterator
Posted: Fri Sep 05, 2008 4:32 am
by onion2k
starram wrote: You may check for the uppercase words and explode or break the string using that.
If you read the link the original poster posted you'll see it's used for things like breaking down sentences written in Thai (where they don't use spaces). I doubt it'll always be in nice camel caps. That approach won't work.
Re: Is there a similar function - DictionaryBasedBreakIterator
Posted: Fri Sep 05, 2008 4:49 am
by starram
OK, I just read your post.
Re: Is there a similar function - DictionaryBasedBreakIterator
Posted: Sat Sep 06, 2008 4:31 am
by tunyawat
onion2k, you are right! I'm going to use this function to break Thai text into words (and there are no capital letters between the words).
What I really want to know is how to make "thisisateststring" into "this is a test string".
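One way to approach this is a dictionary lookup with dynamic programming. The sketch below is in Python and is only an illustration under stated assumptions: `segment` and `WORDS` are made-up names, the word list is a toy dictionary, and ties are broken by preferring the split with the fewest words, which is a simple heuristic and not necessarily what ICU's DictionaryBasedBreakIterator does.

```python
def segment(text, dictionary):
    """Split text into dictionary words; return None if no complete split exists."""
    n = len(text)
    max_len = max((len(word) for word in dictionary), default=0)
    # best[i] holds the fewest-words split of text[:i], or None if text[:i]
    # cannot be built from dictionary words.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        # Only look back as far as the longest dictionary word.
        for start in range(max(0, end - max_len), end):
            if best[start] is None:
                continue
            word = text[start:end]
            if word in dictionary:
                candidate = best[start] + [word]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]

# Hypothetical toy dictionary; a real one would be loaded from a word-list file.
WORDS = {"this", "is", "a", "test", "string"}

print(" ".join(segment("thisisateststring", WORDS)))  # this is a test string
```

With a larger dictionary the input can become ambiguous (for example "at" and "his" are also English words), so the tie-breaking rule matters. For Thai text in PHP, the string would need to be handled with multibyte-aware functions such as mb_substr, since plain PHP string functions work on bytes.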