Singularize/depluralize/inflection

alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Singularize/depluralize/inflection

Post by alex.barylski »

I have a database of keywords (about 9K) many of which are semi-redundant, such as:

Consulting
Consultants
Consultation
 
Ideally I want to crunch these down to the root word, 'consult'.

Does anyone know of an algorithm (preferably implemented in PHP) that would let me convert keywords into root words, so that if someone enters 'computer consulting' it will match against 'computers', 'computing', 'consultation', 'consultants' and so forth?
arjan.top
Forum Contributor
Posts: 305
Joined: Sun Oct 14, 2007 4:36 am
Location: Hoče, Slovenia

Re: Singularize/depluralize/inflection

Post by arjan.top »

Re: Singularize/depluralize/inflection

Post by alex.barylski »

I just found that wiki article...not exactly what I was hoping for :|

http://en.wikipedia.org/wiki/Stemming
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Singularize/depluralize/inflection

Post by Eran »

You can use functions like similar_text(), levenshtein() and soundex() to produce the results you want. Have a look at this article on fuzzy search in PHP:
http://porteightyeight.com/2008/03/07/f ... hp-part-1/
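A minimal sketch of how those built-in functions could be applied to the keyword list from the first post. The helper name `rankByDistance()` and the sample keyword array are illustrative assumptions, not taken from the linked article:

```php
<?php
// Hypothetical helper: rank candidate keywords by edit distance to a query term.
function rankByDistance(string $query, array $keywords): array
{
    $scores = [];
    foreach ($keywords as $keyword) {
        // levenshtein() is case-sensitive, so compare lowercased copies
        $scores[$keyword] = levenshtein(strtolower($query), strtolower($keyword));
    }
    asort($scores); // smallest distance first, keys preserved
    return $scores;
}

// Sample data assumed for illustration
$keywords = ['Consulting', 'Consultants', 'Consultation', 'Computers'];
print_r(rankByDistance('consulting', $keywords));

// similar_text() can report a percentage through its third (by-reference) argument
similar_text('consulting', 'consultation', $percent);
printf("similarity: %.1f%%\n", $percent);

// soundex() keys group words that sound alike
var_dump(soundex('Robert') === soundex('Rupert'));
```

Note that these functions measure how similar two strings look or sound; they do not reduce a word to its root, so a distance threshold would still have to be chosen by hand.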
Re: Singularize/depluralize/inflection

Post by alex.barylski »

pytrin: It turns out what I needed was indeed a stemming function. I'm not doing any of the search in PHP (it's all done in MySQL via one wicked query a co-worker implemented). I needed a way to find the stem of a word, which I now have, and it works great. :)

Cheers,
Alex
Re: Singularize/depluralize/inflection

Post by Eran »

What did you use to find the stem in the end?
Re: Singularize/depluralize/inflection

Post by alex.barylski »

An implementation I found on the web. I'm looking for the link but of course cannot find it. If I remember correctly, it was linked from the wiki entry for stemming.
Re: Singularize/depluralize/inflection

Post by arjan.top »

The second link in my post: it is implemented for all the major programming languages, and some of the implementations are by Martin Porter himself.
Re: Singularize/depluralize/inflection

Post by alex.barylski »

That was it, yes. Porter Stemming class :)
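To illustrate the idea of stemming for the keywords in this thread, here is a rough sketch. It is a crude suffix stripper, not the Porter algorithm and not the class mentioned above; the function names `naiveStem()` and `stemPhrase()`, the suffix list, and the sample keywords are all assumptions made for the example:

```php
<?php
// Illustrative only: a crude suffix stripper, NOT the Porter algorithm.
function naiveStem(string $word): string
{
    $word = strtolower($word);
    // Ordered longest first so that 'ations' is tried before 's'
    $suffixes = ['ations', 'ation', 'ants', 'ings', 'ant', 'ing', 'ers', 'er', 's'];
    foreach ($suffixes as $suffix) {
        $stemLength = strlen($word) - strlen($suffix);
        // Require at least 3 remaining characters to avoid over-stripping short words
        if ($stemLength >= 3 && substr($word, -strlen($suffix)) === $suffix) {
            return substr($word, 0, $stemLength);
        }
    }
    return $word;
}

// Split a search phrase on whitespace and return its unique stems
function stemPhrase(string $phrase): array
{
    $words = preg_split('/\s+/', trim($phrase), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique(array_map('naiveStem', $words)));
}

// Sample data assumed for illustration
$keywords = ['Consulting', 'Consultants', 'Consultation', 'Computers', 'Computing'];
$queryStems = stemPhrase('computer consulting');

// Keep only the keywords whose stem matches one of the query stems
$matches = array_filter($keywords, function (string $keyword) use ($queryStems): bool {
    return in_array(naiveStem($keyword), $queryStems, true);
});

print_r($queryStems);
print_r(array_values($matches));
```

A real stemmer such as Porter's applies its rules in several conditional steps and handles far more cases; this sketch only shows why storing stems alongside the keywords lets a plain equality match in the database do the work.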
Re: Singularize/depluralize/inflection

Post by Eran »

thanks guys, looks interesting
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Re: Singularize/depluralize/inflection

Post by Benjamin »

:arrow: Moved to PHP - Theory and Design