Matching Content

GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Post by GeXus »

Say you wanted to take two pages and determine the percentage of their content that matches, such as x% of page A matches page B.

Does anyone have any insight into the best way of doing this?
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

How would you even begin to consider the logic behind that? How do you determine percent similarity?

Post by GeXus »

I don't know... I would imagine you would have to have a set character length, so let's say 100.

1. Count all of the characters in each source.
2. Determine whether any characters (grouped, in order, into chunks of 100) match from source to source.
3. Take the number that match and determine the percentage.
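
For what it's worth, the three steps above could be sketched roughly like this in Python. This is only one reading of the idea, not a tested solution: the function name `percent_match`, the `chunk_size` parameter, and the choice to look for each chunk of page A anywhere in page B are all assumptions made for illustration.

```python
def percent_match(page_a: str, page_b: str, chunk_size: int = 100) -> float:
    """Percentage of page_a's characters that sit in chunks also found in page_b.

    Hypothetical helper: the name, signature, and matching rule are
    assumptions for this sketch, not something from an existing library.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not page_a:
        return 0.0

    # Step 1: count the characters in the source page.
    total = len(page_a)

    # Step 2: cut page_a into consecutive chunks of chunk_size characters
    # and check whether each chunk appears verbatim anywhere in page_b.
    matched = 0
    for start in range(0, total, chunk_size):
        chunk = page_a[start:start + chunk_size]
        if chunk in page_b:
            matched += len(chunk)

    # Step 3: turn the matched character count into a percentage.
    return 100.0 * matched / total
```

Calling `percent_match(page_a, page_b)` answers "x% of page A matches page B", so swapping the arguments can give a different number. Two caveats with this approach: a single changed character spoils its whole chunk, and a short final chunk can match by coincidence. Stripping markup and normalizing whitespace before comparing would probably help.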

Post by RobertGonzalez »

Have fun with that one, man... :twisted: