Detecting Plagiarism

Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Post by Benjamin »

I need a way to detect duplicate text posted to a database. It would need to check newly posted data against everything already stored to detect full or partial duplicates. I would also need a way to calculate how similar two texts are, as a percentage.

I'm not sure how to approach this. I considered counting how often each word occurs in each document and comparing the counts, but I'm not sure that would work very well. Any ideas?
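
For reference, the word-count idea could be sketched roughly as follows. This is only one possible reading of it: the helper names wordCounts() and wordCountSimilarity() are made up for illustration, and using cosine similarity to turn the counts into a percentage is an assumption, not something settled in this thread.

```php
<?php
// Hypothetical helper: map each lowercase word to its number of occurrences.
function wordCounts(string $text): array
{
    // Format 1 makes str_word_count() return the list of words found.
    $words = str_word_count(strtolower($text), 1);
    return array_count_values($words);
}

// Hypothetical helper: cosine similarity of two word-count vectors,
// scaled to a percentage between 0 and 100.
function wordCountSimilarity(string $a, string $b): float
{
    $countsA = wordCounts($a);
    $countsB = wordCounts($b);
    if (!$countsA || !$countsB) {
        return 0.0; // nothing to compare
    }

    // Dot product over the words that appear in the first text.
    $dot = 0;
    $squaresA = 0;
    foreach ($countsA as $word => $count) {
        $dot += $count * ($countsB[$word] ?? 0);
        $squaresA += $count * $count;
    }

    $squaresB = 0;
    foreach ($countsB as $count) {
        $squaresB += $count * $count;
    }

    return 100.0 * $dot / (sqrt($squaresA) * sqrt($squaresB));
}
```

Because only the counts are compared, two texts with the same words in a different order score about 100%, and a short copied passage inside a long original text barely moves the score. That may be one reason this approach would not work very well on its own.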
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Post by Benjamin »

Hmm, it looks like the function you mentioned has a 255-character limit. You put me on the right track, though. Maybe similar_text() will work.
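
A minimal sketch of how similar_text() might be used here. PHP's similar_text() returns the number of matching characters and, through its optional third by-reference argument, a similarity percentage. The helper names similarityPercent() and findMostSimilar() are hypothetical, and a plain array stands in for the rows that would really come from the database.

```php
<?php
// Hypothetical helper: similarity of two strings as a percentage (0 to 100),
// taken from the third, by-reference argument of similar_text().
function similarityPercent(string $a, string $b): float
{
    $percent = 0.0;
    similar_text($a, $b, $percent);
    return (float) $percent;
}

// Hypothetical helper: compare a new post against already stored texts and
// return the index and percentage of the closest match.
// $existingTexts is an assumption: an array standing in for database rows.
function findMostSimilar(string $newText, array $existingTexts): array
{
    $best = ['index' => null, 'percent' => 0.0];
    foreach ($existingTexts as $index => $text) {
        $percent = similarityPercent($newText, $text);
        if ($percent > $best['percent']) {
            $best = ['index' => $index, 'percent' => $percent];
        }
    }
    return $best;
}
```

Calling findMostSimilar($newPost, $storedPosts) and comparing the returned 'percent' against a chosen threshold would flag likely duplicates; the threshold itself is something to tune. Keep in mind that the PHP manual describes similar_text() as having cubic worst-case complexity, so comparing every new post against every stored text could get slow for long texts or large tables.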