Text Diff for wikipedia type feature

arpowers
Forum Commoner
Posts: 76
Joined: Sun Oct 14, 2007 10:05 pm
Location: san diego, ca

Text Diff for wikipedia type feature

Post by arpowers »

Hey guys,
I'm trying to put together a Wikipedia-type feature for a project..

I've gone through pretty much every text diffing script I could find on Google, but I haven't found anything easy to use...

MediaWiki: 'DifferenceEngine' .. very complex, forget about it.
PEAR: 'Text_Diff' .. doesn't seem to work that well, maybe I'm confused.


Anybody have suggestions on this?
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Why not just use the Wikimedia software? Or is this more of a development exercise for you to build it yourself?
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Post by Weirdan »

I'd use svn + diff parser instead.
arpowers
Forum Commoner
Posts: 76
Joined: Sun Oct 14, 2007 10:05 pm
Location: san diego, ca

Post by arpowers »

I will check out SVN and diffparser...

I'm not using MediaWiki because I need total control over presentation and functionality, and it is just way too bloated.

Thanks for the replies, please keep them coming :)
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Post by Weirdan »

arpowers wrote: I will check out SVN and diffparser...
Basically it will manage all the versioning tasks for you... though interfacing with it could be somewhat hard to implement. There's a php_svn extension in PECL - you could try that before resorting to PEAR's VersionControl_SVN. Though you will still need to parse the diffs produced by svn somehow.
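
For the "parse the diffs" part, here is a minimal sketch, assuming the diff text is in unified format (hunk headers like "@@ -1,3 +1,3 @@", which is what `svn diff` prints by default). The function name parse_unified_diff() and the array layout are made up for illustration and are not part of php_svn or VersionControl_SVN.

```php
<?php
// Hypothetical helper: split unified-diff text into hunks of tagged lines.
// Assumes unified format; file headers between hunks are skipped.
function parse_unified_diff(string $diffText): array
{
    $hunks = [];
    $oldLeft = 0; // old-file lines still expected in the current hunk
    $newLeft = 0; // new-file lines still expected in the current hunk
    // A missing count in a hunk header means 1.
    $count = fn($v) => ($v ?? '') === '' ? 1 : (int) $v;

    foreach (preg_split('/\R/', $diffText) as $line) {
        if ($oldLeft <= 0 && $newLeft <= 0) {
            // Outside a hunk: only a hunk header is interesting here.
            if (preg_match('/^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/', $line, $m)) {
                $oldLeft = $count($m[2] ?? null);
                $newLeft = $count($m[4] ?? null);
                $hunks[] = [
                    'old_start' => (int) $m[1],
                    'new_start' => (int) $m[3],
                    'lines'     => [],
                ];
            }
            continue;
        }

        $tag = $line === '' ? ' ' : $line[0];
        if ($tag === '\\') {
            continue; // "\ No newline at end of file" marker
        }
        $text = (string) substr($line, 1);
        if ($tag === '-') {
            $type = 'removed';
            $oldLeft--;
        } elseif ($tag === '+') {
            $type = 'added';
            $newLeft--;
        } else {
            $type = 'context';
            $oldLeft--;
            $newLeft--;
        }
        $hunks[count($hunks) - 1]['lines'][] = ['type' => $type, 'text' => $text];
    }

    return $hunks;
}

// Usage: print each line with its tag.
$diff = "--- page.txt\t(revision 1)\n+++ page.txt\t(revision 2)\n"
      . "@@ -1,3 +1,3 @@\n one\n-two\n+2\n three\n";
foreach (parse_unified_diff($diff) as $hunk) {
    foreach ($hunk['lines'] as $l) {
        echo $l['type'], ': ', $l['text'], "\n";
    }
}
```

Counting down the line totals from the hunk header is what lets the parser tell a removed line apart from a "--- file" header that follows in a multi-file diff.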
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Post by Weirdan »

btw, websvn might be of interest too: http://websvn.tigris.org/
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

arpowers
Forum Commoner
Posts: 76
Joined: Sun Oct 14, 2007 10:05 pm
Location: san diego, ca

Post by arpowers »

Thanks again for the replies.. Kieran, Weirdan

To follow up,
I got the PEAR 'Text_Diff' working, but it doesn't seem to do what we want.. it just returns the changed lines in an array?

I assume some of our colleagues here have built something like this before??

I simply need a class or script that can take two text files, one old and one with changes, and find the differences between them... on a 'character by character' basis.

Everything I've looked at so far gets changed lines.. which in itself can be screwed up if text is added and a new line is created..

as always.. you guys are awesome! thanks for the help
ap

also, I don't understand why all the PEAR stuff needs to be written in PHP 4 syntax?
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Why do you want character-by-character differences?
arpowers
Forum Commoner
Posts: 76
Joined: Sun Oct 14, 2007 10:05 pm
Location: san diego, ca

Post by arpowers »

good question..

basically just because I know it's been done in the past, and I would like to build a solid system..

lately, though, I've been thinking of getting the line-by-line implementation working first and figuring the rest out later...

this little feature is more complicated than I initially thought!
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

the unix command bdiff might be what you're looking for.

[url=http://ca.php.net/manual/en/function.shell-exec.php]shell_exec()[/url] will give PHP access to the result, though you'll have to create temp files to feed to bdiff.
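
A rough sketch of the temp-file approach, assuming a Unix-like server. It calls `diff -u` purely as a stand-in because that is commonly installed; substitute bdiff or whatever tool is actually available. The function name shell_diff() is made up for illustration.

```php
<?php
// Hypothetical helper: diff two strings by writing them to temp files and
// shelling out. The `diff -u` command is an assumption, not a requirement.
function shell_diff(string $old, string $new): string
{
    $oldFile = tempnam(sys_get_temp_dir(), 'old');
    $newFile = tempnam(sys_get_temp_dir(), 'new');
    if ($oldFile === false || $newFile === false) {
        throw new RuntimeException('Could not create temp files');
    }

    try {
        file_put_contents($oldFile, $old);
        file_put_contents($newFile, $new);

        // escapeshellarg() keeps odd characters in the paths from breaking the command.
        $cmd = 'diff -u ' . escapeshellarg($oldFile) . ' ' . escapeshellarg($newFile);

        // shell_exec() returns null when there is no output (identical input).
        return (string) shell_exec($cmd);
    } finally {
        // Clean up even if something above throws.
        @unlink($oldFile);
        @unlink($newFile);
    }
}

echo shell_diff("one\ntwo\n", "one\n2\n");
```

The text being compared never goes on the command line, only the temp file paths do, so user-submitted wiki text cannot inject shell commands this way.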
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

The relevant code in MediaWiki is DifferenceEngine, which performs word-by-word diffs and is reasonably isolated from the rest of the system so you should be able to grab it.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Character-by-character would imply binary-level comparisons. Subversion (SVN) does this well with its delta-based algorithm.
arpowers
Forum Commoner
Posts: 76
Joined: Sun Oct 14, 2007 10:05 pm
Location: san diego, ca

Post by arpowers »

Wow, finally a really nice solution to this problem...

The 'text_diff' solution from PEAR was the right idea...

Once I implemented this, it wasn't working how I would like, but I found a slightly hacked/modified version that converts the line parameter to a word parameter and renders the result...

output is really nice.
:D :D

http://software.zuavra.net/inline-diff/

for future reference, I don't think the DifferenceEngine from MediaWiki is a good solution to this problem; if you look at their code, it is quite long and hard to understand (quickly) ...

the PEAR library is the way to go for a PHP solution..

Thanks for your help... great support from you guys..
AP
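
For anyone curious what "word instead of line" means in practice, here is a minimal sketch of the idea. It is not the code from Text_Diff or from the inline-diff page: it swaps in a plain longest-common-subsequence table over words, and the function name inline_word_diff() and the <del>/<ins> markup are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: word-level inline diff using an LCS table.
// Illustration only; Text_Diff uses its own engines.
function inline_word_diff(string $old, string $new): string
{
    $a = preg_split('/\s+/', $old, -1, PREG_SPLIT_NO_EMPTY);
    $b = preg_split('/\s+/', $new, -1, PREG_SPLIT_NO_EMPTY);
    $n = count($a);
    $m = count($b);

    // $lcs[$i][$j] = length of the LCS of $a[$i..] and $b[$j..].
    $lcs = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; $i--) {
        for ($j = $m - 1; $j >= 0; $j--) {
            $lcs[$i][$j] = $a[$i] === $b[$j]
                ? $lcs[$i + 1][$j + 1] + 1
                : max($lcs[$i + 1][$j], $lcs[$i][$j + 1]);
        }
    }

    // Walk both word lists, emitting kept, deleted and inserted words.
    $out = [];
    $i = 0;
    $j = 0;
    while ($i < $n && $j < $m) {
        if ($a[$i] === $b[$j]) {
            $out[] = htmlspecialchars($a[$i]);
            $i++;
            $j++;
        } elseif ($lcs[$i + 1][$j] >= $lcs[$i][$j + 1]) {
            $out[] = '<del>' . htmlspecialchars($a[$i]) . '</del>';
            $i++;
        } else {
            $out[] = '<ins>' . htmlspecialchars($b[$j]) . '</ins>';
            $j++;
        }
    }
    for (; $i < $n; $i++) {
        $out[] = '<del>' . htmlspecialchars($a[$i]) . '</del>';
    }
    for (; $j < $m; $j++) {
        $out[] = '<ins>' . htmlspecialchars($b[$j]) . '</ins>';
    }

    return implode(' ', $out);
}

echo inline_word_diff('the quick brown fox', 'the slow brown fox');
// prints: the <del>quick</del> <ins>slow</ins> brown fox
```

The table needs memory proportional to (words in old) x (words in new), so this is only reasonable for page-sized text; it also collapses all whitespace to single spaces.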
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

"Update: the inline renderer is now a native part of the Text_Diff PEAR package. You don't need to use the hack presented here anymore. This page is kept for reference only."
So it's built-in? Sweet!