Check multiple files against each other

Posted: Tue Feb 01, 2005 1:21 pm
by josh
I'm coding a file sync script. It will basically sync the main source of files with multiple other drives, servers, or folders. To check whether files have been modified, I have 3 options:

use filesize/last modified
md5_file/sha1_file/other hash or checksum
compare the files bit by bit against each other

sha1 is out of the question: it is too slow and takes far too long for a large number of files. md5 is slightly faster, however.

bit by bit is definitely out of the question.

file size/last modified = fast enough, but not reliable. It is possible to have 2 different files with the same modified date and file size.
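
To make the three options concrete, here is a rough sketch of each check. It is written in Python purely for illustration; the function names are made up for this post, and in PHP the equivalents would presumably be built on filesize(), filemtime() and md5_file(). It assumes regular local files and compares modification times at whole-second resolution.

```python
import hashlib
import os


def same_size_and_mtime(a, b):
    """Option 1: compare metadata only. Fast, but can miss real changes."""
    sa, sb = os.stat(a), os.stat(b)
    # Whole-second comparison is an assumption; filesystems differ in precision.
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def file_md5(path, chunk_size=1 << 20):
    """Option 2: MD5 of the contents, read in chunks to keep memory bounded."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(a, b, chunk_size=1 << 20):
    """Option 3: direct content comparison, stopping at the first difference."""
    if os.path.getsize(a) != os.path.getsize(b):
        return False
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            ca, cb = fa.read(chunk_size), fb.read(chunk_size)
            if ca != cb:
                return False
            if not ca:  # both files ended at the same point
                return True
```

Options 2 and 3 both read every byte of the file, which is where the time goes on a large collection; option 1 only touches the metadata.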

Does anyone know a faster way to get a checksum of a file, or a fast reliable method of comparing files?

I will need to sync a large collection of files (about 20-50 gigs or more) as quickly as possible.

Thanks.

Posted: Tue Feb 01, 2005 2:51 pm
by timvw
Maybe you want to take a look at http://www.nongnu.org/duplicity/