Using the filesystem functions to search docs

Posted: Wed Apr 18, 2007 4:50 am
by darrenedwards
I have a question for all to mull over. The site has an upload feature where documents in several formats (PDF, Word, plain text, RTF) are placed into a folder on the server. One feature of the system is to let members search the articles or uploaded documents for keywords or phrases.
So my question is: has anyone attempted this, and if so, how quick was the search?

Posted: Wed Apr 18, 2007 7:21 am
by Begby
Don't use the filesystem to search the content of files. It's going to be slow as hell, because every search has to open and read every file again. You are going to want to look into indexing the files and then storing the indexes in a database for searching. That's kind of a big topic, so google for it. Someone else might have some good links they can post.
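
To make the indexing idea a bit more concrete, here is a minimal sketch in Python (not something from this thread, and the same idea carries over to PHP with MySQL). It assumes the text has already been extracted from each upload, and it uses SQLite as a stand-in for whatever database the site runs. The table name `postings` and the functions `tokenize`, `build_index` and `search` are made up for illustration.

```python
import re
import sqlite3


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(conn, documents):
    """Store one (word, path) row per distinct word in each document.

    `documents` maps a file path to text that was already extracted from it.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        "word TEXT, path TEXT, PRIMARY KEY (word, path))"
    )
    for path, text in documents.items():
        rows = [(word, path) for word in set(tokenize(text))]
        conn.executemany(
            "INSERT OR IGNORE INTO postings (word, path) VALUES (?, ?)", rows
        )
    conn.commit()


def search(conn, query):
    """Return the paths of documents that contain every word in the query."""
    words = sorted(set(tokenize(query)))
    if not words:
        return []
    placeholders = ",".join("?" * len(words))
    sql = (
        "SELECT path FROM postings WHERE word IN (%s) "
        "GROUP BY path HAVING COUNT(DISTINCT word) = ? ORDER BY path"
        % placeholders
    )
    return [row[0] for row in conn.execute(sql, words + [len(words)])]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_index(conn, {
        "uploads/report.pdf": "Quarterly sales report",
        "uploads/notes.txt": "Meeting notes about sales targets",
    })
    print(search(conn, "sales report"))
```

The point is that a search becomes a lookup on the indexed `word` column instead of a scan over every file. This sketch only matches whole keywords; phrase search would also need word positions stored, which is left out here.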

Posted: Wed Apr 18, 2007 11:36 am
by RobertGonzalez
Searching the content of files gets into DRM (Digital Rights Management), which can ultimately get costly for large-scale apps, large files and heavily visited sites.

I would consider looking into other methods to do what you want to do.

Posted: Wed Apr 18, 2007 11:52 am
by bert4
Google seems pretty good at that :)

My guess is to convert every type of document to a text- or HTML-based version and link each converted version back to its original document.

For PDF, look here: http://www.searchtools.com/info/pdf.html

And then use something like http://www.phpdig.net/ ?
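
A rough sketch of the convert-and-link step, in Python and under some loud assumptions: only plain-text uploads are actually handled, and the `EXTRACTORS` table, `extract_plain_text` and `convert_uploads` are hypothetical names invented for this example. PDF, Word and RTF would each need an external converter plugged into the table, which is not shown.

```python
import os


def extract_plain_text(path):
    """Read a plain-text upload, replacing bytes that are not valid UTF-8."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read()


# Hypothetical registry: file extension -> function returning the file's text.
# Only .txt is implemented; other formats need their own converters.
EXTRACTORS = {".txt": extract_plain_text}


def convert_uploads(upload_dir, text_dir):
    """Write a text copy of each supported upload.

    Returns a dict mapping each text copy's path to its original upload,
    so a search hit on the text copy can be linked back to the original.
    """
    os.makedirs(text_dir, exist_ok=True)
    mapping = {}
    for name in sorted(os.listdir(upload_dir)):
        original = os.path.join(upload_dir, name)
        extractor = EXTRACTORS.get(os.path.splitext(name)[1].lower())
        if extractor is None or not os.path.isfile(original):
            continue  # unsupported format or not a regular file: skip it
        text_path = os.path.join(text_dir, name + ".txt")
        with open(text_path, "w", encoding="utf-8") as out:
            out.write(extractor(original))
        mapping[text_path] = original
    return mapping
```

The returned mapping is what you would store alongside the search index, so results point members at the document they uploaded rather than at the converted copy.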