
searching documents on a server.

Posted: Sat Mar 20, 2004 8:16 am
by hawleyjr
Hello, I have a site on which customers upload documents (Microsoft Office docs).

I would like to incorporate a search feature that will allow me to search through the documents.

For instance, suppose a Word document on my server contains the phrase "Sandy went to school at Gonzaga University". If a user typed "Gonzaga" into the search form, a link to that document would appear.

I'm assuming this will take more than PHP, but any help would be appreciated.

Posted: Sat Mar 20, 2004 8:26 am
by patrikG
I'd advise using a database for that. If you use MySQL, use its FULLTEXT index feature.
Getting the contents of an MS Word document into the database, however, is an altogether different matter. You'd need a converter for that. There are commercial packages out there, but the names escape me...
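
To give a rough idea of what such a converter does, here is a minimal sketch in Python. It assumes the newer .docx format (a zip archive whose main text lives in word/document.xml), not the legacy binary .doc format, which really does need a dedicated converter. The function name docx_to_text is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by WordprocessingML elements inside a .docx file.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def docx_to_text(path_or_file):
    """Return the plain text of a .docx file, one line per paragraph.

    Hypothetical helper: handles only .docx (zip + XML), not binary .doc.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for paragraph in root.iter(W_NS + "p"):
        # A paragraph's text is split across <w:t> elements; join them.
        pieces = [node.text or "" for node in paragraph.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

The returned string is what you would store in the database; headers, footers and embedded objects are ignored in this sketch.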

Posted: Sat Mar 20, 2004 8:58 am
by hawleyjr
I thought about this option. However, eventually I will need to search PDF, XLS, etc., and storing those in MySQL is just not the best solution for them.

Posted: Sat Mar 20, 2004 8:59 am
by patrikG
No, what I meant was that you'd store the plain content of those files (searchable) in the database, while the original files remain in a directory on the server.
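
A minimal sketch of that layout, in Python with SQLite standing in for MySQL so it runs without a server. The table name doc_text, the file paths and the sample text are all made up, and the text is assumed to have been extracted already. With MySQL you would put a FULLTEXT index on the content column and query it with MATCH ... AGAINST instead of the LIKE used here.

```python
import sqlite3

def build_index(conn, documents):
    """Store each file's path alongside its already-extracted plain text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS doc_text ("
        "path TEXT PRIMARY KEY, content TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO doc_text (path, content) VALUES (?, ?)",
        documents,
    )
    conn.commit()

def search(conn, term):
    """Return the paths of documents whose text contains the term."""
    # Escape LIKE wildcards so the user's term is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT path FROM doc_text WHERE content LIKE ? ESCAPE '\\' ORDER BY path",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in rows]

# The original files stay in uploads/; only their text goes in the table.
conn = sqlite3.connect(":memory:")
build_index(conn, [
    ("uploads/sandy.doc", "Sandy went to school at Gonzaga University"),
    ("uploads/budget.xls", "Quarterly budget figures"),
])
print(search(conn, "Gonzaga"))  # ['uploads/sandy.doc']
```

The search only ever returns a path, which the page then turns into a link to the untouched original file.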

Posted: Sat Mar 20, 2004 9:19 am
by hawleyjr
Wouldn't you consider that double entry? Sorry to be the antagonist, I'm just looking for the right solution.

Posted: Sat Mar 20, 2004 11:54 am
by patrikG
No, it's not double entry. You only use the database to search. Searching through n Word documents would take a very considerable amount of time, while most databases are optimised for searching.
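
To illustrate the difference, here is a toy inverted index in Python. It is only a sketch of the general idea behind full-text indexes, not how any particular database implements them, and the paths and text are made up. The index is built once from the extracted text; after that, a search is a single dictionary lookup rather than a scan of every file.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(docs):
    """Map each word to the set of document paths that contain it."""
    index = defaultdict(set)
    for path, text in docs.items():
        for word in tokenize(text):
            index[word].add(path)
    return index

def lookup(index, word):
    """Return the sorted paths containing the word (case-insensitive)."""
    return sorted(index.get(word.lower(), set()))

docs = {
    "uploads/sandy.doc": "Sandy went to school at Gonzaga University",
    "uploads/budget.xls": "Quarterly budget figures for the school",
}
index = build_inverted_index(docs)
print(lookup(index, "Gonzaga"))  # ['uploads/sandy.doc']
print(lookup(index, "school"))   # ['uploads/budget.xls', 'uploads/sandy.doc']
```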

Posted: Sun Mar 21, 2004 4:32 pm
by hawleyjr
What you are saying makes a lot of sense. I just need to figure out a way to search through PDFs and other docs...