searching documents on a server.

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

hawleyjr
BeerMod
Posts: 2170
Joined: Tue Jan 13, 2004 4:58 pm
Location: Jax FL & Spokane WA USA

Post by hawleyjr »

Hello, I have a site where customers upload documents (Microsoft Office docs).

I would like to incorporate a search feature that will allow me to search through the documents.

For instance, suppose a Word document on my server contains the phrase "Sandy went to school at Gonzaga University". If a user typed "Gonzaga" into the search form, a link to that document would appear.

I'm assuming this will take more than PHP, but any help would be appreciated.
patrikG
DevNet Master
Posts: 4235
Joined: Thu Aug 15, 2002 5:53 am
Location: Sussex, UK

Post by patrikG »

I'd advise using a database for that - if you use MySQL, use the fulltext feature there.
Getting the content of an MS Word document into the database, however, is an altogether different matter. You'd need a converter for that - there are commercial packages out there, but the names escape me...

Post by hawleyjr »

I thought about this option. However, eventually I will need to search PDF, XLS, etc., and storing those in MySQL is just not the best solution for them.

Post by patrikG »

No, what I meant was that you'd store the plain content of those files (searchable) in the database, while the original files remain in a directory on the server.
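
To make that split concrete, here is a minimal sketch under some loud assumptions: it uses Python with an in-memory SQLite table rather than PHP and MySQL, a plain `LIKE` match stands in for MySQL's fulltext feature, and the text is assumed to have already been extracted by a converter (not shown). The file paths, the `documents` table, and the function names are all hypothetical.

```python
import sqlite3

# Hypothetical sample data: paths of the original files on disk, paired with
# plain text that a converter (not shown) is assumed to have extracted.
EXTRACTED = {
    "uploads/sandy.doc": "Sandy went to school at Gonzaga University",
    "uploads/budget.xls": "Quarterly budget figures for the Spokane office",
}


def build_index(extracted):
    """Store only the path and the searchable text; the files stay on disk."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (path TEXT PRIMARY KEY, content TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO documents (path, content) VALUES (?, ?)",
        extracted.items(),
    )
    conn.commit()
    return conn


def search(conn, term):
    """Return the paths of documents whose extracted text contains term."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT path FROM documents WHERE content LIKE ? ESCAPE '\\' ORDER BY path",
        (f"%{escaped}%",),
    )
    return [path for (path,) in rows]


if __name__ == "__main__":
    conn = build_index(EXTRACTED)
    for path in search(conn, "Gonzaga"):
        print(path)
```

The search result is just a path, so the link shown to the user still points at the original upload; the database row is only a searchable shadow of it.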

Post by hawleyjr »

Wouldn't you consider that double entry? Sorry to be the antagonist, I'm just looking for the right solution.

Post by patrikG »

No, it's not a double entry. You only use the database to search. Searching through n Word documents on every request would take a very considerable amount of time, while most databases are optimised for searching.
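
One way to see why the indexed route is faster is a toy inverted index, sketched below in Python. This is an illustration of the general idea only, not a description of how MySQL's fulltext feature is implemented; the sample data and function names are hypothetical. The documents are read once when the index is built, and each search is then a dictionary lookup instead of a scan over every file.

```python
import re
from collections import defaultdict

# Hypothetical sample data: original file paths and their extracted text.
DOCS = {
    "uploads/sandy.doc": "Sandy went to school at Gonzaga University",
    "uploads/notes.doc": "School budget notes for the Spokane office",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each word to the set of paths whose text contains it."""
    index = defaultdict(set)
    for path, content in docs.items():
        for word in tokenize(content):
            index[word].add(path)
    return index


def lookup(index, query):
    """Return the sorted paths that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


if __name__ == "__main__":
    index = build_inverted_index(DOCS)
    print(lookup(index, "Gonzaga"))
```

The cost of opening and converting each document is paid once at upload time, not on every search.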

Post by hawleyjr »

What you are saying makes a lot of sense. I just need to figure out a way to search through PDFs and other docs...