Search content in multiple file types

sporkrunner
Forum Newbie
Posts: 1
Joined: Wed Feb 13, 2008 5:25 pm

Search content in multiple file types

Post by sporkrunner »

Hi there,
Using PHP, I am trying to open a series of files (MS Word, PDF, Excel and txt), search through them recursively, and return the strings that match a chosen search phrase (similar to how a spider searches sites). I have tried the PHP function 'file_get_contents' but have had little luck reading anything but txt files. Any ideas?
Thanks,
Nathan
Popcorn
Forum Commoner
Posts: 55
Joined: Fri Feb 21, 2003 5:19 am

Re: Search content in multiple file types

Post by Popcorn »

You need to read up on file formats. Whatever you use to read a file has to know how that format is laid out.

Imagine reading an HTML page. Going through the actual file character by character, you start with "<html..." (for simplicity's sake), but what you probably want to search is the result of parsing it: the "welcome to my website..." bit.

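file_get_contents() just hands you the raw bytes of a file, whatever the format. It does not parse anything, so it does not know to ignore "<html..." and wait for "welcome to my ..", for example. That is why it only seems to work on plain text files, where the raw bytes are the text.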
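
To make the HTML example concrete, here is a minimal sketch. The sample page is made up and kept in a string so it runs on its own, and strip_tags() is only a crude stand-in for a real HTML parser.

```php
<?php
// Hypothetical sample page, standing in for what file_get_contents() would return.
$raw = '<html><body><h1>Welcome to my website</h1><p>Contact us today.</p></body></html>';

// Searching the raw bytes matches markup as well as visible text.
$rawHit = stripos($raw, 'body') !== false;

// Strip the tags first (a rough approximation of parsing), then search.
$text = strip_tags($raw);
$textHit = stripos($text, 'body') !== false;
$welcomeHit = stripos($text, 'welcome to my website') !== false;

var_dump($rawHit, $textHit, $welcomeHit);
```

The word "body" is only found in the raw version because it is part of a tag, not part of what a visitor reads. Word, PDF and Excel files have the same problem, only with binary or compressed structure instead of tags.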

As for actual parsers to read the file formats you mention, I don't know of any offhand. Search for them; they'll be out there.
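
Once you have parsers, the recursive search itself could look roughly like this. It is a sketch under assumptions: searchFiles() and the $extractors map are names made up for illustration, and only the txt extractor is filled in. The others are left for whichever parser you end up finding.

```php
<?php
/**
 * Walk $dir recursively and return the paths of files whose extracted
 * text contains $phrase (case-insensitive). $extractors maps a lowercase
 * file extension to a callable that takes a path and returns plain text.
 */
function searchFiles(string $dir, string $phrase, array $extractors): array
{
    $matches = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $file) {
        if (!$file->isFile()) {
            continue;
        }
        $ext = strtolower($file->getExtension());
        if (!isset($extractors[$ext])) {
            continue; // no parser registered for this format, so skip it
        }
        $text = $extractors[$ext]($file->getPathname());
        if (is_string($text) && stripos($text, $phrase) !== false) {
            $matches[] = $file->getPathname();
        }
    }

    sort($matches);
    return $matches;
}

// Hypothetical extractor map: only plain text is handled here.
$extractors = [
    'txt' => function (string $path) {
        return file_get_contents($path);
    },
    // 'pdf', 'doc', 'xls': add a callable for each that returns the document's text.
];
```

You would call it as searchFiles('/path/to/docs', 'search phrase', $extractors). Keeping the parsers in a map means the directory walk never has to change when you add support for another format.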