reading contents of PDF files

vangelis
Forum Newbie
Posts: 23
Joined: Thu Jun 05, 2003 6:35 am

Post by vangelis »

I have a bunch of PDF files and would like to read their contents and store them in a database. Does anyone have any idea how this can be achieved? (Reading the PDF, I mean :)
twigletmac
Her Royal Site Adminness
Posts: 5371
Joined: Tue Apr 23, 2002 2:21 am
Location: Essex, UK

Post by twigletmac »

It may be a better (easier to maintain) idea to store just the location of the PDF file in the database and keep the actual file data on disk as a PDF file.

Mac
vangelis
Forum Newbie
Posts: 23
Joined: Thu Jun 05, 2003 6:35 am

Post by vangelis »

That sounds like a good idea, but I need to be able to search the text contained in the PDFs.

Maybe there is a way around it: say, convert the PDF to plain text and then store that. Do you think that could be possible?
Leviathan
Forum Commoner
Posts: 36
Joined: Tue Sep 23, 2003 7:00 pm
Location: Waterloo, ON (Currently in Vancouver, BC)

Post by Leviathan »

You'd probably have to convert the PDF to text first. I doubt it's at all easy to write code that reads in a PDF file's format and parses it. If you can convert the PDF to text (or some equivalent format), I'd store the text in the database as well as a link to the PDF file, so you can search and then return the file(s) that match.
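
A rough sketch of that approach, in case it helps. It is not from anyone's working code, and it rests on a few assumptions: the `pdftotext` command-line tool (shipped with Xpdf and Poppler) is installed on the server, PHP is allowed to run shell commands, and the database is reached through PDO. The `documents` table and all function names here are made up for illustration.

```php
<?php
// Sketch only. Assumes the external `pdftotext` tool is installed and on the
// PATH; the `documents` table and these function names are hypothetical.

/** Convert a PDF to plain text by shelling out to pdftotext. */
function pdfToText(string $pdfPath): string
{
    // Passing "-" as the output file makes pdftotext write to stdout.
    $cmd = 'pdftotext ' . escapeshellarg($pdfPath) . ' -';
    $text = shell_exec($cmd);
    if (!is_string($text)) {
        throw new RuntimeException("Could not convert $pdfPath");
    }
    return $text;
}

/** Create a table holding the file location next to the extracted text. */
function createSchema(PDO $db): void
{
    $db->exec('CREATE TABLE IF NOT EXISTS documents (
        id INTEGER PRIMARY KEY,
        path TEXT NOT NULL,
        body TEXT NOT NULL
    )');
}

/** Store the link to the PDF together with its text. */
function storeDocument(PDO $db, string $path, string $text): void
{
    $stmt = $db->prepare('INSERT INTO documents (path, body) VALUES (?, ?)');
    $stmt->execute([$path, $text]);
}

/** Return the paths of all PDFs whose text contains $term. */
function searchDocuments(PDO $db, string $term): array
{
    // Note: % and _ inside $term are treated as LIKE wildcards in this sketch.
    $stmt = $db->prepare(
        'SELECT path FROM documents WHERE body LIKE ? ORDER BY path'
    );
    $stmt->execute(['%' . $term . '%']);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

To use it, you would open a connection (for example `new PDO('sqlite::memory:')` for a quick test), call `createSchema($db)` once, then `storeDocument($db, $path, pdfToText($path))` for each file and `searchDocuments($db, 'some word')` to get back the matching file paths. A plain `LIKE` search is the simplest thing that works; for a lot of documents a full-text index in the database would probably be a better fit.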