How to convert MS Word documents to text?

Posted: Tue May 06, 2003 10:00 am
by patrikG
The task is to build a search engine that trawls through ASCII text, HTML, and MS Word documents. HTML and plain text are no problem, of course, but MS Word is a headache (not that that's a surprise :P).

I probably need to convert a couple of thousand MS Word documents into text.

Does anyone know of a class or algorithm to do it?