extract text from html-file

Posted: Mon Apr 30, 2007 3:55 pm
by torvald_helmer
I have an HTML file from which I want to extract some of the content, without any of the tags.

I thought I might start like this:
$file = file("test.html");
foreach ($file as $line) {

    // some code....

}

An example is this line in the file:
<tr><td class="felt">Car</td><td>Mercedes<br/></td></tr>

How can I get just 'Car' and 'Mercedes' from this line?

Has anyone got an idea? I really don't know where to go from here...
Need help!

Posted: Mon Apr 30, 2007 4:08 pm
by RobertGonzalez
Regular expressions. This has been asked around here before. Try searching these forums for 'scraping' or some combination of 'getting content from web page'.
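
For illustration, here is a minimal sketch of the regular-expression approach, assuming each value sits in its own <td> cell as in the sample line from the question. The function name extract_cells is hypothetical and not something from this thread, and the pattern is only meant for simple, well-formed rows like the one shown.

```php
<?php
// Hypothetical helper: returns the plain-text contents of every <td> cell in one line of HTML.
function extract_cells($line)
{
    // Capture whatever sits between each <td ...> and its closing </td>.
    // "i" makes the match case-insensitive, "s" lets "." match newlines.
    preg_match_all('/<td[^>]*>(.*?)<\/td>/is', $line, $matches);

    // Drop leftover inline tags such as <br/> and trim surrounding whitespace.
    return array_map(function ($cell) {
        return trim(strip_tags($cell));
    }, $matches[1]);
}

// Sample line taken from the question; a real script would loop over file("test.html").
$line = '<tr><td class="felt">Car</td><td>Mercedes<br/></td></tr>';
print_r(extract_cells($line));
```

For the sample line this should give an array holding 'Car' and 'Mercedes'. If only the text matters and the cell boundaries do not, calling strip_tags() on the whole line may already be enough. For messier markup, PHP's DOMDocument class is likely to be more robust than a regex.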