Preg_match_all

Posted: Tue May 01, 2007 5:06 am
by torvald_helmer
I use preg_match_all to extract content from an HTML file, like this:

preg_match_all('#<tr><td class="felt">(.*?)</td><td>#s', $content, $matches);
print_r($matches);

This extracts all the text inside the table cells, and it works perfectly!

It returns an array with entries like this:
[0] => Postgraduate<br/>

As you can see, entry 0 consists of text and a tag.

Sometimes there are additional tags inside a table cell matched by the preg_match_all line above...
How can I remove these extra tags? It is mostly a break tag, but sometimes a p element.

Any tips?? :)
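
For reference, here is a minimal sketch of what the pattern above captures. The sample $content string is made up for illustration; the real markup is not shown in the thread. With the default flags, $matches[0] should hold the full matches and $matches[1] the text captured by (.*?).

```php
<?php
// Hypothetical sample input; the real $content comes from the HTML file.
$content = '<tr><td class="felt">Postgraduate<br/></td><td>x</td></tr>'
         . '<tr><td class="felt"><p>Doctorate</p></td><td>y</td></tr>';

preg_match_all('#<tr><td class="felt">(.*?)</td><td>#s', $content, $matches);

// $matches[0] holds the full pattern matches,
// $matches[1] holds the captured cell contents, tags included.
print_r($matches[1]);
```

The captured strings still contain whatever tags sat inside the cell, which is the problem described above.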

Posted: Tue May 01, 2007 5:21 am
by Adrianc333
Try

echo $matches[1];

Posted: Tue May 01, 2007 5:51 am
by torvald_helmer
This just prints the text, ok. But the tag is still there, it is just not showing in the browser.

I want to remove the tags from the array's entries.

Posted: Tue May 01, 2007 6:26 am
by jamiel
There is probably a nice regex to exclude <br/> in your actual preg_match ... but for now you could probably just loop through your results and str_replace <br/> with '', if it's not too much data.
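
A rough sketch of that loop, assuming the captured cells are in an array like $matches[1]. The $cells sample and the list of <br> spellings are made up for illustration. Since a p element can also turn up, the sketch also shows PHP's built-in strip_tags as a possible alternative that removes every tag rather than only the listed strings.

```php
<?php
// Hypothetical sample of captured cells, standing in for $matches[1].
$cells = ['Postgraduate<br/>', '<p>Doctorate</p>', 'Bachelor<br />'];

// Loop through the results and remove the break tags with str_replace.
// Only the exact strings listed here are removed; other tags stay.
$noBreaks = [];
foreach ($cells as $cell) {
    $noBreaks[] = str_replace(['<br/>', '<br />', '<br>'], '', $cell);
}

// Alternative: strip_tags removes all tags, including <p> and </p>.
$plain = array_map('strip_tags', $cells);

print_r($noBreaks);
print_r($plain);
```

The str_replace version leaves the p element in place, so strip_tags may be the better fit if only the text is wanted.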

Posted: Tue May 01, 2007 6:46 am
by superdezign