Text extract from html table

Posted: Wed Apr 09, 2008 9:11 pm
by sooty77
I need to extract the table contents from this html code into an array. I've been trying to use preg_match_all but without success and could use some clues. Thanks!

Code:

 <TR>
          <TD  colspan="3" height=9 ALIGN="left"><FONT SIZE=2>&nbsp&nbsp text1</TD></FONT>
          <!--<TD  colspan="4" height=9 ALIGN="center"><FONT SIZE=2>#Date#</TD></FONT>-->
          <TD  colspan="1" height=9 ALIGN="center"><FONT SIZE=2>Apr 10, 00:02</TD></FONT>
          <TD  colspan="1" height=9 ALIGN="center"><FONT SIZE=2>3.9&deg C.</TD></FONT>
          <TD  colspan="1" height=9 ALIGN="center"><FONT SIZE=2>5.6&deg C.</TD></FONT>
          <TD  colspan="2" height=9 ALIGN="center"><FONT SIZE=2>Moist</TD></FONT>
          <TD  colspan="1" height=9 ALIGN="center"><FONT SIZE=2>87%</TD></FONT>
          <TD  colspan="1" height=9 ALIGN="center"><FONT SIZE=2>0.8 km/h</TD></FONT>
          <TD  colspan="3" height=9 ALIGN="center"><FONT SIZE=2>North-westerly</TD></FONT>
          <TD  colspan="3" height=9 ALIGN="center"><FONT SIZE=2>No recent rainfall</TD></FONT>
      </TR>

Re: Text extract from html table

Posted: Thu Apr 10, 2008 5:23 am
by aceconcepts
Why don't you store it as an array in the first place?

or

Is it already stored as html?

Re: Text extract from html table

Posted: Thu Apr 10, 2008 1:39 pm
by aCa

Code:

preg_split('/<[^>]+>\n*/', $table, -1, PREG_SPLIT_NO_EMPTY);
This will return all the values in an array, but it also includes --> and some elements that contain only blank spaces.

Code:

$table = preg_replace('/<!--.+?-->/', '', $table);
If you run this first, you will get rid of the -->.

You can then just ignore the array values that contain only whitespace, or use a replace on \s to strip the whitespace before you run the split.
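
Putting the steps together, here is a rough sketch of how it could look, not a definitive solution. The function name extract_cells() and the shortened sample row are made up for illustration. It also trims each piece after the split instead of stripping \s beforehand, which is an assumption on my part so that the spaces inside values like "Apr 10, 00:02" survive.

```php
<?php
// Shortened copy of the row from the first post (nowdoc, so nothing is interpolated).
$table = <<<'HTML'
<TR>
    <TD colspan="3" height=9 ALIGN="left"><FONT SIZE=2>&nbsp&nbsp text1</TD></FONT>
    <!--<TD colspan="4" height=9 ALIGN="center"><FONT SIZE=2>#Date#</TD></FONT>-->
    <TD colspan="1" height=9 ALIGN="center"><FONT SIZE=2>Apr 10, 00:02</TD></FONT>
    <TD colspan="1" height=9 ALIGN="center"><FONT SIZE=2>87%</TD></FONT>
</TR>
HTML;

// extract_cells() is a hypothetical helper name, not a PHP built-in.
function extract_cells(string $html): array
{
    // 1. Remove HTML comments first, so the commented-out cell and "-->" disappear.
    $html = preg_replace('/<!--.*?-->/s', '', $html);

    // 2. Split on tags; whatever sits between two tags is kept as a piece.
    $parts = preg_split('/<[^>]+>/', $html, -1, PREG_SPLIT_NO_EMPTY);

    // 3. Trim each piece and drop the ones that were only whitespace.
    $parts = array_map('trim', $parts);
    return array_values(array_filter($parts, 'strlen'));
}

print_r(extract_cells($table));
```

Entities such as &nbsp are left untouched here, so they would still need to be cleaned up separately if you don't want them in the values.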

Hope this helps.

Re: Text extract from html table

Posted: Thu Apr 10, 2008 4:02 pm
by sooty77
Great, thanks for the help. Pretty cool regex tool you have there, I'll be using that :)