preg_match_all for html tags

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

tryin2xcel
Forum Newbie
Posts: 3
Joined: Thu Oct 19, 2006 10:48 pm

preg_match_all for html tags

Post by tryin2xcel »

I want to retrieve certain elements that all share the same format in an HTML file.
The file contains a lot of raw data,
which I have stored in the string $data.

I am looking for a pattern like this:

Code: Select all

<td width='15%' valign=top>41&nbsp;&nbsp;&nbsp;&nbsp;</td>
Here, 41 is the desired result.

Note: the 15% is not constant; it can be 15%, 25%, or 35%.

I have many rows like this in the HTML file, and I want to pick out the value that sits between

<td width='15%' valign=top>
and
</td>
I am trying this code:

Code: Select all

preg_match_all("|<[td width='[0-9]{1}5%' valign=top>]+>(.*)</[/td>]+>|U",$data,$out,PREG_PATTERN_ORDER);
print_r($out[1]);
but I am not getting the desired result.
Can someone please help?

Thanks
volka
DevNet Evangelist
Posts: 8391
Joined: Tue May 07, 2002 9:48 am
Location: Berlin, ger

Post by volka »

What's <[td ... top>]+> supposed to mean?

try

Code: Select all

<?php
$source = "
	<td width='15%' valign=top>41&nbsp;&nbsp;&nbsp;&nbsp;</td>
	<td width='25%' valign=top>42&nbsp;&nbsp;&nbsp;&nbsp;</td>
	<td width='15%' valign=top>43&nbsp;&nbsp;&nbsp;&nbsp;</td>
	<td width='35%' valign=top>44&nbsp;&nbsp;&nbsp;&nbsp;</td>
	<td width='35%' valign=top>45&nbsp;&nbsp;&nbsp;&nbsp;</td>
	";
	
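// \d5% matches any digit followed by 5% (15%, 25%, 35%); ([^<]+) captures the cell text up to </td>
$pattern = "!<td width='\d5%' valign=top>([^<]+)</td>!";
// preg_match_all() returns the number of full matches; $matches[1] holds the first capture group
if ( 0<preg_match_all($pattern, $source, $matches) ) {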
	echo "<pre>", print_r($matches[1], true), "</pre>\n";
}
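
Note that ([^<]+) captures everything up to the closing tag, so each value returned by the code above still carries the trailing &nbsp; entities. Here is a minimal sketch of a variant that captures only the number, assuming each cell holds digits followed by zero or more &nbsp; entities. The [123]5% class and the $values variable are illustrative choices, not part of the original post.

```php
<?php
// Sample rows in the same shape as the ones in the thread.
$source = "
	<td width='15%' valign=top>41&nbsp;&nbsp;&nbsp;&nbsp;</td>
	<td width='25%' valign=top>42&nbsp;&nbsp;&nbsp;&nbsp;</td>
	<td width='35%' valign=top>44&nbsp;&nbsp;&nbsp;&nbsp;</td>
	";

// [123]5% accepts only 15%, 25% and 35% (an assumption based on the question);
// (\d+) captures just the digits;
// (?:&nbsp;)* skips the trailing entities without capturing them.
$pattern = "!<td width='[123]5%' valign=top>(\d+)(?:&nbsp;)*</td>!";

$values = array();
if (preg_match_all($pattern, $source, $matches) > 0) {
    $values = $matches[1];
}
print_r($values);
```

If the cells can hold text other than digits, the original ([^<]+) capture is probably the safer choice, with the entities stripped afterwards.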

Post by tryin2xcel »

Thanks a lot.
I will test this in my code.

Post by tryin2xcel »

Will the same logic apply to this pattern?

Code: Select all

$source="search(s):</td><td><em>MY DATA</em>";
I want to catch the "MY DATA" string.

Code: Select all

$pattern = "!search(s):</td><td><em>([^<]+)</em>!";
if ( 0<preg_match_all($pattern, $source, $matches) ) {
        echo "<pre>", print_r($matches[1], true), "</pre>\n";
}
Actually, this is showing me an error for /td and /em.
I think I should treat special characters differently,
but how?
Please suggest.

One more favor: can you please suggest an easy-to-learn regex tutorial?
Thanks
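
On the special-characters question: in the pattern above, the parentheses in search(s) are regex metacharacters that form a capture group, so the pattern looks for "searchs:" and never matches the literal text. The slashes in </td> and </em> should be fine as written, since the delimiter is ! rather than /. Here is a minimal sketch of one possible fix, assuming the delimiter stays !. The names $prefix and $found are introduced here for illustration only.

```php
<?php
$source = "search(s):</td><td><em>MY DATA</em>";

// preg_quote() escapes regex metacharacters such as ( and ) in the literal
// prefix; the second argument names the delimiter so it is escaped as well.
$prefix = preg_quote("search(s):</td><td><em>", "!");

// The slashes in </td> and </em> need no escaping because the delimiter is '!'.
$pattern = "!" . $prefix . "([^<]+)</em>!";

$found = array();
if (preg_match_all($pattern, $source, $matches) > 0) {
    $found = $matches[1];
}
print_r($found);
```

Escaping by hand, writing search\(s\): in the pattern, should behave the same way; preg_quote() just avoids having to remember which characters are special.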