match content between two html tags based on id

Posted: Sat Dec 13, 2008 5:55 pm
by SidewinderX
Say I am given this string:

Code:

<span id="some_id">some_data</span>

I want to match some_data. I can easily use the expression

Code:

/<span id="some_id">(.*)<\/span>/is

and generalize it to

Code:

/<span id="$id">(.*)<\/span>/is

However, I would like to generalize it even further, to the point where it ignores the "span" tag name and any other attributes that may be present. In other words, something like this expression does the job:

Code:

preg_match("/<(.*)id=\"$id\"(.*)>(.*)<\/(.*)>/is", $in, $out);

Suffice it to say, this has two issues: 1) it matches and returns data that I do not want. I only want some_data, which is $out[3]; $out[1], $out[2], and $out[4] are unneeded overhead. 2) If some_data happens to contain a </tag>, that tag will be matched, returning incorrect data. I think the latter issue can be solved using a named capture(?)/backreference, but I am not sure how that would work despite reading regular-expressions.info/named.html

Would some regex wiz enlighten me as to how to solve the two issues outlined above?

Thank you

Re: match content between two html tags based on id

Posted: Sat Dec 13, 2008 6:24 pm
by SidewinderX
Got it.

Code:

preg_match("/<([\w]+)[^>]+id=\"$id\"[^>]*>([^>]*)<\/\\1>/", $in, $out);
Any suggestions?
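
One possible refinement of the pattern above, offered as a sketch and not a definitive answer. As posted, ([^>]*) stops at any ">" so content containing nested markup will not match, [^>]+id= also matches attributes such as data-id, and $id is interpolated into the pattern unescaped. The version below assumes the id attribute is double-quoted and is preceded by whitespace. The function name match_by_id and the group names tag and content are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper (not from the thread): returns the inner content of the
// first element whose id attribute equals $id, or null when nothing matches.
function match_by_id(string $in, string $id): ?string
{
    // preg_quote() escapes regex metacharacters in $id, including the "/" delimiter.
    $quoted = preg_quote($id, '/');

    // (?P<tag>\w+)     named capture for the tag name
    // [^>]*\sid="..."  any attributes, then a whitespace-separated id attribute
    // (?P<content>.*?) lazy match, so it stops at the first matching close tag
    // <\/(?P=tag)>     named backreference: the close tag must repeat the open tag
    $pattern = '/<(?P<tag>\w+)\b[^>]*\sid="' . $quoted . '"[^>]*>'
             . '(?P<content>.*?)<\/(?P=tag)>/is';

    return preg_match($pattern, $in, $out) === 1 ? $out['content'] : null;
}

// Usage: prints "some_data"
echo match_by_id('<span id="some_id">some_data</span>', 'some_id'), "\n";
```

The named groups mean only $out['content'] needs to be read, and the backreference lets some_data contain other tags such as <b>...</b>. It still stops at the first close tag with the same name, so an element nested inside another element of the same name (a div inside the matched div, for example) will be cut short.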