Improvements to my pattern?

HCBen
Forum Commoner
Posts: 33
Joined: Thu Jun 22, 2006 3:15 pm
Location: Indiana

Improvements to my pattern?

Post by HCBen »

Does anyone see a way to streamline this pattern, or any issues I need to be aware of?

Code:

'/<[\w]+\s*[a-z_0-9;\-=:"\'\s]*[id|class]=["\']?f:([\w]+)["\']?\s*[a-z_0-9;\-=:"\'\s]*>(.+?)<\/[\w]+>/xius'
I'm using it to match any HTML tag whose id value or class name is prefixed with "f:", so that I can obtain the text between the opening and closing tags. I also needed to allow for any additional tag attributes.
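A side note on the `[id|class]` part of the pattern: square brackets form a character class, so that piece matches exactly one character out of i, d, |, c, l, a, s rather than the word "id" or "class". A minimal sketch of the difference, using a reduced pattern and a made-up `rel` attribute (both are illustrative assumptions, not taken from the original code):

```php
<?php
// Hypothetical sample: the attribute is "rel", not "id" or "class".
$html = '<a rel="f:oops">link text</a>';

// Character class: one character from the set, followed by "=".
// The "l" at the end of "rel" satisfies it.
$bracketHit = preg_match('/[id|class]=["\']?f:(\w+)/', $html, $m1);

// Non-capturing group with alternation: the literal word "id" or "class".
$groupHit = preg_match('/(?:id|class)=["\']?f:(\w+)/', $html, $m2);

var_dump($bracketHit); // int(1), and $m1[1] is "oops"
var_dump($groupHit);   // int(0)
```

The group form still accepts any attribute name that merely ends in "id" or "class", so a word boundary before the group may be worth considering if that matters for your markup.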

It works fine and I haven't run into any problems so far. Just thought I'd throw it out there and see what others think.

Thanks,
Ben
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Re: Improvements to my pattern?

Post by John Cartwright »

I don't know if mine turned out much better:

Code:

$foo = '<textarea type="foo" id="f:foobarington" style=2"foo">input 1</textarea>
 
<textarea type="fee" class="f:feebarington">input 2</textarea>';
 
preg_match_all('#<\w+.*?[id|class]=["\']f:([^"\']+)["\'].*?>(.*?)<[^>]+>#is', $foo, $matches);
 
echo '<pre>';
print_r($matches);
echo '</pre>';

Output:

Array
(
    ....
 
    [1] => Array
        (
            [0] => foobarington
            [1] => feebarington
        )
 
    [2] => Array
        (
            [0] => input 1
            [1] => input 2
        )
 
)
// step in, regex guru
jmut
Forum Regular
Posts: 945
Joined: Tue Jul 05, 2005 3:54 am
Location: Sofia, Bulgaria

Re: Improvements to my pattern?

Post by jmut »

Just be very careful if allowing users to add attributes.

E.g., this will execute the JavaScript in IE, despite the fact that it's part of a style sheet. Don't ask why; I guess they just thought it was cool.

<div style="background:url('javascript:alert(1)')">
</div>
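One rough way to act on this warning is to reject any tag whose attributes contain a `javascript:` URL before doing anything else with it. The helper below is a hypothetical sketch (the name `containsScriptUrl` is made up for illustration) and is not a complete sanitizer: entity-encoded or otherwise obfuscated values will slip past a check this simple.

```php
<?php
// Hypothetical helper, not part of the thread: reports whether a tag
// contains a "javascript:" URL anywhere in its attributes.
function containsScriptUrl(string $tag): bool
{
    // Case-insensitive; tolerates whitespace between "javascript" and ":".
    return preg_match('/javascript\s*:/i', $tag) === 1;
}

var_dump(containsScriptUrl('<div style="background:url(\'javascript:alert(1)\')">')); // bool(true)
var_dump(containsScriptUrl('<div style="background:url(\'bg.png\')">'));              // bool(false)
```

For user-supplied markup, a whitelist of allowed attributes and values is generally the more robust approach than trying to blacklist bad ones.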
HCBen
Forum Commoner
Posts: 33
Joined: Thu Jun 22, 2006 3:15 pm
Location: Indiana

Re: Improvements to my pattern?

Post by HCBen »

Jcart - Yours is better. I changed it slightly and had to allow for class/id values without quotes (as I don't have full control of the :( ) :

Code:

#<\w+[^>]*(?:id|class)=["\']?f:(\w+)["\']?[^>]*>(.+?)<[^>]+>#
Also, I replaced .*? with [^>]* because I believe it's safer: [^>]* cannot match past the end of the opening tag, whereas .*? can. Other than that, I'm not sure how much more can be done to improve it now...
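To illustrate why `[^>]*` is the safer choice, here is a small sketch comparing the two on a made-up input where an unrelated tag precedes the one we want. The lazy-dot pattern is a variant written for this comparison (with the `is` flags from the earlier post); the second pattern is the one posted above.

```php
<?php
// Hypothetical input: an unrelated <p> tag followed by the tag we want.
$html = '<p>intro</p><div id="f:x">body</div>';

// Lazy-dot variant: ".*?" may cross ">" and run into later tags.
$lazy = '#<\w+.*?(?:id|class)=["\']?f:(\w+)["\']?.*?>(.+?)<[^>]+>#is';

// Negated-class variant: "[^>]*" must stay inside the opening tag.
$strict = '#<\w+[^>]*(?:id|class)=["\']?f:(\w+)["\']?[^>]*>(.+?)<[^>]+>#';

preg_match($lazy, $html, $lazyMatch);
preg_match($strict, $html, $strictMatch);

echo $lazyMatch[0], "\n";   // <p>intro</p><div id="f:x">body</div>
echo $strictMatch[0], "\n"; // <div id="f:x">body</div>
```

Both variants capture "x" and "body" here, but the lazy one starts its match at the `<p>` tag, which shows that it is not really anchored to the tag carrying the attribute.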

It works great!

Thanks,
Ben