what is this doing?

s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

So I want to replace everything between <style></style> tags with nothing. Very stupidly (;D) I came up with this regex.

Code: Select all

preg_replace('|<style(.+)/style>|i', '', $description);
Can anyone think of a case where this wouldn't match style tags?
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

  1. if the opening and closing tags are on separate lines
  2. if there is whitespace between the <, > or / and the tag name
  3. if there are multiple style blocks in the page, everything between the first opening tag and the last closing tag will be removed as well
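
A rough sketch of those three cases run against the pattern from the first post. The sample strings and variable names are made up for illustration; only the pattern comes from the thread.

```php
<?php
// The pattern from the first post, unchanged.
$pattern = '|<style(.+)/style>|i';

// Case 1: tags on separate lines. Without the s modifier, "." stops at "\n".
$multiline = "<style>\n.a { color: red; }\n</style>";
$r1 = preg_replace($pattern, '', $multiline);

// Case 2: whitespace inside the closing tag ("</ style>").
$spaced = "<style>.a { color: red; }</ style>";
$r2 = preg_replace($pattern, '', $spaced);

// Case 3: two blocks on one line. The greedy ".+" runs to the LAST "/style>".
$two = "<style>.a{}</style><b>keep me</b><style>.b{}</style>";
$r3 = preg_replace($pattern, '', $two);

var_dump($r1 === $multiline); // bool(true): nothing was removed
var_dump($r2 === $spaced);    // bool(true): nothing was removed
var_dump($r3);                // string(0) "": the <b> element is gone too
```

In the first two cases the style block survives untouched; in the third, too much is removed.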

Post by s.dot »

Code: Select all

preg_replace('|<(\s*)style(.+)/(\s*)style(\s*)>|i', '', $string);
So that pretty much takes care of the whitespace?

Now on different lines.. should it matter? the .(period) matches any character.. does this include new lines?

Don't know what you meant by #3.

Post by feyd »

scrotaye wrote:

Code: Select all

preg_replace('|<(\s*)style(.+)/(\s*)style(\s*)>|i', '', $string);
So that pretty much takes care of the whitespace?
basically, yes.
scrotaye wrote:Now on different lines.. should it matter? the .(period) matches any character.. does this include new lines?
By default the dot does not match newlines; that is controlled by a separate setting (a pattern modifier).
scrotaye wrote:Don't know what you meant by #3.
After you fix the single line issue you'll encounter this:

Code: Select all

<style>
.someClass { margin-top: 3px; }
</style>
<button>hi</button>
<style>
.someOtherClass { margin-top: -3px; }
</style>
All of that will be wiped out.
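
To see the wipe-out happen, here is a small sketch using the sample above. It assumes the single-line issue was fixed by adding the s modifier (PCRE_DOTALL), which the post does not actually spell out; the variable names are illustrative.

```php
<?php
// The sample page from the post: two style blocks with a button between them.
$html = <<<'HTML'
<style>
.someClass { margin-top: 3px; }
</style>
<button>hi</button>
<style>
.someOtherClass { margin-top: -3px; }
</style>
HTML;

// Assumption: "s" (PCRE_DOTALL) is the fix applied for the single-line issue.
// The greedy ".+" now spans newlines and runs to the last "/style>".
$greedy = preg_replace('|<style(.+)/style>|is', '', $html);

var_dump($greedy); // string(0) "": the button was removed along with the styles
```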

Post by s.dot »

Code: Select all

preg_replace('|<style(.+)/style>|ism', '', $string);
So, reading d11wtq's tutorial, adding the s after the delimiter ignores whitespace.. good.. so that takes care of the spaces also? And m is multi-line mode, so that takes care of the ones on separate lines. Right?

I need to develop an environment to test this in.
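
A quick check of what those two modifiers do, going by the PCRE modifier documentation rather than anything stated in this thread: s (PCRE_DOTALL) lets the dot match newlines, and m (PCRE_MULTILINE) changes where ^ and $ match. Neither one makes the pattern tolerate spaces inside a tag, so the \s* version is still needed for that. The test strings below are made up for illustration.

```php
<?php
$text = "one\ntwo";

// s (PCRE_DOTALL): "." also matches "\n".
var_dump(preg_match('|one.two|', $text));  // int(0)
var_dump(preg_match('|one.two|s', $text)); // int(1)

// m (PCRE_MULTILINE): "^" and "$" also match at line boundaries.
var_dump(preg_match('|^two$|', $text));  // int(0)
var_dump(preg_match('|^two$|m', $text)); // int(1)

// Neither modifier makes the pattern tolerate spaces inside a tag.
$spaced = "<style>.a{}</ style>";
$plain    = preg_match('|<style(.+)/style>|ism', $spaced);
$tolerant = preg_match('|<\s*style(.+)/\s*style\s*>|is', $spaced);
var_dump($plain);    // int(0)
var_dump($tolerant); // int(1)
```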

Post by feyd »

scrotaye wrote:I need to develop an environment to test this in.
That would be a good idea...

Post by s.dot »

I just did that quickly. Wrote a page that echoes out the given $string, then echoes $string run through the regex =)

I have #1 and #2 taken care of. Now for the multiple matches thing..

I tried:

Code: Select all

preg_replace("|(^<style(.+)/style>$)|ism","",$string);
Heh.. that didn't exactly work out too well. Could you point me in the right direction? ;)

Post by feyd »

hint: ungreedy

Post by s.dot »

I know I need to change the (.+) to match anything except </style>

I think...

Post by s.dot »

I get the hint: ungreedy, as in the .+ is being greedy and matching too much stuff ;) But perhaps another hint?

Post by feyd »

There are 5 threads in the regex board that use the term ungreedy. One is this thread; the other 4 have explanations of it.. ;)

Post by s.dot »

lmao dammit feyd, you could've just told me to add a ? after the .+

But then again, what fun is that? That particular problem is solved :)

Unless you can think of another instance where

Code: Select all

preg_replace('|<style(.+?)/style>|ism', '', $string);
wouldn't match a given string.
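
For anyone following along, a small sketch of the ungreedy version on a two-block page. The second pattern puts the \s* pieces from the earlier post back in; that combination is an assumption for illustration, not something settled in this thread, and the sample strings are made up.

```php
<?php
$html = "<style>\n.a { margin-top: 3px; }\n</style>\n"
      . "<button>hi</button>\n"
      . "<style>\n.b { margin-top: -3px; }\n</style>";

// Lazy ".+?" stops at the first "/style>" it can reach, so each block
// is removed on its own and the button survives.
$clean = preg_replace('|<style(.+?)/style>|ism', '', $html);
var_dump(trim($clean)); // string(19) "<button>hi</button>"

// Hypothetical variant: lazy quantifier plus whitespace tolerance.
$spaced = "<style>.a{}</ style><i>x</i>< style>.b{}</style >";
$tolerant = preg_replace('|<\s*style(.+?)/\s*style\s*>|is', '', $spaced);
var_dump($tolerant); // string(8) "<i>x</i>"
```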

Post by feyd »

send a bunch of variants at it.. if they all pass, you may be done. :)
pilau
Forum Regular
Posts: 594
Joined: Sat Jul 09, 2005 10:22 am
Location: Israel

Post by pilau »

And make a note, scrotaye: this would be silly to test:

Code: Select all

<style>
<style>
</style>
</style>
:lol:

Post by s.dot »

That wouldn't be bad to test ;) I would expect it to leave just the last </style> tag, in which case strip_tags would get rid of that :)
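
A minimal sketch of that nested test, assuming the ungreedy pattern from the earlier post followed by a plain strip_tags call. The variable names are illustrative.

```php
<?php
$nested = "<style>\n<style>\n</style>\n</style>";

// The lazy match runs from the first "<style" to the first "/style>",
// which leaves the outer closing tag behind.
$afterRegex = preg_replace('|<style(.+?)/style>|ism', '', $nested);
var_dump($afterRegex === "\n</style>"); // bool(true)

// strip_tags then removes the leftover closing tag.
$afterStrip = strip_tags($afterRegex);
var_dump($afterStrip === "\n"); // bool(true)
```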