PHP Developers Network

A community of PHP developers offering assistance, advice, discussion, and friendship.
 

All times are UTC - 5 hours




Posted: Sat May 16, 2015 3:53 am
Forum Newbie

Joined: Sat May 16, 2015 3:49 am
Posts: 5
Hi,

Can you please advise me on the following regular expressions?

1)(green|blue)?+.+
I understand that it takes either green or blue, which is optional, but the +.+
changes its meaning so that it also accepts colors that are not green or blue.
I don't understand how the extra + affects the match.

2)^([\"']?)\\d\\d:\\d\\d\\1,([\"']?)[A-Z]\\w+\\2,.*$"
It accepts
10:23,Added,Queue,7432e01
10:53,"Removed","Queue","7432e01"
10:23,Added,,queue 2,7432e01
I believe backreferences are used here, so they should only reuse the
value of a capturing group. Why are the two values Added,Queue accepted?
Also, in the third line, a field with no value at all is accepted too.
Please guide.

Thanks
Ruchi


Posted: Sat May 16, 2015 6:24 am
Spammer :|

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6587
Location: WA, USA
1) Adding a + after another quantifier makes that quantifier possessive: once it has matched, the engine will not backtrack into it when trying to match the rest of the regex.
Code:
(green|blue)?+

That will optionally match "green" or "blue", but if it does match and the rest of the regex cannot match, the engine will not give the match back, so the attempt at that position fails. For example,
Code:
var_dump(preg_match('/(green|blue)?+green/', 'green')); // int(0)

With a plain ? the regex would match the "green" in the first part, fail to match the "green" in the second part, backtrack so that the first part matches nothing (since it is optional), and then successfully match the second "green". With ?+ the engine is not allowed to backtrack into the first part, so the match fails.
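To see the two behaviours side by side, here is a small sketch that runs the same input against the plain and the possessive version. The variable names are just for illustration.

```php
<?php
// Plain optional group: the engine may backtrack and hand "green" back
// to the literal that follows.
$plain = preg_match('/(green|blue)?green/', 'green', $plainMatch);

// Possessive optional group: once "green" is taken, it is never given back.
$possessive = preg_match('/(green|blue)?+green/', 'green', $possessiveMatch);

var_dump($plain);      // int(1)
var_dump($possessive); // int(0)
```

The only difference between the two patterns is the extra + after the ?.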

.+ works normally: there must be one or more characters. All together the regex is... well, it's unnecessarily complicated. There are three cases of input:
a) The string starts with "green" followed by at least one more character ("greenX"), which will match with $0=greenX and $1=green
b) Same with blueX: $0=blueX, $1=blue
c) If the string starts with neither, it will still be matched because of the .+, with nothing captured in $1
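A quick way to check the three cases is to loop over a few sample inputs. This is only a sketch: the sample words and the $results array are made up for the example.

```php
<?php
$pattern = '/(green|blue)?+.+/';
$results = [];

foreach (['greenish', 'blueberry', 'red'] as $input) {
    preg_match($pattern, $input, $m);
    // Treat a missing or empty $m[1] as "the optional group captured nothing".
    $group = $m[1] ?? '';
    $results[$input] = [$m[0], $group];
    printf("%s: \$0=%s, \$1=%s\n", $input, $m[0], $group === '' ? '(none)' : $group);
}
```

Each input produces a match; only the content of $1 changes between the cases.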

Adding the + is basically a performance optimization, so don't worry about it too much. In fact, adding it is more likely to break a regex than to help it, because backtracking plays a significant role in how regexes are generally used.

2) Your regex only really checks the "10:23,Added," part. The final .* matches the rest of the line, whatever it contains. Try removing it (and the $ with it) to see what is actually being validated.
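Here is a rough sketch of that experiment in PHP. The pattern is the one from the question rewritten as a PHP single-quoted string, and the fourth input line is a hypothetical one added to show where the backreference does make a difference.

```php
<?php
// Pattern from the question, and the same pattern without the trailing .*$
$full   = '/^(["\']?)\d\d:\d\d\1,(["\']?)[A-Z]\w+\2,.*$/';
$prefix = '/^(["\']?)\d\d:\d\d\1,(["\']?)[A-Z]\w+\2,/';

$lines = [
    '10:23,Added,Queue,7432e01',
    '10:53,"Removed","Queue","7432e01"',
    '10:23,Added,,queue 2,7432e01',
    '10:23,"Added,Queue,7432e01', // hypothetical: opening quote is never closed
];

$fullOk  = [];
$checked = [];
foreach ($lines as $line) {
    $fullOk[] = preg_match($full, $line);
    // $m[0] is the part of the line the pattern really validates.
    $checked[] = preg_match($prefix, $line, $m) ? $m[0] : null;
    echo $line, ' => ', end($checked) ?? 'no match', "\n";
}
```

The first three lines are accepted on the strength of their first two fields alone. The last line is rejected because \2 requires the same quote character after "Added" that was captured before it.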


Posted: Sat May 16, 2015 7:59 am
Forum Newbie

Joined: Sat May 16, 2015 3:49 am
Posts: 5
2) I got it... thanks :)

1) For the regex (green|blue)?+.+
with the string value green, the match fails,
and with the string value red, the match passes.
So I didn't quite get your answer, or maybe I did not pose my question properly.

Thank you so much
Ruchi


Posted: Sat May 16, 2015 8:03 am
Spammer :|

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6587
Location: WA, USA
There are two parts:
1. (green|blue)?+
2. .+

"green" matches the first part, but since there isn't anything after it, the second part has nothing left to match. Because of the ?+ the engine will not backtrack to undo the first match (which was optional) so that the second part can match instead. If you change it to just ? then it would backtrack and the string would match. (This assumes the regex has to match the whole string, as in your test. An unanchored search such as PHP's preg_match() would retry from the second character and match "reen".)
"red" does not match the optional first part but does match the second part.

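The green/red behaviour can be reproduced in PHP with a sketch like the one below. The ^ and $ anchors are an assumption on my part, standing in for whatever whole-string check the original test used.

```php
<?php
// Anchors force the regex to cover the whole string.
$possessive = '/^(green|blue)?+.+$/';
$plain      = '/^(green|blue)?.+$/';

$greenPossessive = preg_match($possessive, 'green'); // group keeps "green", .+ has nothing left
$redPossessive   = preg_match($possessive, 'red');   // group is skipped, .+ matches "red"
$greenPlain      = preg_match($plain, 'green');      // group backtracks, .+ matches "green"

var_dump($greenPossessive, $redPossessive, $greenPlain); // int(0) int(1) int(1)

// Without anchors the engine simply retries from the next character.
preg_match('/(green|blue)?+.+/', 'green', $m);
var_dump($m[0]); // string(4) "reen"
```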

Posted: Sat May 16, 2015 9:36 am
Forum Newbie

Joined: Sat May 16, 2015 3:49 am
Posts: 5
Okay, got it... thank you so much :)


Posted: Sun May 17, 2015 7:03 am
Forum Newbie

Joined: Sat May 16, 2015 3:49 am
Posts: 5
Hi,

You said that backtracking plays a significant role in how regexes are generally used.

I am not aware of these guidelines... can you point me to a link or site that is a good reference for understanding that?

Thanks


Posted: Sun May 17, 2015 7:20 am
Spammer :|

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6587
Location: WA, USA
Not really - backtracking is one of those things most people rely upon without realizing it, which is why things like possessive quantifiers (?+, *+, ++) or their nicer relative, once-only subpatterns (?>...), can easily break someone's regex.
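As a small illustration of relying on backtracking without noticing, the sketch below runs an ordinary-looking pattern in its plain, possessive and once-only forms. The test string is made up for the example.

```php
<?php
$subject = '12345';

// Plain greedy: \d+ first takes "12345", then gives back the final "5".
$greedy = preg_match('/^\d+5/', $subject);

// Possessive quantifier: \d++ keeps all five digits, so the literal 5 can never match.
$possessive = preg_match('/^\d++5/', $subject);

// Once-only subpattern: same behaviour as the possessive form, different syntax.
$atomic = preg_match('/^(?>\d+)5/', $subject);

var_dump($greedy, $possessive, $atomic); // int(1) int(0) int(0)
```

The plain version only works because the engine quietly gives one digit back, which is exactly what the other two forms forbid.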

If you want to learn, regular-expressions.info is a good place to start. The documentation for Perl's perlre is another place to get more technical information, and of course there's PHP's own PCRE documentation.


Posted: Sun May 17, 2015 7:48 am
Forum Newbie

Joined: Sat May 16, 2015 3:49 am
Posts: 5
Okay... I will go through these sites... thanks

