problem when there are multi-match strings
Hi, I'm new to regex. I hope someone can help me out.
I have a string which might include multiple substrings I want to search for, like this:
$mystring = '<PLAYER TYPE="smplayer" TITLE="some title" ARTIST="some one" URL="a link here"></PLAYER>
some text here
<PLAYER TYPE="smplayer" TITLE="another title" ARTIST="some one else" URL="soncond link here"></PLAYER>
some text here';
I want to get the values of TYPE, TITLE, ARTIST and URL from each <PLAYER...></PLAYER> element, then replace the element. I can loop, matching and replacing one element at a time, so that I end up with the number of players and the values for each. I used preg_match("/<PLAYER (.*)><\/PLAYER>/", $mystring, $matches), expecting to get the first <PLAYER..></PLAYER> pair. But I didn't. The returned result was the string between the first "<PLAYER..>" and the last closing "</PLAYER>".
I don't know why. What am I doing wrong?
Re: problem when there are multi-match strings
Patterns are usually what is known as greedy, which means they gobble up as much as possible when they try to match. You can turn this greediness off. You may also consider that inside your "(.*)" you do not want to capture just anything; specifically, you don't want anything after the next ">" it encounters.
[EDIT] @prometheuzz ... yes, greedy "quantifiers"; my brain couldn't find the word for some reason.
Last edited by Popcorn on Tue Jun 16, 2009 3:13 am, edited 1 time in total.
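A minimal sketch of the options mentioned in the post above: the greedy original, the lazy variant, and a negated character class that cannot run past the next ">". It assumes both elements sit on one line, because by default "." does not match a newline and the greedy overrun only shows when nothing but ordinary characters separates the elements. The sample string and variable names are made up for illustration, and "~" is used as the delimiter so the "/" in "</PLAYER>" needs no escaping.

```php
<?php
// Hypothetical single-line input: both elements share one line.
$line = '<PLAYER TYPE="a" URL="one"></PLAYER> text <PLAYER TYPE="b" URL="two"></PLAYER>';

// Greedy: ".*" grabs the rest of the line and backtracks, so it stops at the LAST "></PLAYER>".
preg_match('~<PLAYER (.*)></PLAYER>~', $line, $greedy);

// Lazy: ".*?" grows one character at a time, so it stops at the FIRST "></PLAYER>".
preg_match('~<PLAYER (.*?)></PLAYER>~', $line, $lazy);

// Negated class: "[^>]*" cannot cross the ">" that closes the opening tag.
preg_match('~<PLAYER ([^>]*)></PLAYER>~', $line, $negated);

echo $greedy[1], "\n";   // spans both elements
echo $lazy[1], "\n";     // attributes of the first element only
echo $negated[1], "\n";  // attributes of the first element only
```

The negated class is usually the safer of the two fixes here, since it states what the capture may contain instead of relying on where the match happens to stop.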
- prometheuzz
- Forum Regular
- Posts: 779
- Joined: Fri Apr 04, 2008 5:51 am
Re: problem when there are multi-match strings
Something like this?
Code:
$text = '<PLAYER TYPE="smplayer" TITLE="some title" ARTIST="some one" URL="a link here"></PLAYER>
some text here
<PLAYER TYPE="smplayer" TITLE="another title" ARTIST="some one else" URL="soncond link here"></PLAYER>
some text here';
preg_match_all('/<PLAYER(?=[^>]*(ARTIST="[^"]+"))(?=[^>]*(URL="[^"]+"))(?=[^>]*(TYPE="[^"]+"))/i',
$text, $matches, PREG_SET_ORDER);
print_r($matches);
- prometheuzz
Re: problem when there are multi-match strings
Popcorn wrote: patterns are usually what is known as greedy, ...
You most probably meant it correctly, but it's not the patterns that are greedy, it's the quantifiers (+, *, ? and {a,b}, where 'a' and 'b' are numbers and 'a' <= 'b'). Besides that small remark, the rest of your post is sound advice!
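To illustrate the point about quantifiers, here is a small sketch that runs each quantifier in its greedy and lazy form against the same subject. The helper firstMatch() and the subject string are hypothetical and exist only for this example.

```php
<?php
// Hypothetical helper: returns the first match of $pattern in $subject, or null if none.
function firstMatch(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

$subject = 'aaaa';

echo firstMatch('/a+/', $subject), "\n";       // greedy: takes all four
echo firstMatch('/a+?/', $subject), "\n";      // lazy: takes one
echo firstMatch('/a{2,3}/', $subject), "\n";   // greedy: takes three
echo firstMatch('/a{2,3}?/', $subject), "\n";  // lazy: takes two
var_dump(firstMatch('/a*?/', $subject));       // lazy: matches the empty string
var_dump(firstMatch('/a??/', $subject));       // lazy: matches the empty string
```

In every case the pattern is the same apart from the trailing "?", which is what switches a quantifier from greedy to lazy.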
Re: problem when there are multi-match strings
Thank you, Popcorn and prometheuzz, for the replies. They are very helpful.
@Popcorn:
I found out that adding a "?" after "*" turns off the greediness.
Code:
preg_match("/<PLAYER (.*?)><\/PLAYER>/", $mystring, $matches);
@prometheuzz:
Your code is very useful. The printed result is
Array ( [0] => Array ( [0] => ARTIST="some one" [2] => URL="a link here" [3] => TYPE="smplayer" ) [1] => Array ( [0] => ARTIST="some one else" [2] => URL="soncond link here" [3] => TYPE="smplayer" ) )
Any idea why Array[0][1] is missing?
One more question: what if there is an optional item, like this:
Code:
$text = '<PLAYER TYPE="smplayer" TITLE="some title" ARTIST="some one" URL="a link here" OPTION1="o1"></PLAYER>';
I tried this:
Code:
preg_match('/<PLAYER(?=[^>]*(ARTIST="[^"]+"))(?=[^>]*(URL="[^"]+"))(?=[^>]*(TYPE="[^"]+"))(?=[^>]*(OPTION1="[^"]+")?)/i', $text, $matches);
It didn't work.
- prometheuzz
Re: problem when there are multi-match strings
topace wrote: ...
@prometheuzz:
Your code is very useful. The printed result is
Array ( [0] => Array ( [0] => ARTIST="some one" [2] => URL="a link here" [3] => TYPE="smplayer" ) [1] => Array ( [0] => ARTIST="some one else" [2] => URL="soncond link here" [3] => TYPE="smplayer" ) )
Any idea why the Array[0][1] is missing?
No. When I run that code, it produces the following output:
Code:
Array
(
[0] => Array
(
[0] => <PLAYER
[1] => ARTIST="some one"
[2] => URL="a link here"
[3] => TYPE="smplayer"
)
[1] => Array
(
[0] => <PLAYER
[1] => ARTIST="some one else"
[2] => URL="soncond link here"
[3] => TYPE="smplayer"
)
)
topace wrote: one more question, if this is an optional item:
Code:
$text = '<PLAYER TYPE="smplayer" TITLE="some title" ARTIST="some one" URL="a link here" OPTION1="o1"></PLAYER>';
I tried this:
Code:
preg_match('/<PLAYER(?=[^>]*(ARTIST="[^"]+"))(?=[^>]*(URL="[^"]+"))(?=[^>]*(TYPE="[^"]+"))(?=[^>]*(OPTION1="[^"]+")?)/i', $text, $matches);
It didn't work.
No, then my "trick" doesn't work. You will have to do something like this:
Code:
$text = '<PLAYER TYPE="smplayer" TITLE="some title" ARTIST="some one" URL="a link here" OPTION1="o1"></PLAYER>
<PLAYER TYPE="player" TITLE="title" ARTIST="one" URL="a link"></PLAYER>';
preg_match_all('/<PLAYER\s+(TYPE="[^"]+")\s+(TITLE="[^"]+")\s+(ARTIST="[^"]+")\s+(URL="[^"]+")\s*(OPTION1="[^"]+")?/i',
$text, $matches, PREG_SET_ORDER);
print_r($matches);
But it looks like you're parsing (s)html. Have you considered using an html parser?
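For the parser route, here is a rough sketch using PHP's bundled DOM extension. It is only one possible approach, not something posted in this thread: the sample markup is made up, PLAYER is not a standard HTML element (so parser complaints are collected quietly), and element and attribute names are compared case-insensitively instead of assuming how the parser normalises them.

```php
<?php
// Hypothetical input modelled on the thread's sample markup.
$text = '<PLAYER TYPE="smplayer" TITLE="some title" URL="a link here" OPTION1="o1"></PLAYER>'
      . ' some text here '
      . '<PLAYER TYPE="player" TITLE="title" URL="a link"></PLAYER>';

// PLAYER is unknown to the HTML parser, so keep its warnings out of the output.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($text);
libxml_clear_errors();

$players = [];
foreach ($doc->getElementsByTagName('*') as $element) {
    // Skip everything that is not a PLAYER element, whatever its letter case.
    if (strcasecmp($element->nodeName, 'player') !== 0) {
        continue;
    }
    // Collect every attribute under an upper-case key; optional ones are simply absent.
    $attributes = [];
    foreach ($element->attributes as $attribute) {
        $attributes[strtoupper($attribute->nodeName)] = $attribute->nodeValue;
    }
    $players[] = $attributes;
}

print_r($players);
```

With this shape, attribute order and optional attributes such as OPTION1 need no special handling; a missing attribute just means a missing array key.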
Re: problem when there are multi-match strings
What about a conditional?
Code:
$text = '<PLAYER TYPE="smplayer" TITLE="some title" ARTIST="some one" URL="a link here"></PLAYER>
some text here
<PLAYER TYPE="smplayer" TITLE="another title" URL="soncond link here" ARTIST="some one else" OPTION1="o1"></PLAYER>
some text here';
preg_match_all('/<PLAYER(?=[^>]*(ARTIST="[^"]+"))(?=[^>]*(URL="[^"]+"))(?=[^>]*(TYPE="[^"]+"))(?(?=[^>]*OPTION1="[^"]+")(?=[^>]*(OPTION1="[^"]+"))|)/i', $text, $matches, PREG_SET_ORDER);
Code:
Array(
[0] => Array (
[0] => <PLAYER
[1] => ARTIST="some one"
[2] => URL="a link here"
[3] => TYPE="smplayer"
)
[1] => Array (
[0] => <PLAYER
[1] => ARTIST="some one else"
[2] => URL="soncond link here"
[3] => TYPE="smplayer"
[4] => OPTION1="o1"
)
)
- prometheuzz
Re: problem when there are multi-match strings
Popcorn wrote: what about a conditional?
...
Clever!
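A short sketch of why the optional group inside the lookahead captured nothing, next to a variant that does capture. The variant uses an empty alternative inside the lookahead instead of the conditional; as far as I can tell it yields the same captures, but treat it as an illustration with made-up sample data, not as the method used in this thread.

```php
<?php
// Hypothetical input: only the second element carries OPTION1.
$text = '<PLAYER TYPE="a" URL="one"></PLAYER>'
      . '<PLAYER TYPE="b" URL="two" OPTION1="o1"></PLAYER>';

// Optional group: "[^>]*" first runs to the end of the tag, the optional group
// then matches nothing, and the lookahead succeeds without ever backtracking.
$optional = '~<PLAYER(?=[^>]*(TYPE="[^"]+"))(?=[^>]*(OPTION1="[^"]+")?)~i';

// Empty alternative: the first branch has to fail everywhere in the tag
// before the engine falls back to the empty branch.
$alternation = '~<PLAYER(?=[^>]*(TYPE="[^"]+"))(?=[^>]*(OPTION1="[^"]+")|)~i';

preg_match_all($optional, $text, $withOptional, PREG_SET_ORDER);
preg_match_all($alternation, $text, $withAlternation, PREG_SET_ORDER);

var_dump(isset($withOptional[1][2]));     // false: OPTION1 was never captured
var_dump($withAlternation[1][2] ?? null); // the OPTION1 attribute of the second element
var_dump(count($withAlternation[0]));     // first element has no third group
```

Either way, an element without OPTION1 simply yields a shorter match array, which matches the output shown for the conditional version.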