problem with preg_match_all and <> charecters
Posted: Fri Mar 27, 2009 12:59 pm
by rossati
Hello,
I am trying to extract the fragments between < and > with:
Code: Select all
preg_match_all('/<.*?>/',$line,$arrm);
This works in the Solmetra Regular Expression Test: for the text <red><green><blue><magenta> with the pattern /<.*?>/ it correctly reports
Code: Select all
Array
(
[0] => Array
(
[0] => <red>
[1] => <green>
[2] => <blue>
[3] => <magenta>
)
)
and suggests
Code: Select all
preg_match_all('/<.*?>/', '<red><green><blue><magenta>', $arr, PREG_PATTERN_ORDER);
The same code doesn't work in my PHP (5.2.0) or in 4.3.9.
Best regards
Giovanni Rossati
Re: problem with preg_match_all and <> charecters
Posted: Fri Mar 27, 2009 1:44 pm
by Christopher
Perhaps this:
Code: Select all
preg_match_all('/[^\<\>]*/', '<red><green><blue><magenta>', $arr);
$arr = array_filter($arr[0], 'strlen');
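A brief sketch of what the two lines above do, for readers following along. Because the pattern uses '*', it can match the empty string, so preg_match_all() also returns empty matches around the brackets; array_filter() with 'strlen' drops them but keeps the original keys. The variable names $subject and $names are illustrative only. Note that this approach yields the names without the surrounding brackets, unlike the pattern in the first post.

```php
<?php
$subject = '<red><green><blue><magenta>';

// '*' allows zero-length matches, so empty strings appear in $arr[0].
preg_match_all('/[^\<\>]*/', $subject, $arr);

// Drop the empty strings; array_filter() preserves the original keys.
$names = array_filter($arr[0], 'strlen');

// Reindex from 0 if consecutive keys are needed.
$names = array_values($names);

print_r($names);
```

If consecutive keys do not matter, the array_values() step can be left out.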
Re: problem with preg_match_all and <> charecters
Posted: Fri Mar 27, 2009 2:13 pm
by prometheuzz
arborint wrote:Perhaps this:
Code: Select all
preg_match_all('/[^\<\>]*/', '<red><green><blue><magenta>', $arr);
$arr = array_filter($arr[0], 'strlen');
Note that there is no need to escape the '<' and '>': [^\<\>] and [^<>] are equivalent.
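The claim that escaping '<' and '>' inside a character class makes no difference can be checked directly. A minimal sketch, reusing the subject string from this thread; $escaped and $unescaped are illustrative names:

```php
<?php
$subject = '<red><green><blue><magenta>';

// Same character class, once with and once without the backslashes.
preg_match_all('/[^\<\>]*/', $subject, $escaped);
preg_match_all('/[^<>]*/', $subject, $unescaped);

var_dump($escaped === $unescaped); // bool(true)
```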
Re: problem with preg_match_all and <> charecters
Posted: Fri Mar 27, 2009 2:27 pm
by rossati
arborint wrote:Perhaps this:
Code: Select all
preg_match_all('/[^\<\>]*/', '<red><green><blue><magenta>', $arr);
$arr = array_filter($arr[0], 'strlen');
Thanks.
This also seems to work:
Code: Select all
preg_match_all('/[^<>]+/', '<red><green><blue><magenta>', $arr);
But what I can't understand is why this
Code: Select all
preg_match_all('/\[.*?\]/', '[red][green][blue][magenta]', $arr, PREG_PATTERN_ORDER);
works?
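For comparison, a small sketch that runs both lazy patterns side by side. The only structural difference is that '[' and ']' are meta-characters outside a character class and so must be escaped, while '<' and '>' are not; the lazy .*? is the same in both. $angle and $square are illustrative names.

```php
<?php
// Angle brackets: no escaping needed.
preg_match_all('/<.*?>/', '<red><green><blue><magenta>', $angle, PREG_PATTERN_ORDER);

// Square brackets: escaped because '[' opens a character class.
preg_match_all('/\[.*?\]/', '[red][green][blue][magenta]', $square, PREG_PATTERN_ORDER);

print_r($angle[0]);
print_r($square[0]);
```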
Re: problem with preg_match_all and <> charecters
Posted: Fri Mar 27, 2009 3:22 pm
by prometheuzz
The pattern in your original post works as well.
Code: Select all
// PHP 5.2.6 (cli) (built: Nov 11 2008 21:47:45)
// Copyright (c) 1997-2008 The PHP Group
// Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies
if(preg_match_all('/<.*?>/', '<red><green><blue><magenta>', $arr)) {
print_r($arr);
}
/* output:
Array
(
[0] => Array
(
[0] => <red>
[1] => <green>
[2] => <blue>
[3] => <magenta>
)
)
*/
If it doesn't, then there's something seriously messed up with the regex engine of your PHP installation.
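If the pattern still seems to return nothing, a few checks can narrow things down. The sketch below is only a diagnostic suggestion, not something established in this thread: it inspects the return value and preg_last_error() (available from PHP 5.2.0), and escapes the output with htmlspecialchars() on the assumption that the script might be viewed in a browser, where strings such as <red> would be treated as unknown HTML tags and not displayed.

```php
<?php
$line = '<red><green><blue><magenta>';

$count = preg_match_all('/<.*?>/', $line, $arrm);

if ($count === false) {
    // false means the regex engine reported an error.
    echo 'PCRE error code: ', preg_last_error(), "\n";
} else {
    echo "Matches found: $count\n";
    // Escape < and > so the matches stay visible when rendered as HTML.
    echo htmlspecialchars(print_r($arrm, true)), "\n";
}
```

Run from the command line, the escaping is unnecessary, and a plain print_r($arrm) shows the matches directly.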
Re: problem with preg_match_all and <> charecters
Posted: Fri Mar 27, 2009 3:47 pm
by Christopher
prometheuzz wrote:Note that there is no need to escape the '<' and '>':
Just habit, though I think a good one. I got tired of having one character that needed to be escaped and having to go back and edit, so I just escape every non-alphanumeric character that I am not using as a meta-character. There are so many common characters used as meta-characters that I think being explicit is clearer.
Re: problem with preg_match_all and <> charecters
Posted: Fri Mar 27, 2009 4:56 pm
by prometheuzz
arborint wrote:prometheuzz wrote:Note that there is no need to escape the '<' and '>':
Just habit, though I think a good one.
It makes your regex overly verbose, IMO.
arborint wrote:I got tired of having one character that needed to be escaped and having to go back and edit, so I just escape every non-alphanumeric character that I am not using as a meta-character. There are so many common characters used as meta-characters that I think being explicit is clearer.
Perhaps for someone not familiar with regex. But someone who is familiar with them, will most likely disagree with you (as I do).
Re: problem with preg_match_all and <> charecters
Posted: Fri Mar 27, 2009 5:35 pm
by Christopher
prometheuzz wrote:Perhaps for someone not familiar with regex. But someone who is familiar with them, will most likely disagree with you (as I do).
Well I am someone familiar with regex, so all we know is that half of people familiar with regex disagree. Unless you are authorized to speak for all regex users (I hadn't been informed).

Re: problem with preg_match_all and <> charecters
Posted: Fri Mar 27, 2009 5:45 pm
by prometheuzz
arborint wrote:prometheuzz wrote:Perhaps for someone not familiar with regex. But someone who is familiar with them, will most likely disagree with you (as I do).
Well I am someone familiar with regex, so all we know is that half of people familiar with regex disagree. Unless you are authorized to speak for all regex users (I hadn't been informed).

Hence the "most likely" in my response.
Anyway, perhaps I meant to say "more familiar".
; )
Really, I don't mean to say this to put you down or something, but if you truly escape all characters other than alphanumerics, I really think you're overdoing it. Especially inside a character class, where most regex meta-characters don't have any special meaning to begin with.
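To illustrate the point about character classes: inside [...] most meta-characters are taken literally, so they match themselves without a backslash. A minimal sketch with a made-up subject string:

```php
<?php
// Inside a character class, '.', '*', '+' and '?' are literal characters.
preg_match_all('/[.*+?]/', 'a.b*c+d?', $m);

print_r($m[0]);
```

A few characters do keep a special role inside a class (the backslash, ']', a leading '^', and '-' between two characters), so those still need care.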
Re: problem with preg_match_all and <> charecters
Posted: Fri Mar 27, 2009 9:57 pm
by Christopher
prometheuzz wrote:Really, I don't mean to say this to put you down or something, but if you truly escape all characters other than alphanumerics, I really think you're overdoing it. Especially inside a character class, where most regex meta-characters don't have any special meaning to begin with.
Yeah, for regex pros it is overdoing it -- even annoying to some. My comment was meant for the original poster because I have found that escaping symbols that you want to be literals reduces problems. It has been my experience that there are really very few people who truly understand regular expressions. Most of the questions here are from people who found an example somewhere, and when they try to change it, it explodes. So I think the consistency helps.
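For the situation described here, where a string should be matched literally, PHP's preg_quote() can do the escaping automatically instead of by hand. A minimal sketch; the $literal and $subject values are made up for illustration:

```php
<?php
$literal = '[red]';
$subject = '[red][green][blue][magenta]';

// preg_quote() escapes regex meta-characters; the second argument
// additionally escapes the chosen pattern delimiter.
$pattern = '/' . preg_quote($literal, '/') . '/';

echo $pattern, "\n";                      // /\[red\]/
var_dump(preg_match($pattern, $subject)); // int(1)
```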