preg_match_all with "<" symbol

Questions about matching text strings against patterns. Such a pattern is called a "regular expression."

jaymoore_299
Forum Contributor
Posts: 128
Joined: Wed May 11, 2005 6:40 pm

preg_match_all with "<" symbol

Post by jaymoore_299 »

Why doesn't this work?


preg_match_all("|a\shref=|", $html, $matches, PREG_PATTERN_ORDER);
 
preg_match_all("|<a\shref=|", $html, $matches, PREG_PATTERN_ORDER);
 
When I just search for "a href=" I get results, but when I search for "<a href=" I get nothing.
GeertDD
Forum Contributor
Posts: 274
Joined: Sun Oct 22, 2006 1:47 am
Location: Belgium

Re: preg_match_all with "<" symbol

Post by GeertDD »

"<" is not a special regex metacharacter, so it should just match the literal character. What is the contents of $html?
Darkzaelus
Forum Commoner
Posts: 94
Joined: Tue Sep 09, 2008 7:02 am

Re: preg_match_all with "<" symbol

Post by Darkzaelus »

It is a special character.
Replace it with \\<.

Cheers,

Darkzaelus
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: preg_match_all with "<" symbol

Post by prometheuzz »

Darkzaelus wrote: It is a special character.
Replace it with \\<.

Cheers,

Darkzaelus
You are wrong: < is NOT a regex metacharacter.
And even if it were a special character, a single backslash would have been enough to escape it.

Run this snippet:


// Anchored pattern: matches only if the whole string is a single "<".
$text = '<';
if (preg_match('/^<$/', $text)) {
  echo "Oi, it's not special after all!";
} else {
  echo "Well, well, it IS special...";
}
Darkzaelus
Forum Commoner
Posts: 94
Joined: Tue Sep 09, 2008 7:02 am

Re: preg_match_all with "<" symbol

Post by Darkzaelus »

Sorry, was following the jack daniels cheat sheet :P Doesn't seem to do any harm on my system, so I'll leave it as it is.

Cheers,
Darkzaelus.
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: preg_match_all with "<" symbol

Post by prometheuzz »

Darkzaelus wrote: Sorry, was following the jack daniels cheat sheet :P
No problem.
Darkzaelus wrote: Doesn't seem to do any harm on my system, so I'll leave it as it is.
Err, what do you mean by "it doesn't do any harm"? Are you saying the regex '/^\>$/' or '/^\\>$/' successfully matches the string '>'? I really doubt it. And even if it did, would you keep the incorrect syntax only because "it doesn't do any harm on YOUR system"? That is asking for trouble, IMO!
Last edited by prometheuzz on Sat Sep 27, 2008 4:02 pm, edited 1 time in total.
GeertDD
Forum Contributor
Posts: 274
Joined: Sun Oct 22, 2006 1:47 am
Location: Belgium

Re: preg_match_all with "<" symbol

Post by GeertDD »

Why don't you just listen to prometheuzz and me, huh?

< and > alone are NOT any kind of regex metacharacter, so do NOT escape them. That cheatsheet you are talking about is WRONG.

If you do escape them, you run the risk of turning them into metacharacters after all! Some regex flavors (e.g. GNU but not PCRE) interpret \< and \> as word boundaries. You don't want that to happen in your case. So do not escape them.
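
For PHP's preg_* functions specifically, which use PCRE, a quick check like the following can show what escaping actually does there. This is only a sketch with variable names of my own choosing, and it says nothing about other flavors such as GNU, where \< is a word boundary as noted above.

```php
<?php
$text = '<';

// Unescaped: "<" is an ordinary character.
$plain = preg_match('/^<$/', $text);

// One backslash: PCRE treats a backslash before a non-alphanumeric
// character as "take this character literally".
$escaped = preg_match('/^\<$/', $text);

// In a single-quoted PHP string '\\' is a single backslash, so PCRE
// receives the same regex as in the line above.
$doubled = preg_match('/^\\<$/', $text);

var_dump($plain, $escaped, $doubled);
```

All three calls return 1 under PCRE, so the extra backslash is redundant there rather than harmful. The portability argument still stands: the same pattern pasted into a tool that treats \< as a word boundary would behave differently.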
Darkzaelus
Forum Commoner
Posts: 94
Joined: Tue Sep 09, 2008 7:02 am

Re: preg_match_all with "<" symbol

Post by Darkzaelus »

Sorry I haven't replied, school work.

Checked the little script you sent me, it isn't special after all, my fault.

Wonder which languages have it as a metacharacter? (Cheat sheet was from addedbytes.com)

Cheers,


Darkzaelus
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: preg_match_all with "<" symbol

Post by prometheuzz »

Darkzaelus wrote: Sorry I haven't replied, school work.
No problem.
Darkzaelus wrote: Checked the little script you sent me, it isn't special after all, my fault.
Yes, we know. ;)
Darkzaelus wrote: Wonder which languages have it as a metacharacter?
Geert mentioned a case where it might be interpreted as a metacharacter.
Darkzaelus wrote: (Cheat sheet was from addedbytes.com)
...
Time to remove addedbytes.com from your bookmarks! ;)