regex with pipes?

Eric!
DevNet Resident
Posts: 1146
Joined: Sun Jun 14, 2009 3:13 pm

regex with pipes?

Post by Eric! »

I ran across some old code using regex like this for allowing alphanumeric with dot, dash and underscores:

Code:

preg_match('|^[0-9.a-zA-Z_-]*$|', $value)

I was surprised to find it actually seemed to work, but I don't know why. The period isn't escaped, and what's up with the pipes? I've only seen patterns written as /pattern/. I've also found cases where a similar regex from the same coder fails, which makes me suspect they weren't properly tested. For example:

Code:

preg_match('|[a-zA-Z]|', $value)

This seems to pass anything that has at least one letter in it. It is missing the ^ and $ anchors, so I assume it looks for a single matching character anywhere in the string and then returns a result. The programmer was using this to validate alphabetic strings, which obviously isn't correct, so I'm suspicious of all the regex patterns.
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: regex with pipes?

Post by requinix »

The delimiters, the slashes you're used to, don't actually have to be slashes. They can be pretty much any character you want, as long as it isn't alphanumeric, a backslash, or whitespace. Just make sure you have one at the beginning and one at the end, before any flags, and that you escape any uses of it inside the expression. / and # are the most common, but I've seen ! ~ | used too.
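
A rough sketch of that rule, in case it helps. The sample `$value` and the variable names are made up for illustration and are not from the original code. It runs the same expression with four different delimiters, then shows a flag after the closing delimiter and an escaped delimiter inside the expression.

```php
<?php
// Hypothetical sample input, not from the original code.
$value = 'file_name-1.txt';

// The same expression written with four different delimiters.
$patterns = [
    '/^[0-9.a-zA-Z_-]*$/',
    '#^[0-9.a-zA-Z_-]*$#',
    '~^[0-9.a-zA-Z_-]*$~',
    '|^[0-9.a-zA-Z_-]*$|',
];

$results = [];
foreach ($patterns as $pattern) {
    // preg_match() returns 1 on a match and 0 on no match.
    $results[] = preg_match($pattern, $value);
}

// Flags follow the closing delimiter; "i" makes the match case-insensitive.
$withFlag = preg_match('#^[a-z]+$#i', 'ABC');

// A delimiter character used inside the expression has to be escaped.
$escaped = preg_match('#^\#\d+$#', '#42');

var_dump($results, $withFlag, $escaped);
```

All four delimiter choices should give the same result, since the delimiter only marks where the expression starts and ends.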

As for the period, the rules inside character sets change a bit: many metacharacters lose their special meaning. Like . + * ( ) { } $ all become just regular literal characters while ^ and - gain new/different meanings. So while you could escape that period if you wanted to, it's really not necessary because there's nothing to "escape".
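
To see the character-class behaviour for yourself, something like the following should do. The sample strings and variable names are assumptions for illustration only. It compares the pattern from the first post with a copy that escapes the dot, then contrasts a dot outside a class with one inside.

```php
<?php
// Inside [...] the dot is already literal, so escaping it changes nothing.
$unescaped = '|^[0-9.a-zA-Z_-]*$|';
$escaped   = '|^[0-9\.a-zA-Z_-]*$|';

// Hypothetical sample inputs.
$samples = ['a.b', 'a-b_c', 'a b', 'a+b', ''];

$table = [];
foreach ($samples as $s) {
    // Each entry holds [result without escape, result with escape].
    $table[$s] = [preg_match($unescaped, $s), preg_match($escaped, $s)];
}

// Outside a class the dot is a wildcard: this matches any single character.
$wildcard = preg_match('|^.$|', 'x');

// Inside a class the dot only matches a literal dot.
$literalOnly = preg_match('|^[.]$|', 'x');

// A leading ^ negates the class; a - between two characters forms a range.
$negated = preg_match('|^[^0-9]+$|', 'abc');

var_dump($table, $wildcard, $literalOnly, $negated);
```

Note that the empty string passes both patterns because of the `*` quantifier, which may or may not be what the original coder intended.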

Side note: if you want to validate a string containing only letters, ctype_alpha is better.
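
Here is a small comparison of the three approaches, as a sketch only. The sample strings and the `$report` array are made up for illustration, and the anchored pattern is just one reasonable way to fix the unanchored one, not necessarily what the original coder meant.

```php
<?php
// Hypothetical sample inputs.
$samples = ['abc', 'abc123', '12-3', ''];

$report = [];
foreach ($samples as $s) {
    $report[$s] = [
        // Unanchored: true if a single letter appears anywhere in the string.
        'unanchored' => preg_match('|[a-zA-Z]|', $s) === 1,
        // Anchored with +: true only if every character is an ASCII letter.
        'anchored'   => preg_match('/^[a-zA-Z]+$/', $s) === 1,
        // ctype_alpha(): letters only, and false for an empty string.
        'ctype'      => ctype_alpha($s),
    ];
}

var_dump($report);
```

One caveat on the anchored version: `$` also matches just before a trailing newline, so `\z` or the `D` modifier is the stricter choice if that matters for your input.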
ragax
Forum Commoner
Posts: 85
Joined: Thu Dec 15, 2011 1:40 pm
Location: Nelson, NZ

Re: regex with pipes?

Post by ragax »

Couldn't have said it better than requinix!

Other delimiters I like are tildes and commas --- but I may be alone on that one.
I'll add that the forward slash is one of the worst you could choose as a standard delimiter because sooner or later you'll want to match URLs, which will give you this kind of soup:
$pattern = '/http:\/\/www\.you\.com\/pics\//';

In your character class, note that you have all the elements of \w: 0-9, a-z, A-Z, and underscore.
So you could streamline the pattern to '|^[-.\w]*$|'
Err, I meant, ',^[-.\w]*$,'
:wink:
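
For anyone who wants to try this, a quick sketch follows. The URL, the test string and the variable names are hypothetical. It compares a slash-delimited URL pattern with a tilde-delimited one, builds the same pattern with `preg_quote()`, and checks the `\w` shorthand against the spelled-out class.

```php
<?php
// Hypothetical URL, used only for illustration.
$url = 'http://www.you.com/pics/cat.jpg';

// Slash-delimited: every / in the expression needs a backslash.
$slashes = '/^http:\/\/www\.you\.com\/pics\//';

// Tilde-delimited: the slashes can stay as they are.
$tildes = '~^http://www\.you\.com/pics/~';

$same = preg_match($slashes, $url) === preg_match($tildes, $url);

// preg_quote() escapes regex metacharacters; its second argument
// additionally escapes the chosen delimiter.
$built = '~^' . preg_quote('http://www.you.com/pics/', '~') . '~';
$builtMatches = preg_match($built, $url);

// The \w shorthand versus the spelled-out class, with comma delimiters.
$long  = ',^[0-9.a-zA-Z_-]*$,';
$short = ',^[-.\w]*$,';
$agree = preg_match($long, 'a.b-c_9') === preg_match($short, 'a.b-c_9');

var_dump($same, $builtMatches, $agree);
```

Keep in mind that `\w` may match more than the ASCII set depending on the locale or the `u` modifier, so the two classes are not guaranteed to be identical for every input.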
Eric!
DevNet Resident
Posts: 1146
Joined: Sun Jun 14, 2009 3:13 pm

Re: regex with pipes?

Post by Eric! »

Thanks. I couldn't find any specs on the delimiters (at least on the PHP side of the documentation) and I didn't know that the character rules changed. Like I said, this is some old code. I found a bug with the alpha filter, and when I looked deeper I saw these other odd regex patterns. Now I know. :)
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: regex with pipes?

Post by requinix »

ragax wrote:Other delimiters I like are tildes and commas --- but I may be alone on that one.
Really? Commas? Commas? Yeah... :crazy:

On the subject of picking delimiters, besides avoiding characters you're using in the expression (because let's be honest: backslashes look pretty ugly) I'd suggest avoiding metacharacters too. Like pipes. People accustomed to reading regular expressions will see the pipes and do a double-take because their first thought will be "oh, alternation... but wait that doesn't fit...".
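
There is also a practical cost to pipe delimiters, beyond readability. A minimal sketch, with made-up sample strings and variable names: once `|` is the delimiter, a bar inside the expression has to be escaped, and the escaped bar then matches a literal "|" instead of alternating.

```php
<?php
// With a non-pipe delimiter, | inside the expression is plain alternation.
$alt = preg_match('~^(cat|dog)$~', 'dog');

// With pipe delimiters, a bar inside the expression has to be escaped
// (an unescaped one would end the pattern early), and the escaped bar
// matches a literal "|" character.
$literalBar = preg_match('|^(cat\|dog)$|', 'cat|dog');

// So the same pattern no longer accepts either word on its own.
$noAlternation = preg_match('|^(cat\|dog)$|', 'dog');

var_dump($alt, $literalBar, $noAlternation);
```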

And the documentation? It's the Delimiters page in the PCRE section of the PHP manual. But like with other complicated subjects, what you'll find on php.net is best suited just for those quick questions like "what's this symbol mean" and "what's the syntax for a negative lookbehind". Like Wikipedia: a good starting point and it definitely fills a niche, but if you want to truly learn about a subject then consider looking somewhere else for more in-depth explanations and guidance. Our sticky mentions a few things; regular-expressions.info is another good place.
ragax
Forum Commoner
Posts: 85
Joined: Thu Dec 15, 2011 1:40 pm
Location: Nelson, NZ

Re: regex with pipes?

Post by ragax »

Yes, commas are kind of crazy, that's why I like to use them sometimes.
For everyday use, though, tildes: less risky, more elegant.
requinix wrote:consider looking somewhere else for more in-depth explanations and guidance. Our sticky mentions a few things; regular-expressions.info is another good place.
But for more advanced stuff, RexEgg.com is really the best. Hell, I ought to think so, I made it. :wink:
But really, it does go into a number of features that regular-expressions.info doesn't cover. (At least not yet. The main reason it doesn't cover them is that these features are not yet implemented in RegexBuddy, by the same author. But that's coming in RB4.)

And if you're looking for really in-depth information about PCRE (which PHP's regex functions are built on), then the PCRE manual is pretty great.

After that, if you're still hungry for more... Just ask Requinix. :D