Holes in preg_match filters

Discussions of secure PHP coding. Security in software is important, so don't be afraid to ask. And when answering: be anal. Nitpick. No security vulnerability is too small.

Moderator: General Moderators

matthijs
DevNet Master
Posts: 3360
Joined: Thu Oct 06, 2005 3:57 pm

Holes in preg_match filters

Post by matthijs »

Did you guys read Stefan Essers post Holes in preg_match filters?

Don't know about you, but this is new for me and I'll have to do some research to see how this affects my own scripts. Many regexes are affected by this (even examples in Shifletts security book).
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Heh, I usually trim() (or rtrim()) all incoming lines anyways. :)
User avatar
Oren
DevNet Resident
Posts: 1640
Joined: Fri Apr 07, 2006 5:13 am
Location: Israel

Post by Oren »

I don't see why you call it "Holes in preg_match filters" :?
It's not a hole at all, this is how PHP parse such regex. Althought I knew that from the day I learnt regex's, I've never used the D modifier :P
matthijs
DevNet Master
Posts: 3360
Joined: Thu Oct 06, 2005 3:57 pm

Post by matthijs »

It's not my words. Stefan Esser calls it that in his post.

I think what Stefan means is that there are holes in the filters, built using preg_match. Of course it's a human fault when a regex is not done correctly, but if almost every tutorial out there and even an example in Chris Shifletts book explains it wrong (as far as I can see/understand), then many people learn and apply it wrong.

Security is not easy ...
User avatar
Oren
DevNet Resident
Posts: 1640
Joined: Fri Apr 07, 2006 5:13 am
Location: Israel

Post by Oren »

matthijs wrote:Security is not easy ...
Certainly not.
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

The vulnerability described would certainly be tough to exploit. While Esser correctly points out that in things like headers, newlines are important, you would need something that looked like:

Code: Select all

// php 4 example, I know this was fixed in php5
$location = validate($location, 'url');
$extra_variable = validate($extra_variable, 'something');
header("Location: $location $extra_variable");
Maybe it'd be more likely in mail.

Regardless, it does raise some concerns with regards to data consistency. If your application expects input to have only those set characters, newlines could cause problems. I think Feyd's method of trim()'ing data is the best.

There are worse things out there, however, that result from lack of user education. Example:

Code: Select all

$data = false;
if (isset($_GET['data']) && $_GET['data'] == 2) {
  $data = $_GET['data'];
}
Which would be vulnerable to XSS via ?data=2<script>alert('xss');</script> due to PHP's typecasting.

(Yes, I know, you should be casting $data to int).
User avatar
veridicus
Forum Commoner
Posts: 86
Joined: Fri Feb 23, 2007 9:16 am

Post by veridicus »

The issue is with an understanding of security issues overall. With a solid security education it's more obvious what's risky in regex and any other code. I don't think this regex "hole" is a huge problem.
User avatar
shiflett
Forum Contributor
Posts: 124
Joined: Sun Feb 06, 2005 11:22 am

Post by shiflett »

Hi matthijs,

If you're aware of any errors in my book that are not listed in the errata, please let me know. I want to make sure it remains current. (To date, I'm only aware of typographical errors.) Not only do I strive to provide accurate information in the book, I want to aggressively highlight any errors that are discovered. Also, I want to immediately correct any errors in the code repository, if they exist.

My examples tend to use alternatives to regular expressions in order to focus on the practice and not the pattern. This is because there are people who can better explain regular expressions, and there are entire books on the subject.

I must admit that this particular "hole" isn't very interesting to me. The concern is the ability to include a single newline after the subject string. Nothing else. Ambush Commander points out a much more dangerous and realistic problem. Stuff like that scares me, because it's easy to see how someone (myself included) could make the same mistake.
User avatar
dreamscape
Forum Commoner
Posts: 87
Joined: Wed Jun 08, 2005 10:06 am
Contact:

Post by dreamscape »

shiflett wrote:I must admit that this particular "hole" isn't very interesting to me. The concern is the ability to include a single newline after the subject string. Nothing else.
I could not agree more. Sometimes I think Stefan Essers likes to complain about things just for the sake of complaining, especially since his falling out with the PHP dev team.

Even in the exploits he cites where a new line character *can be* dangerous, an extraneous new line at the end of the subject cannot be dangerous unless you are doing something really off the wall, because anything after the newline, which would be required to carry out exploits like that, would cause the PCRE expression to fail.
User avatar
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

Which would be vulnerable to XSS via ?data=2<script>alert('xss');</script> due to PHP's typecasting.
omg why did that never occur to me before?!
tuff like that scares me, because it's easy to see how someone (myself included) could make the same mistake.
$numScared++;
matthijs
DevNet Master
Posts: 3360
Joined: Thu Oct 06, 2005 3:57 pm

Post by matthijs »

Chris: thanks for your response. My wording was probably a bit off. I was just assuming that what Stefan pointed out was indeed a real and maybe even a big vulnerability.

After reading Ambush and your comments and looking at it a bit closer, it seems indeed to be blown out of proportion a bit. I didn't do a thorough test myself, I just wanted to know other peoples opinion about his post.

feyds suggestion of using trim() in situations like this seems like a good one. Or adding the /D modifier of course.

The fact that it's "only" a single new line character which could be injected didn't get to me immediately. I had visions of long strings of javascript being injected :wink:
So if I test it like this:

Code: Select all

$clean = array();
   if (preg_match("/^[0-9a-zA-Z]+$/", $_GET['var'])) {
      $clean['var'] = $_GET['var'];
   }
Then if someone GETs

Code: Select all

// pregmatchtest.php?var=TEST%0a
$clean['var'] is now 'TEST ' (newline at the end)
Then if I would somehow concatenate the clean['var'] with another var, I'd end up with:

Code: Select all

$new_var = $clean['var'] . other_var;
// now $new_var is:
TEST
other_var
Which could in some situations be unwanted. (basically the same example as the one from Ambush.)
User avatar
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

not quite matthijs. Anything appearing after the linefeed will get through.
User avatar
shiflett
Forum Contributor
Posts: 124
Joined: Sun Feb 06, 2005 11:22 am

Post by shiflett »

ole wrote:Anything appearing after the linefeed will get through.
That's not true. Did you try it?
User avatar
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

shiflett wrote:
ole wrote:Anything appearing after the linefeed will get through.
That's not true. Did you try it?
We appear to have lost your avatar Chris :? :oops:
User avatar
Ollie Saunders
DevNet Master
Posts: 3179
Joined: Tue May 24, 2005 6:01 pm
Location: UK

Post by Ollie Saunders »

shiflett wrote:
ole wrote:Anything appearing after the linefeed will get through.
That's not true. Did you try it?
Have now. I stand corrected
Post Reply