Holes in preg_match filters
Moderator: General Moderators
Holes in preg_match filters
Did you guys read Stefan Essers post Holes in preg_match filters?
Don't know about you, but this is new for me and I'll have to do some research to see how this affects my own scripts. Many regexes are affected by this (even examples in Shifletts security book).
Don't know about you, but this is new for me and I'll have to do some research to see how this affects my own scripts. Many regexes are affected by this (even examples in Shifletts security book).
It's not my words. Stefan Esser calls it that in his post.
I think what Stefan means is that there are holes in the filters, built using preg_match. Of course it's a human fault when a regex is not done correctly, but if almost every tutorial out there and even an example in Chris Shifletts book explains it wrong (as far as I can see/understand), then many people learn and apply it wrong.
Security is not easy ...
I think what Stefan means is that there are holes in the filters, built using preg_match. Of course it's a human fault when a regex is not done correctly, but if almost every tutorial out there and even an example in Chris Shifletts book explains it wrong (as far as I can see/understand), then many people learn and apply it wrong.
Security is not easy ...
- Ambush Commander
- DevNet Master
- Posts: 3698
- Joined: Mon Oct 25, 2004 9:29 pm
- Location: New Jersey, US
The vulnerability described would certainly be tough to exploit. While Esser correctly points out that in things like headers, newlines are important, you would need something that looked like:
Maybe it'd be more likely in mail.
Regardless, it does raise some concerns with regards to data consistency. If your application expects input to have only those set characters, newlines could cause problems. I think Feyd's method of trim()'ing data is the best.
There are worse things out there, however, that result from lack of user education. Example:
Which would be vulnerable to XSS via ?data=2<script>alert('xss');</script> due to PHP's typecasting.
(Yes, I know, you should be casting $data to int).
Code: Select all
// php 4 example, I know this was fixed in php5
$location = validate($location, 'url');
$extra_variable = validate($extra_variable, 'something');
header("Location: $location $extra_variable");Regardless, it does raise some concerns with regards to data consistency. If your application expects input to have only those set characters, newlines could cause problems. I think Feyd's method of trim()'ing data is the best.
There are worse things out there, however, that result from lack of user education. Example:
Code: Select all
$data = false;
if (isset($_GET['data']) && $_GET['data'] == 2) {
$data = $_GET['data'];
}(Yes, I know, you should be casting $data to int).
Hi matthijs,
If you're aware of any errors in my book that are not listed in the errata, please let me know. I want to make sure it remains current. (To date, I'm only aware of typographical errors.) Not only do I strive to provide accurate information in the book, I want to aggressively highlight any errors that are discovered. Also, I want to immediately correct any errors in the code repository, if they exist.
My examples tend to use alternatives to regular expressions in order to focus on the practice and not the pattern. This is because there are people who can better explain regular expressions, and there are entire books on the subject.
I must admit that this particular "hole" isn't very interesting to me. The concern is the ability to include a single newline after the subject string. Nothing else. Ambush Commander points out a much more dangerous and realistic problem. Stuff like that scares me, because it's easy to see how someone (myself included) could make the same mistake.
If you're aware of any errors in my book that are not listed in the errata, please let me know. I want to make sure it remains current. (To date, I'm only aware of typographical errors.) Not only do I strive to provide accurate information in the book, I want to aggressively highlight any errors that are discovered. Also, I want to immediately correct any errors in the code repository, if they exist.
My examples tend to use alternatives to regular expressions in order to focus on the practice and not the pattern. This is because there are people who can better explain regular expressions, and there are entire books on the subject.
I must admit that this particular "hole" isn't very interesting to me. The concern is the ability to include a single newline after the subject string. Nothing else. Ambush Commander points out a much more dangerous and realistic problem. Stuff like that scares me, because it's easy to see how someone (myself included) could make the same mistake.
- dreamscape
- Forum Commoner
- Posts: 87
- Joined: Wed Jun 08, 2005 10:06 am
- Contact:
I could not agree more. Sometimes I think Stefan Essers likes to complain about things just for the sake of complaining, especially since his falling out with the PHP dev team.shiflett wrote:I must admit that this particular "hole" isn't very interesting to me. The concern is the ability to include a single newline after the subject string. Nothing else.
Even in the exploits he cites where a new line character *can be* dangerous, an extraneous new line at the end of the subject cannot be dangerous unless you are doing something really off the wall, because anything after the newline, which would be required to carry out exploits like that, would cause the PCRE expression to fail.
- Ollie Saunders
- DevNet Master
- Posts: 3179
- Joined: Tue May 24, 2005 6:01 pm
- Location: UK
Chris: thanks for your response. My wording was probably a bit off. I was just assuming that what Stefan pointed out was indeed a real and maybe even a big vulnerability.
After reading Ambush and your comments and looking at it a bit closer, it seems indeed to be blown out of proportion a bit. I didn't do a thorough test myself, I just wanted to know other peoples opinion about his post.
feyds suggestion of using trim() in situations like this seems like a good one. Or adding the /D modifier of course.
The fact that it's "only" a single new line character which could be injected didn't get to me immediately. I had visions of long strings of javascript being injected
So if I test it like this:
Then if someone GETs
$clean['var'] is now 'TEST ' (newline at the end)
Then if I would somehow concatenate the clean['var'] with another var, I'd end up with:
Which could in some situations be unwanted. (basically the same example as the one from Ambush.)
After reading Ambush and your comments and looking at it a bit closer, it seems indeed to be blown out of proportion a bit. I didn't do a thorough test myself, I just wanted to know other peoples opinion about his post.
feyds suggestion of using trim() in situations like this seems like a good one. Or adding the /D modifier of course.
The fact that it's "only" a single new line character which could be injected didn't get to me immediately. I had visions of long strings of javascript being injected
So if I test it like this:
Code: Select all
$clean = array();
if (preg_match("/^[0-9a-zA-Z]+$/", $_GET['var'])) {
$clean['var'] = $_GET['var'];
}Code: Select all
// pregmatchtest.php?var=TEST%0aThen if I would somehow concatenate the clean['var'] with another var, I'd end up with:
Code: Select all
$new_var = $clean['var'] . other_var;
// now $new_var is:
TEST
other_var- Ollie Saunders
- DevNet Master
- Posts: 3179
- Joined: Tue May 24, 2005 6:01 pm
- Location: UK
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
- Ollie Saunders
- DevNet Master
- Posts: 3179
- Joined: Tue May 24, 2005 6:01 pm
- Location: UK