preg_match returns two different results for same string

Posted: Wed Jul 06, 2011 7:18 am
by mrgrammar
I want to make sure that my form is not vulnerable to injection. In the process of doing so, I found the following.

If I do a preg_match on the data directly submitted to the form, it returns true.

Code:

// The form input comes from a text box that has two carriage returns
$data = $_POST['forminput'];
$checkdata = preg_match("/[\r\n]/",$value);
// output: $checkdata = true
However, if I treat the data first and then do a preg_match, it returns false.

Code:

// The form input comes from a text box that has two carriage returns
$data = $_POST[forminput];
$data = strip_tags(htmlspecialchars(mysql_real_escape_string($data)));
$checkdata = preg_match("/[\r\n]/",$value);
// output: $checkdata = false
In both examples, if I echo the form input, the string on the screen shows the \r\n. Why does the first example return true and the second return false?

Re: preg_match returns two different results for same string

Posted: Wed Jul 06, 2011 9:54 am
by social_experiment

Code:

<?php
$data = strip_tags(htmlspecialchars(mysql_real_escape_string($data)));
?>
I think the results differ because htmlspecialchars(), strip_tags() and mysql_real_escape_string() all change the contents of the data you receive. If you view the source code of the page, what does it look like?

Re: preg_match returns two different results for same string

Posted: Wed Jul 06, 2011 10:47 am
by AbraCadaver
Well, you are checking $value instead of $data, so the result depends on what $value contains.
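
A minimal sketch of the same check run against the variable that actually holds the input. The has_line_break() helper and the sample string are made up for illustration, and the str_replace() call is only a rough stand-in for what mysql_real_escape_string() does to CR and LF, so that the example runs without a database connection. It also hints at why treated data can stop matching: once escaped, the string holds a backslash followed by a letter rather than the control characters themselves.

```php
<?php
// Hypothetical helper: true if the string contains a real CR or LF byte.
function has_line_break(string $s): bool
{
    return preg_match('/[\r\n]/', $s) === 1;
}

// Stand-in for the form input; "\r\n" in double quotes is a real CR+LF pair.
$data = "line one\r\n\r\nline two";

// Rough stand-in for the CR/LF handling of mysql_real_escape_string():
// each control byte becomes a backslash followed by a letter.
$escaped = str_replace(["\r", "\n"], ['\\r', '\\n'], $data);

var_dump(has_line_break($data));    // bool(true)
var_dump(has_line_break($escaped)); // bool(false)
```

Note that preg_match() returns 1 or 0 (or false on error), which is why the helper compares against 1 explicitly.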

Re: preg_match returns two different results for same string

Posted: Wed Jul 06, 2011 5:16 pm
by McInfo
This isn't an answer to the question that was asked (I believe AbraCadaver's answer is correct), but it is an answer to the "Why doesn't strip_tags() work?" question I foresee coming next.

These function calls are out of order:

Code:

$data = strip_tags(htmlspecialchars(mysql_real_escape_string($data)));
mysql_real_escape_string() should be the last modification made to an input string before it goes into the query string.

See the difference:

Code:

<?php
# Note: mysql_real_escape_string() needs an open MySQL connection (ext/mysql, removed in PHP 7).

# strip_tags(htmlspecialchars(mysql_real_escape_string()));
var_dump(
    $str = "<span>\r\n&\r\n</span>",
    # string(18) "<span>
    # &
    # </span>"
    $str = mysql_real_escape_string($str),
    # string(22) "<span>\r\n&\r\n</span>"
    $str = htmlspecialchars($str),
    # string(38) "&lt;span&gt;\r\n&amp;\r\n&lt;/span&gt;"
    $str = strip_tags($str)
    # string(38) "&lt;span&gt;\r\n&amp;\r\n&lt;/span&gt;"
);

# mysql_real_escape_string(htmlspecialchars(strip_tags()))
var_dump(
    $str = "<span>\r\n&\r\n</span>",
    # string(18) "<span>
    # &
    # </span>"
    $str = strip_tags($str),
    # string(5) "
    # &
    # "
    $str = htmlspecialchars($str),
    # string(9) "
    # &amp;
    # "
    $str = mysql_real_escape_string($str)
    # string(13) "\r\n&amp;\r\n"
);
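
The ordering problem between strip_tags() and htmlspecialchars() can also be seen without a database. This is a small sketch with a made-up sample string; the variable names are only for illustration.

```php
<?php
$raw = "<b>Tom & Jerry</b>";

// Encoding first turns the angle brackets into entities,
// so strip_tags() no longer finds any tags to remove.
$encodedFirst = strip_tags(htmlspecialchars($raw, ENT_QUOTES, 'UTF-8'));

// Stripping first removes the tags; only the ampersand is left to encode.
$strippedFirst = htmlspecialchars(strip_tags($raw), ENT_QUOTES, 'UTF-8');

echo $encodedFirst, PHP_EOL;  // &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
echo $strippedFirst, PHP_EOL; // Tom &amp; Jerry
```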
Save htmlspecialchars() for encoding strings after they are read from the database but before they are sent to output.

Code:

// Pseudo-code
$sanitized_input = mysql_real_escape_string(strip_tags($raw_input));
database_insert($sanitized_input);
$decoded_output = database_select();
$encoded_output = htmlspecialchars($decoded_output);
output($encoded_output);
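
For anyone who wants to run the flow above, here is one possible sketch. It is not the code from this thread: it assumes the PDO SQLite driver is available, uses an in-memory database and a made-up comments table in place of a real MySQL server, and swaps mysql_real_escape_string() for a prepared statement with a bound parameter, which handles the escaping step for you.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (body TEXT)'); // hypothetical table

$raw_input = "<b>Tom's</b> note";

// Input side: strip tags, then let the bound parameter take care of
// escaping (this replaces mysql_real_escape_string() in the pseudo-code).
$insert = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$insert->execute([':body' => strip_tags($raw_input)]);

// Output side: read the stored text back and encode it only when printing.
$decoded_output = $pdo->query('SELECT body FROM comments')->fetchColumn();
$encoded_output = htmlspecialchars($decoded_output, ENT_QUOTES, 'UTF-8');

echo $encoded_output, PHP_EOL; // Tom&#039;s note
```

The stored value stays as plain text ("Tom's note"); the entity only appears in what is sent to the browser.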

Re: preg_match returns two different results for same string

Posted: Fri Jul 08, 2011 9:43 am
by Mordred
I support McInfo's answer, with the important addition that htmlspecialchars() needs to be called with ENT_QUOTES and proper encoding.
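
A short sketch of why the flag matters, using a made-up attacker-style value placed inside a single-quoted HTML attribute. ENT_COMPAT is passed explicitly for the comparison because the default flags differ between PHP versions.

```php
<?php
$value = "' onmouseover='alert(1)";

// ENT_COMPAT encodes double quotes only, so the single quote survives.
$compat = htmlspecialchars($value, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES encodes both kinds of quote.
$quotes = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

echo "<input value='$compat'>", PHP_EOL;
// <input value='' onmouseover='alert(1)'>
echo "<input value='$quotes'>", PHP_EOL;
// <input value='&#039; onmouseover=&#039;alert(1)'>
```

In the first line the value closes the attribute early and adds its own event handler; in the second it stays inside the attribute as text.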