Filtering contact form input

Discussions of secure PHP coding. Security in software is important, so don't be afraid to ask. And when answering: be anal. Nitpick. No security vulnerability is too small.


gr8dane
Forum Newbie
Posts: 19
Joined: Wed Aug 22, 2007 3:12 am


Post by gr8dane »

Processing forms was so much easier before I started worrying about security!! I'm seeing a lot of warnings that I should filter input, but not a lot of details about how to do that. Since I don't know what the "bad" input might be, I'm not sure how to filter it out.

For example, what is the best regex pattern to use to validate legitimate names without causing security problems in an email form? When possible, I want to allow perfectly legitimate names such as "O'Reilly", "Mary & Joseph", "Mary Smith-Jones", "Tom Jones, Jr." or 'Tom ("Bud") Jones'. I'm thinking of using /^[a-zA-Z\'\ \&\-\,\.\"\(\)]+$/ to validate the name. Would this cause any security issues when using the name in mail()?

Also, what kinds of input should I be filtering out of the message? Since I don't know what the expected input would be in that field, I would need to know what not to allow. Any good regex for that?
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

I've heard there's something called HTML Purifier.... might be what you're looking for?
gr8dane
Forum Newbie
Posts: 19
Joined: Wed Aug 22, 2007 3:12 am

Post by gr8dane »

I have no idea! It looks to me like it validates HTML, which is not what I'm looking for. Their website doesn't really make it clear (at least not that I could find) what exactly it does or how to use it. Besides, I wasn't really looking for software to do the job for me.
aaronhall
DevNet Resident
Posts: 1040
Joined: Tue Aug 13, 2002 5:10 pm
Location: Back in Phoenix, missing the microbrews

Post by aaronhall »

I wouldn't worry about filtering someone's given name... who are you to say what a valid name is?

If you plan on inserting user input into MySQL, sanitize with mysql_real_escape_string(). If you want to output user input onto one of your pages, and that input is not supposed to contain HTML, use htmlspecialchars(). If you want to output HTML, use HTML Purifier.
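For the "output onto one of your pages" case, a minimal sketch of escaping with htmlspecialchars() could look like this. The helper name escape_for_html() is made up for illustration, and UTF-8 is an assumed page encoding.

```php
<?php
// Hypothetical helper name; not from the thread.
function escape_for_html(string $text): string
{
    // ENT_QUOTES converts both single and double quotes, so the result can
    // also be placed inside a quoted HTML attribute value.
    // 'UTF-8' is an assumption about the page encoding.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// A name containing markup is shown as text instead of being rendered.
echo escape_for_html('Tom ("Bud") O\'Reilly & <b>friends</b>'), "\n";
```

This only applies to HTML output. It does not help when the same value goes into a plain-text email, where the entities would show up literally.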
gr8dane
Forum Newbie
Posts: 19
Joined: Wed Aug 22, 2007 3:12 am

Post by gr8dane »

aaronhall wrote: I wouldn't worry about filtering someone's given name... who are you to say what a valid name is?

I thought this was the Security forum. Check out Kieran Huggins' post at viewtopic.php?t=72721 for a great illustration (literally) of why. We can't assume that the input in a name field is actually a name, and not malicious code. I've at least learned that much, so far.

aaronhall wrote: If you plan on inserting user input into MySQL, sanitize with mysql_real_escape_string(). If you want to output user input onto one of your pages, and that input is not supposed to contain HTML, use htmlspecialchars().

My question was actually about using the input in an email, where HTML entities don't get translated.
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

You can validate or you can filter:

$name = preg_replace('/[^a-zA-Z\'\ \&\-\,\.\"\(\)]/', '', $_POST['name']);
I tend to filter and escape for security and validate for correctness, but you can do it either way.

You always need to use the appropriate escape function to escape the output.
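As a rough sketch, the character whitelist can be wrapped in a small function. The name filter_name() is hypothetical, and the allowed characters are simply the ones from the original question. Note that the ^ must be the first character inside the brackets for the class to be negated.

```php
<?php
// Hypothetical helper; the whitelist is the one proposed in the question.
function filter_name(string $input): string
{
    // The leading ^ negates the class, so every character that is NOT a
    // letter, apostrophe, space, ampersand, hyphen, comma, period, double
    // quote or parenthesis is replaced with an empty string.
    $filtered = preg_replace('/[^a-zA-Z\' &\-,."()]/', '', $input);

    // preg_replace() returns null on error; fall back to an empty string.
    return $filtered ?? '';
}

echo filter_name('Tom ("Bud") Jones, Jr.'), "\n";
```

One caveat with this pattern: it is ASCII-only, so accented letters (as in "José") are stripped too. Carriage returns and line feeds are not in the whitelist, so they are removed along with everything else.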
gr8dane
Forum Newbie
Posts: 19
Joined: Wed Aug 22, 2007 3:12 am

Post by gr8dane »

Sorry, I guess I said "validate" when I meant "filter" in one spot. I'm not sure what the difference is. In any case, that brings me back to my original question: Would using that regex to filter a name cause any security issues when using the name in mail()? In other words, am I being too generous in what I allow? And what is "the appropriate escape function" when input is being emailed?
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

The filter I showed above is a character whitelist. That is one thing to do, and it can reduce the amount of validation you need to follow up with. You still want to validate that the text you are getting is in the expected format. Escaping depends on what the data is and where it is going. You would not want to escape email addresses, for example, because they have to keep all of their valid characters intact, but you would escape the subject or body of the email.
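For the mail() case specifically, the risk usually cited is header injection: a carriage return or line feed in a value that ends up in a header (subject, Reply-To) can start a new header line. The thread does not spell this out, so treat the following as a sketch under that assumption. The function names and the recipient address are hypothetical.

```php
<?php
// Hypothetical helpers; none of these names come from the thread.
function has_line_break(string $value): bool
{
    // CR or LF inside a header value would let input add extra headers.
    return preg_match('/[\r\n]/', $value) === 1;
}

function build_contact_mail(string $name, string $email, string $message): ?array
{
    // Reject, rather than repair, anything destined for a header
    // that contains a line break.
    if (has_line_break($name) || has_line_break($email)) {
        return null;
    }
    // Validate the address format instead of escaping it.
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return null;
    }
    return [
        'to'      => 'webmaster@example.com', // assumed fixed recipient
        'subject' => 'Contact form message from ' . $name,
        'body'    => $message,
        'headers' => 'Reply-To: ' . $email,
    ];
}

// Usage (not executed here):
// $m = build_contact_mail($_POST['name'], $_POST['email'], $_POST['message']);
// if ($m !== null) { mail($m['to'], $m['subject'], $m['body'], $m['headers']); }
```

The recipient is hard-coded on purpose, so form input never decides where the message goes.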
gr8dane
Forum Newbie
Posts: 19
Joined: Wed Aug 22, 2007 3:12 am

Post by gr8dane »

arborint wrote:you would escape the subject or body of the email.
I tried addslashes() on the body, but the slashes showed up in the email. Was I doing something wrong?
shiflett
Forum Contributor
Posts: 124
Joined: Sun Feb 06, 2005 11:22 am

Post by shiflett »

gr8dane wrote: I said "validate" when I meant "filter" in one spot. I'm not sure what the difference is.

Validating is a subset of filtering.

To validate is to determine whether something is valid. For example:

$isValid = ctype_alnum($_POST['username']);
Filtering adds to this by preventing invalid data:

if (ctype_alnum($_POST['username'])) {
    /* Continue */
} else {
    /* Abort */
}
Hope that helps.
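The same distinction could be sketched as two small functions: one that only reports validity, and one that acts on the result. The function names are made up for illustration, and throwing an exception is just one assumed way to "abort".

```php
<?php
// Hypothetical names; a sketch of the validate/filter distinction.
function is_valid_username($value): bool
{
    // Validation: only answers whether the value is acceptable.
    // ctype_alnum() returns false for an empty string.
    return is_string($value) && ctype_alnum($value);
}

function filter_username($value): string
{
    // Filtering: uses the validation result to keep invalid data out.
    if (!is_valid_username($value)) {
        throw new InvalidArgumentException('Invalid username');
    }
    return $value;
}
```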
alpha2zee
Forum Newbie
Posts: 2
Joined: Wed Nov 07, 2007 3:07 pm

Post by alpha2zee »

This might be relevant:

htmLawed, a highly customizable, 45 kb, single file, non-OOP PHP script to filter and purify HTML. Besides restricting tags/elements, attributes and URL protocols as per one's specification, and balancing HTML tags and ensuring valid tag nesting/well-formedness, it also has good anti-XSS and anti-spam measures.