Filtering contact form input
Processing forms was so much easier before I started worrying about security!! I'm seeing a lot of warnings that I should filter input, but not a lot of details about how to do that. Since I don't know what the "bad" input might be, I'm not sure how to filter it out.
For example, what is the best regex pattern to use to validate legitimate names without causing security problems in an email form? When possible, I want to allow perfectly legitimate names such as "O'Reilly", "Mary & Joseph", "Mary Smith-Jones", "Tom Jones, Jr." or 'Tom ("Bud") Jones'. I'm thinking of using /^[a-zA-Z\'\ \&\-\,\.\"\(\)]+$/ to validate the name. Would this cause any security issues when using the name in mail()?
Also, what kinds of input should I be filtering out of the message? Since I don't know what the expected input would be in that field, I would need to know what not to allow. Any good regex for that?
- Kieran Huggins
- DevNet Master
- Posts: 3635
- Joined: Wed Dec 06, 2006 4:14 pm
- Location: Toronto, Canada
I've heard there's something called HTML Purifier.... might be what you're looking for?
- aaronhall
- DevNet Resident
- Posts: 1040
- Joined: Tue Aug 13, 2002 5:10 pm
- Location: Back in Phoenix, missing the microbrews
I wouldn't worry about filtering someone's given name... who are you to say what a valid name is?
If you plan on inserting user input into MySQL, sanitize with mysql_real_escape_string(). If you want to output user input onto one of your pages, and that input is not supposed to contain HTML, use htmlspecialchars(). If you want to output HTML, use HTML Purifier.
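For anyone reading later, here is a minimal sketch of the htmlspecialchars() case described above. The helper name escapeForHtml() and the UTF-8 encoding are assumptions made for illustration; they are not from the thread. Also worth knowing: mysql_real_escape_string() belongs to the old mysql extension, which was removed in PHP 7.0, so on current PHP the usual route for database input is prepared statements via PDO or mysqli.

```php
<?php
// Hypothetical helper: escape user input for output inside an HTML page.
// ENT_QUOTES converts both single and double quotes; 'UTF-8' is an assumed page encoding.
function escapeForHtml(string $input): string
{
    return htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
}

// A "name" field that actually carries markup is rendered harmless.
echo escapeForHtml('Mary & Joseph <script>alert(1)</script>'), "\n";
// prints: Mary &amp; Joseph &lt;script&gt;alert(1)&lt;/script&gt;
```

This only covers output onto a web page. It does nothing useful for a value that goes into mail(), which is what the original question was about.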
aaronhall wrote: I wouldn't worry about filtering someone's given name... who are you to say what a valid name is?
I thought this was the Security forum. Check out Kieran Huggins' post at viewtopic.php?t=72721 for a great illustration (literally) of why. We can't assume that the input in a name field is actually a name, and not malicious code. I've at least learned that much, so far.
aaronhall wrote: If you plan on inserting user input into MySQL, sanitize with mysql_real_escape_string(). If you want to output user input onto one of your pages, and that input is not supposed to contain HTML, use htmlspecialchars().
My question was actually about using the input in an email, where HTML entities don't get translated.
- Christopher
- Site Administrator
- Posts: 13596
- Joined: Wed Aug 25, 2004 7:54 pm
- Location: New York, NY, US
You can validate or you can filter:
Code: Select all
$name = preg_replace('/[^a-zA-Z\'\ \&\-\,\.\"\(\)]/', '', $_POST['name']);
I tend to filter and escape for security and validate for correctness, but you can do it either way.
You always need to use the appropriate escape function to escape the output.
(#10850)
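To make the validate/filter distinction concrete, here is a rough sketch that uses the character class from the first post both ways. The function names isValidName() and filterName() are made up for this example. One detail that matters for mail(): in PCRE a plain `$` also matches just before a trailing newline, so the sketch anchors with `\z` instead (the `D` modifier would do the same job).

```php
<?php
// Hypothetical helper: validation answers yes or no and leaves the input alone.
// \z anchors at the very end of the string, so "Tom\n" is rejected.
function isValidName(string $name): bool
{
    return preg_match('/^[a-zA-Z\'\ \&\-\,\.\"\(\)]+\z/', $name) === 1;
}

// Hypothetical helper: filtering removes every character outside the whitelist.
function filterName(string $name): string
{
    return (string) preg_replace('/[^a-zA-Z\'\ \&\-\,\.\"\(\)]/', '', $name);
}

var_dump(isValidName('Tom ("Bud") Jones'));             // bool(true)
var_dump(isValidName("Tom\r\nBcc: x@example.com"));     // bool(false)
echo filterName("Tom\r\nBcc: x@example.com"), "\n";     // TomBcc xexample.com
```

Since CR, LF, colon and @ are all outside the class, neither approach lets a new header line through. The trade-off is that the class is ASCII only, so a name like "José" fails validation, which is the concern raised earlier in the thread.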
Sorry, I guess I said "validate" when I meant "filter" in one spot. I'm not sure what the difference is. In any case, that brings me back to my original question: Would using that regex to filter a name cause any security issues when using the name in mail()? In other words, am I being too generous in what I allow? And what is "the appropriate escape function" when input is being emailed?
- Christopher
- Site Administrator
- Posts: 13596
- Joined: Wed Aug 25, 2004 7:54 pm
- Location: New York, NY, US
The filter I showed above is a character whitelist. That is one thing to do, and it can reduce the amount of validation you need to follow up with. You still want to validate that the text you are getting is in the expected format. Escaping depends on what the value is and where it is going. You would not want to escape email addresses, for example, because escaping would alter characters that a valid address needs, but you would escape the subject or body of the email.
(#10850)
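As a sketch of how that advice might look for a contact form: the usual risk with mail() is header injection, where a CR or LF in a value starts a new header line such as Bcc. The helper names stripLineBreaks() and buildContactMail(), the array keys, and the recipient address are all assumptions for illustration. The address is validated with filter_var() rather than escaped, and the name is kept out of the headers entirely.

```php
<?php
// Hypothetical helper: remove CR and LF so a value cannot start a new header line.
function stripLineBreaks(string $value): string
{
    return str_replace(["\r", "\n"], '', $value);
}

// Hypothetical helper: returns the pieces for mail(), or null if the address is rejected.
function buildContactMail(string $name, string $email, string $message): ?array
{
    // Validate the address instead of escaping it.
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return null;
    }
    return [
        'subject' => 'Contact form: ' . stripLineBreaks($name),
        // Plain-text body: the user-supplied values appear here, not in the headers.
        'body'    => "From: $name <$email>\n\n$message",
        'headers' => 'Reply-To: ' . $email,
    ];
}

// Usage sketch; owner@example.com is a placeholder recipient.
// $mail = buildContactMail($_POST['name'], $_POST['email'], $_POST['message']);
// if ($mail !== null) {
//     mail('owner@example.com', $mail['subject'], $mail['body'], $mail['headers']);
// }
```

Keeping user input in the body and out of the headers means the name whitelist becomes a matter of correctness rather than the only line of defence.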
Quote: I said "validate" when I meant "filter" in one spot. I'm not sure what the difference is.
Validating is a subset of filtering.
To validate is to determine whether something is valid. For example:
Code: Select all
$isValid = ctype_alnum($_POST['username']);
Code: Select all
if (ctype_alnum($_POST['username'])) {
    /* Continue */
} else {
    /* Abort */
}
Might be relevant
htmLawed, a highly customizable, 45 kb, single file, non-OOP PHP script to filter and purify HTML. Besides restricting tags/elements, attributes and URL protocols as per one's specification, and balancing HTML tags and ensuring valid tag nesting/well-formedness, it also has good anti-XSS and anti-spam measures.