Input filtering -- really nessecary
- Bruno De Barros
- Forum Commoner
- Posts: 82
- Joined: Mon May 12, 2008 8:41 am
- Location: Ireland
Re: Input filtering -- really nessecary
There is, from my point of view, a solution to your "removing <script> tags" problem. Have two fields in your database: ORIGINAL_INPUT and FILTERED_INPUT. The original keeps the script tags, in case you ever want to check them. The filtered one keeps the input already filtered and formatted for showing in templates.
Just a suggestion. In my head, if 100 people go to the same page, and the same calculations and processes have to be carried out, ALWAYS, to give the SAME output, it's better to cache them, somehow. It's like using GD to make an image from a saved user input. It's better to generate once and then cache, and refresh when the user changes his input, than to generate all the time.
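A rough sketch of the two-field idea, filtering once at write time. The array stands in for a database row, and `save_post` is a hypothetical helper rather than anything from this thread; `htmlspecialchars` is only a placeholder for whatever filtering the template really needs.

```php
<?php
// Hypothetical in-memory "row" standing in for a database record with the
// two columns suggested above. All names here are illustrative.
function save_post(array &$row, string $input): void
{
    // Keep the untouched input so it can always be inspected or re-filtered.
    $row['original_input'] = $input;
    // Filter once, when the user changes the input, not on every page view.
    $row['filtered_input'] = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
}

$row = [];
save_post($row, '<script>alert(1)</script>');

// Templates read the cached, already-filtered column.
echo $row['filtered_input'], "\n";
```

Calling `save_post` again whenever the user edits the input refreshes the cached column, which is the "generate once, then refresh on change" part of the suggestion.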
Re: Input filtering -- really nessecary
pytrin wrote: Using htmlspecialchars on data just before it enters the database, and using it on unfiltered data after it is retrieved from the database, will have the exact same output. However, in the former, htmlspecialchars is executed only once instead of per page view.
This might not be a big deal with htmlspecialchars (which I personally don't use, it was brought up by Hockey's example), but with more complete filtering packages there is a substantial difference in performance.
Not wise. This makes the hidden assumption that the data flows only this way: input->database->output. This is not always the case, so doing this on a general level will lead to hard-to-follow bugs. There are single cases when this method is desirable, and they should be an exception, clearly documented as such and with extra care on the dataflow around this functionality.
Hockey wrote: I call a filtering function inside my templates.
Code:
<input type="text" value="<?php echo htmlspecialchars($subject); ?>" />
Not very secure, use all three parameters.
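For illustration, a call with all three parameters might look like the sketch below. It assumes the parameters meant are the string, the quote flags, and the character set; the sample value of `$subject` is made up.

```php
<?php
// Assumption: $subject holds untrusted text destined for an HTML attribute.
$subject = 'He said "hi" & it\'s <b>bold</b>';

// Second parameter: which quotes to convert (ENT_QUOTES = both kinds).
// Third parameter: the encoding the input is expected to be in.
$safe = htmlspecialchars($subject, ENT_QUOTES, 'UTF-8');

echo '<input type="text" value="' . $safe . '" />', "\n";
```

With `ENT_QUOTES` both double and single quotes are converted, so the value cannot break out of the attribute whichever quote style the template uses.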
Re: Input filtering -- really nessecary
Mordred wrote: Not wise. This makes the hidden assumption that the data flows only this way: input->database->output. This is not always the case, so making this on a general level will lead to hard to follow bugs.
This is not a hidden assumption, it's a very straightforward assumption... As I said earlier in this thread, unless you consider that your database might be compromised (which then means you have a much bigger problem on your hands), this is in my opinion the right assumption to make.
Not sure what you meant about hard-to-follow bugs. How is that relevant to escaping?
- Christopher
- Site Administrator
- Posts: 13596
- Joined: Wed Aug 25, 2004 7:54 pm
- Location: New York, NY, US
Re: Input filtering -- really nessecary
The more I think about this -- the more I think that filtering is probably a good thing for programmers to always do. Especially if you are not positive about the security issues. There are really very few input fields that need a large character set entered. For a standard address form all the fields have limited character sets (names/streets [a-zA-Z0-9\ \-\'\,\#], email [a-zA-Z0-9\-\_\.\@], etc.). If you always filter in addition to escaping, I think you are providing yourself an extra layer of defense in case you forget something elsewhere. As Mordred points out, the problem can be long after, and in totally separate code from the form.
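As a minimal sketch of that kind of whitelist filtering, using the character sets listed above. The variable and function names are made up for the example, and note that this only restricts the characters allowed; it does not check that an email address is well formed.

```php
<?php
// Patterns built from the character sets mentioned in the post.
// \A and \z anchor the match to the whole string (no trailing newline).
$namePattern  = '/\A[a-zA-Z0-9 \-\',#]+\z/';
$emailPattern = '/\A[a-zA-Z0-9\-_.@]+\z/';

// Hypothetical helper: true only if every character is on the whitelist.
function passes_filter(string $value, string $pattern): bool
{
    return preg_match($pattern, $value) === 1;
}

var_dump(passes_filter("12 O'Brien St, #4", $namePattern));     // allowed characters only
var_dump(passes_filter('<script>alert(1)</script>', $namePattern)); // contains < > ( ) /
var_dump(passes_filter('user.name@example.com', $emailPattern));
```

Values that fail the filter can be rejected outright, and the values that pass are still escaped at output, so the filter acts as the extra layer and not the only one.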
Re: Input filtering -- really nessecary
Mordred wrote: Not wise. This makes the hidden assumption that the data flows only this way: input->database->output. This is not always the case, so making this on a general level will lead to hard to follow bugs.
pytrin wrote: This is not a hidden assumption, it's a very straightforward assumption...
A page is generated from three data sources: an RSS feed (XML), some database fields, and user input (with magic quotes). The IP of the user is checked in a third-party ip2location database. All this data is filled into a form, which on submission is placed in the database and then displayed back, this time not in a form (think: "your submission blablabla was successful"). Pray, do solve this with your "straightforward assumption". Things are not always as simple as the laboratory conditions you imagine.
It is a hidden assumption. It is not stated by anything in the code. It is a wrong assumption. There are many other data sources and outputs which require different escapings.
pytrin wrote: Not sure what you meant about hard to follow bugs, how is that relevant to escaping?
Failed escaping causes dataflow bugs (double escaping, no escaping, wrong kind of escaping). Mixed datasources are a classical test case for that. Escaping is not a security measure.
For every situation, there is one single correct way to escape data, and moreover it must be done immediately before the data is used in the specific way that dictates this particular escaping (e.g. echo, mysql_query, etc.). You can theoretically achieve correct results with other schemes, but only with conscious effort and many, many possible points of failure when there are mixed datasources (even the most trivial case of data coming from the database can be a "mixed datasource"). This is why I call this "unwise" and "causing hard to follow bugs".
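A small sketch of what escaping at the point of use could look like. The sample value and the `/search` URL are invented for the example; the point is only that the same raw value gets a different escaping in each context, applied right where it is used.

```php
<?php
// The value is kept raw while it moves through the application.
$name = 'Tom & "Jerry" <tj@example.com>';

// HTML context: escape right at the point of output.
$html = '<p>' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '</p>';

// URL query context: the same raw value needs a different escaping.
$url = '/search?q=' . rawurlencode($name);

echo $html, "\n";
echo $url, "\n";
```

Because `$name` itself is never modified, adding another kind of output later only means adding another escaping step at that output.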
Re: Input filtering -- really nessecary
Not sure where you are going with this... I think this discussion was about securing your own applications, not somebody else's. In your application you should be well aware of the exact flow of input -> database -> output, and your decisions are not based on 'assumptions' but on facts.
If you are planning a security system for somebody else's code, what you said might be relevant. Otherwise I don't see it.
- Christopher
Re: Input filtering -- really nessecary
I think Mordred is pointing out something important. A security measure at a specific point is not necessarily an application security measure. That is why it is dangerous to think of it in general. Escaping data when saving to a database may help with SQL injection, but it does nothing for XSS attacks if that data is later displayed as HTML. So how you think about it is important. Security is in part a mind-set. And the solutions must be in depth and systemic.
Re: Input filtering -- really nessecary
I'd have to say I agree with Mordred on this...
Re: Input filtering -- really nessecary
arborint wrote: I think Mordred is pointing out something important. A security measure at a specific point is not necessarily an application security measure. That is why it is dangerous to think of it in general. Escaping data when saving to a database may help with SQL injection, but it does nothing for XSS attacks if that data is later displayed as HTML. So how you think about it is important. Security is in part a mind-set. And the solutions must be in depth and systemic.
If you know the source of the data and trust it (your database), I don't see how escaping it before insertion or after retrieval makes any difference (unless of course your database is compromised, as I've said before). If you need to display data as HTML, a stronger escaping mechanism will need to be used, such as HTMLPurifier. Again, it is still better in my opinion to use it on the data as it goes into the database (once) and not on the way out (many times).
- Christopher
Re: Input filtering -- really nessecary
I think the point is to get in the habit of never trusting any source. That is the essence of Defense in Depth. You may trust the source now, and that trust may be well founded. But in a year things may have changed without you being aware of the problem. Defense in Depth just acknowledges that we are not perfect.
Re: Input filtering -- really nessecary
There are other considerations - such as performance. If you are using a heavy filtering solution such as HTMLPurifier in order to display user-inputted HTML, it becomes a bottleneck if you need to apply it for every page view.
The site itself recommends using it on the way in and not on the way out.
Regarding changes in the datasource - if you are changing the datasource to an untrusted source, that might be a good time to add filtering on the output as well. Seems like an extreme edge case to me (you are moving your data from internal to external? how often does that happen?)
Re: Input filtering -- really nessecary
pytrin, I gave you a challenge for your argument, don't stray from it, and you'll hopefully see for yourself that your input->db->output model is oversimplified.
This is not about trust, it is about data consistency. If in your "trusted" source, the database, half the data is pre-escaped for output and half is not, you severely increase the chance of error - either double escaping or no escaping at all. You will also make it much harder if you want a different output (e.g. a CSV export) for your escaped-for-HTML data. A good programming methodology will try to avoid such potential pitfalls. Correctly escaping when the time is right is one such thing.
Don't try this "performance" argument - I already agreed that sometimes you will need a couple of items to be pre-escaped - I've done it for message board posts for example. This is only a specific case, and it is an exception, a compromise, in no way representative of the problem being discussed. I will repeat:
Mordred wrote:For every situation, there is one single correct way to escape data, and moreover it must be done immediately before the data is used in the specific way that dictates this particular escaping.
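To make the double-escaping and CSV-export pitfalls concrete, here is a minimal sketch. The sample string is made up, and `htmlspecialchars` stands in for whatever pre-escaping is applied before storage.

```php
<?php
$raw = 'Fish & Chips';

// Stored pre-escaped for HTML (the scheme being criticised).
$stored = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// A template that escapes on output, unaware of the pre-escaping,
// escapes the ampersand of the entity a second time.
$doubleEscaped = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// A CSV export has to undo the HTML escaping first to get the real text back.
$forCsv = htmlspecialchars_decode($stored, ENT_QUOTES);

echo $stored, "\n";        // Fish &amp; Chips
echo $doubleEscaped, "\n"; // Fish &amp;amp; Chips
echo $forCsv, "\n";        // Fish & Chips
```

Every consumer of the stored value has to know whether a given field was pre-escaped, which is exactly the bookkeeping that escaping at the point of use avoids.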
Re: Input filtering -- really nessecary
Basically you are saying there is no one solution that fits all the use cases, with which I would agree. I didn't stray from your sentiments regarding data integrity; I simply found them to be over-critical for securing your own applications. You obviously approach it from a different place than I do - probably due to your background in web security.