Making data secure for display

Discussions of secure PHP coding. Security in software is important, so don't be afraid to ask. And when answering: be anal. Nitpick. No security vulnerability is too small.

Moderator: General Moderators

User avatar
batfastad
Forum Contributor
Posts: 433
Joined: Tue Mar 30, 2004 4:24 am
Location: London, UK

Making data secure for display

Post by batfastad »

Hi everyone

I'm gradually building a very small intranet application with PHP and MySQL.
There will only be 10 users initially but I'm hoping to eventually replace our company FileMaker database for managing company and contact information.

I've made sure that every variable from external sources is running through the following function:

Code: Select all

function mysql_clean($value) {
	// STRIPSLASHES
	if ( get_magic_quotes_gpc()) {
		$value = stripslashes($value);
	}

	return trim(mysql_real_escape_string($value));
}
So that's SESSION, GET, POST, COOKIE and any PHP internal vars.

That makes sure all data that users could enter is secured, and then PHP can deal with it safely.
But I'm worried about the potential of XSS or malicious HTML code being displayed on the script pages.

Is it enough to run the htmlspecialchars() function on any variables that will be output on the HTML page?

Or should I also run htmlspecialchars() on variables even if they're not going to be in the HTML output?

Thanks
Ben
User avatar
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

There's generally no reason to run htmlspecialchars() on internal data, but it should be run on any data being displayed to the user which was not generated internally or should not be trusted, explicitly.
User avatar
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Post by Mordred »

The function is wrong, here's the long story why: http://mordred.niama.net/blog/?p=122

Variables should be escaped right before being passed on (to mysql, user output, whatever), not when they come in. There should no differentiation between "internal" and "external" variables (unless it's somehow a feature, like allowing html in the title, specified in a config file)
jmut
Forum Regular
Posts: 945
Joined: Tue Jul 05, 2005 3:54 am
Location: Sofia, Bulgaria
Contact:

Post by jmut »

User avatar
batfastad
Forum Contributor
Posts: 433
Joined: Tue Mar 30, 2004 4:24 am
Location: London, UK

Post by batfastad »

I do run all variables through my mysql_clean() function I posted above, at the very start of scripts.
Is it my function that is wrong? What needs to change?


What I'm getting at with my original question is the pulling my data out of the database and outputting it on an HTML page.
I'm developing a small intranet company/contact information app and there are many companies / address fields etc that have & and quote marks " ' in for legitimate reasons

Now this can cause problems with the HTML code, especially quote marks " as they just break HTML tags.

So should I also be running htmlspecialchars() on any variables that I will be displaying on the HTML page?
To ensure the resulting HTML code remains valid and doesn't break say the <.input tag because a variable has a legitimate quote mark in it?
User avatar
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

Yes. Always escape your output using htmlspecialchars() or htmlentities(). You don't need to escape anything that you're not outputting to the page.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
User avatar
batfastad
Forum Contributor
Posts: 433
Joined: Tue Mar 30, 2004 4:24 am
Location: London, UK

Post by batfastad »

Ok great.
So if I just run htmlspecialchars() or htmlentities() on any variables I will be displaying in the HTML page, I should be all set!

One thing I have noticed, is if there are variables that I plan on displaying within a <.textarea> element, I should use htmlentities().
As entities get rendered in <.textarea> elements, but not all of them get rendered in <.input> elements.

Is that correct?

That can cause problems if making a blog/CMS where you type &#xxx; to get an entity, but when you come to edit the record and save over it, the entity gets rendered in the <.textarea> and then gets overwritten in the DB with it's actual character.
User avatar
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

No, they act the same. However htmlentities() will entity-ize (is that a word?) all characters that have an entity, whereas htmlspecialchars() will only do < > ' " & (i think).

If you use htmlentities() to display data in an input or textarea field, upon submission of that form it will convert the entities such as & back to their character &, which makes it good for entry into your database. Then when you escape it to display again, it will be back in it's entitized form.
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
User avatar
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Post by Mordred »

batfastad wrote: Is it my function that is wrong? What needs to change?
Have you read the article I posted?

batfastad wrote: So should I also be running htmlspecialchars() on any variables that I will be displaying on the HTML page?
To ensure the resulting HTML code remains valid and doesn't break say the <.input tag because a variable has a legitimate quote mark in it?
htmlspecialchars() with ENT_QUOTES and the proper character set. Details:
viewtopic.php?t=75098
User avatar
batfastad
Forum Contributor
Posts: 433
Joined: Tue Mar 30, 2004 4:24 am
Location: London, UK

Post by batfastad »

Yeah that sounds right.

htmlspecialchars() for [s]most[/s] all vars
[s]Then htmlentities() when I'm outputting into a <.textarea> to avoid the entities being rendered to the chars then overwriting the entity code in the DB... pretty hard to explain what I'm getting at there, but I think you get what I mean[/s] :D

Thanks
Ben

EDIT: strikethrough the above bit which is wrong!
Last edited by batfastad on Tue Dec 04, 2007 9:38 am, edited 1 time in total.
User avatar
batfastad
Forum Contributor
Posts: 433
Joined: Tue Mar 30, 2004 4:24 am
Location: London, UK

Post by batfastad »

Mordred wrote:Have you read the article I posted?
Yes, but I don't know how to deal with magic_quotes globally as you said in the article.
Maybe I'm just not seeing the problem fully though :(
My function is modified from the one in the article from the manual... it doesn't do the !is_numeric check, and always runs the variable through mysql_real_escape_string()

I use my mysql_clean() function when I'm pulling any data from POST, GET, REQUEST, SERVER, COOKIE, SESSION
I don't use it on data that I've just extracted from my database, because I'm assuming that to be safe.
Mordred wrote:htmlspecialchars() with ENT_QUOTES and the proper character set. Details:
viewtopic.php?t=75098
Ok perfect. So that's for outputting vars into HTML code - so that quotes and stuff from data in DB doesn't go and break all your HTML tags.
Thanks for the link to that thread - very interesting and a much-needed blast of info :D

So the reason you don't need htmlentities() is because htmlspecialchars() also converts and ampersand, which stops the browser rendering the char within <.textarea> tags
So in the HTML source code the entity for a Euro € actually reads &#8364;... which basically double-escapes the ampersand and stops the entity from being rendered as a char by the browser.
Excellent!!

I think my confusion was I didn't realise that all that needed to be done was to change the ampersand. I thought there was something more that htmlentities did that wasn't covered by htmlspecialchars.
But since all you need is to sort that ampersand... htmlspecialchars will work perfectly!

Thanks
Ben
User avatar
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Post by Mordred »

batfastad wrote: Yes, but I don't know how to deal with magic_quotes globally as you said in the article.
Basically, run stripslashes on all input arrays. Search this forum for magic_quotes and stripslashes, I hope you'll find enough examples.
batfastad wrote: I use my mysql_clean() function when I'm pulling any data from POST, GET, REQUEST, SERVER, COOKIE, SESSION
I don't use it on data that I've just extracted from my database, because I'm assuming that to be safe.
It's wrong.

Clean ALL variables BEFORE sending them to mysql.
It is wrong to do so for variables that will not go to mysql query.
It is wrong not to do so for variables will go to the query, even if they were previously pulled from the database.
User avatar
batfastad
Forum Contributor
Posts: 433
Joined: Tue Mar 30, 2004 4:24 am
Location: London, UK

Post by batfastad »

Ah ok :D

@Mordred - Thanks so much for your help so far :wink:... I really want to make sure I'm getting this right in my code :D

I do clean all variables before sending to MySQL... using my mysql_clean() function above.
But in some scripts I think I also run mysql_clean() on variables that aren't going to MySQL. I will change that!

Certainly any vars involved in a MySQL query, are escaped.


Going back to handling stripslashes and magic_quotes globally.
I have a config.php file which is included at the top of every PHP script in the application.
It has the mysql config details and any extra functions I've written that I can re-use frequently.

Do you mean it would be best to separate this into 2 different functions?

1) One function which I run over variables before they go to MySQL... eg:

Code: Select all

// MYSQL CLEAN
function mysql_clean($value) {
    return trim(mysql_real_escape_string($value));
}
2) Then something like this function which runs on every script page from my included config.php file:

Code: Select all

// INPUT MAKE-SAFE
function input_clean(&$value) {
    $value = stripslashes($value);
}

if ( ini_get('magic_quotes_gpc')) {
    array_walk_recursive($_GET, 'input_clean');
    array_walk_recursive($_POST, 'input_clean');
    array_walk_recursive($_COOKIE, 'input_clean');
}
Function from this page... http://www.phpguru.org/article.php/24

So the 2 issues are dealt with separately. THE G/P/C super-global input arrays get passed through that input_clean() function, automatically in my global config.php file on every single page.
Then additionally before I go into MySQL queries, I run any vars through mysql_clean()

Is that a better solution?
Are those functions sufficient?

Thanks
Ben
User avatar
batfastad
Forum Contributor
Posts: 433
Joined: Tue Mar 30, 2004 4:24 am
Location: London, UK

Post by batfastad »

Hi guys

Any opinions on my revised functions for making input data safe?
Separating it into 2 functions - one which runs stripslashes on all the input, automatically on the input arrays on each page load. And another which I use just before passing any data to MySQL.

Thanks
Ben
User avatar
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Post by Mordred »

More or less correct, bar a few details.
I'm worried about the trim() in the mysql_clean() function, it shouldn't be there. This is a minor issue really, but may be a sign of a more serious problem. Could it be that you don't put quotes around value tokens in your MySQL queries?
Post Reply