Regular Expression Help

Posted: Wed Jun 25, 2003 4:01 pm
by m3rajk
[Admin Edit: Moved from viewtopic.php?p=46776]

this seems like a good place to post this....

Code: Select all

<?php

$input=stripslashes(rawurldecode($_GET['input']));

# $test=eregi_replace('<([[]+]?script[[]+[]*]?)>', '&lt;\1&gt;', $input);

$test=eregi_replace('<([[]+]?/?script[[]+[]*]?)>', '&lt;\1&gt;', $input);


?>last input: <?php echo $input; ?>
<br> after eregi: <?php echo $test; ?><br />
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
<p><br><input type="text" size="50" name="input"><br />
<br><input type="submit"><br /></p>
</form>

i'm trying to get it to replace the < and > of every script tag with html entities, so the tags show up as text, in this input:
<script>test one</script>this is a test< script >test 2 </script> this is a test<script language="javascript">test 3</script>

i expect most ppl using my site will use ie, so i'm testing it there...
in mozilla it works on everything but the one where there's a language attribute, and then mozilla adds a close tag itself on the redisplay.

in ie it's worse.. the first eregi only does the first instance.

the second one does something quite different.

the second one in mozilla:
last input: this is a test< script >test 2 this is a test
after eregi: <script>test one</script>this is a test< script >test 2 </script> this is a testtest 3</script>
in mozilla you get to enter new input... in ie....
last input: this is a test< script >test 2 this is a test
after eregi: <script>test one</script>this is a test< script >test 2 </script> this is a test
and you don't get to enter anything new.. which means either the result is the same and mozilla's smart enough to close the tag, or it's something else.. but it's failing at the same place. if i could get some help.. namely an explanation why, i would appreciate it

Posted: Wed Jun 25, 2003 4:52 pm
by Flood
Hi!


First of all, a tag written as < script> will never work for cross-site scripting, since no whitespace is allowed between the opening < and the tag name... But if you want to catch it anyway, it is of course only a matter of adding [:space:] to the pattern...

Then what I suggest for your regular expression is
$test=eregi_replace('<(/?script[^>]*)>', '&lt;\1&gt;', $str);

It works at least for the expression you took as an example.
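
For anyone trying this on PHP 7 or later, where the ereg functions no longer exist, here is a minimal sketch of the same idea with preg_replace. The function name escape_script_tags is made up for illustration, and the &lt; / &gt; replacement assumes the goal is to display the tags as text rather than delete them.

```php
<?php
// Hypothetical helper (not from the thread): neutralise script tags by
// turning their angle brackets into HTML entities.
function escape_script_tags(string $html): string
{
    // The i flag makes the match case-insensitive, like eregi_replace did.
    // [^>]* swallows attributes and any whitespace before the closing bracket.
    return (string) preg_replace('/<(\/?script[^>]*)>/i', '&lt;$1&gt;', $html);
}

$sample = '<script>test one</script>this is a test<script language="javascript">test 3</script>';
echo escape_script_tags($sample), "\n";
```

Note that `< script >` (with a space right after the `<`) is left untouched by this pattern, which is consistent with the point above that browsers do not treat it as a tag.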

Hope it helps.

/Flood

Posted: Wed Jun 25, 2003 4:56 pm
by m3rajk
actually what i'm trying to do is allow html to be input into a field and disable the use of any scripting language, since i don't know the user and can't be sure they're not going to be malevolent. once i get it to remove <script></script> and <script language=language></script> i'm going to modify that line to remove <% %> and <?php ?> and <? ?>

as far as i know that will prohibit all scripts

the reason i added the test for spaces is that i know html is rather forgiving, and i don't know if it'll let someone do < font color="#c8ff00" > or use tabs or carriage returns.

Posted: Wed Jun 25, 2003 5:01 pm
by Flood
Just a question... I might be mistaken of course... Do you really need to check that <? ?>, <?php ?>, <% %> have not been entered in the form fields?

/Flood

Posted: Wed Jun 25, 2003 5:04 pm
by m3rajk
yeah. i don't know what level the people are going to be at and if i give them html abilities and the site is in php, then i wanna be sure they don't try to be funky with php

Posted: Wed Jun 25, 2003 5:06 pm
by Flood
I understand, but try something like:

<?php print "You have entered {$_POST['text']}"; ?>

<form method="post" action="machin.php">
<input type="text" name="text" value="<?php print $_POST['text']; ?>" />
</form>

Perhaps you can do harm by entering PHP code in the "text" field, but I do not know how... I'd like some other opinions about it.
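
To illustrate the point: text arriving in $_POST is just a string, and PHP only runs it if it is handed to something like eval() or include. The risk when echoing it back is HTML or JavaScript reaching the browser. A small sketch, where show_input is a made-up helper name and htmlspecialchars is the standard PHP function:

```php
<?php
// Hypothetical helper: prepare user text for display inside HTML.
function show_input(string $text): string
{
    // ENT_QUOTES also escapes ' and ", so the result can be used
    // inside quoted attribute values as well.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// The PHP code in the string is never executed; it is only escaped and printed.
echo show_input('<?php echo "hi"; ?>'), "\n";
```

The browser then shows the submitted text literally instead of interpreting it as markup.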

The only thing you need to do, in my opinion, is maybe to take care of the " and '.... especially if you have a database behind all this.

In the worst case, why not use strip_tags? Especially if you know which tags you want to allow...
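
A short sketch of the strip_tags idea, assuming an allow-list of <b>, <i> and <p> (the list and the sample input are only examples):

```php
<?php
// The second argument lists the tags to keep; every other tag is removed.
$allowed = '<b><i><p>';
$input   = '<p>Hello <b>world</b><script>alert(1)</script></p>';

// The script tags are dropped, but the text between them stays.
$clean = strip_tags($input, $allowed);
echo $clean, "\n";
```

One caveat to keep in mind: strip_tags does not touch attributes on the tags you allow, so something like onmouseover on an allowed <b> would survive and needs separate handling.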

/Flood

Posted: Wed Jun 25, 2003 6:10 pm
by m3rajk
because i want to allow as much as possible without compromising the integrity of the page. after removing the specific things i want to remove (all the scripting styles), i'm going to use addslashes so that ' and " will end up ok for insertion into mysql.

btw: i can't find how to refer to ? in an expression string... and would this remove unneeded php/asp tags?

Code: Select all

<?php

$input=stripslashes(rawurldecode($_GET['input']));

$test=eregi_replace('<(/?script[^>]*)>', '&lt;\1&gt;', $input);
$test2=eregi_replace('<(%[^>]*%)>', '&lt;\1&gt;', $test);
$test3=eregi_replace('<(?[^>]*?)>', '&lt;\1&gt;', $test2);

?> last input: <?php echo $input; ?>
<br> after eregi series: <?php echo $test3; ?><br />
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
<p><br><input type="text" size="50" name="input"><br />
<br><input type="submit"><br /></p>
</form>

Posted: Wed Jun 25, 2003 6:16 pm
by Flood
Special characters ==> put a \ before them

-->

Code: Select all

$test3=eregi_replace('<(\?[^>]*\?)>', '&lt;\1&gt;', $test2);
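
For reference, the three replacements can be sketched with preg_replace, which is what PHP 7 and later require. The helper name escape_code_tags is made up for this example. One deliberate change from the patterns above: [^>]* stops at the first >, so a PHP block containing something like $x->y() would not match, and the sketch uses a lazy .*? with the s flag instead.

```php
<?php
// Hypothetical helper (name made up for this sketch): show script-like tags
// as text by escaping their angle brackets.
function escape_code_tags(string $html): string
{
    $patterns = [
        '/<(\/?script[^>]*)>/i', // opening and closing script tags
        '/<(%.*?%)>/s',          // ASP-style blocks
        '/<(\?.*?\?)>/s',        // PHP blocks, short or long form
    ];
    // Each pattern is applied in turn; $1 is the text between the brackets.
    return (string) preg_replace($patterns, '&lt;$1&gt;', $html);
}

echo escape_code_tags('a<?php $x->y(); ?>b<% z %>c'), "\n";
```

The ? is escaped as \? wherever it has to match a literal question mark; the unescaped ? after .* is the lazy quantifier.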
/Flood

Posted: Thu Jun 26, 2003 2:28 pm
by twigletmac
Please don't hijack other people's threads - do post a link to their thread in your own if it's relevant but posting in theirs is not fair.

Mac

Posted: Thu Jun 26, 2003 2:41 pm
by m3rajk
sorry. i figured they were in the same vein and that those helping him would be likely to help me. didn't think of it as hijacking the thread, and wasn't trying to