preg_match on multiple words?

Posted: Mon Oct 18, 2010 6:09 am
by simonmlewis
Hi

I have a script using preg_match to find a term. But I need to make it find more than one term.

if (preg_match('/http/i', $usercomments)) { echo "<div class='admincompletedbox'>Disallowed Comment.</div>"; }
This is fine.

But what if I want to prevent HTML code, such as '<p>' or '<font' for example? I don't see a way to put multiple terms into that area where I have http/i at the moment.

Re: preg_match on multiple words?

Posted: Mon Oct 18, 2010 7:15 am
by requinix
If you're just matching a simple word or string, use stripos.

Two paths you can take for HTML:
1. Let them enter whatever they want, and run the text through htmlentities before displaying it. This means no HTML allowed, but is very easy to implement and doesn't need extra validation steps or warning messages.
2. Filter out HTML with strip_tags. Its optional second argument lets you allow some tags.

I prefer #1 myself. They want to try entering HTML? Go ahead and let them - it won't work.
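
As a rough illustration of the two paths above, here is a minimal sketch. The sample input and the choice to allow only `<p>` are assumptions made for the example, not something from the original script.

```php
<?php
// Hypothetical input; in the thread this would come from the comment form.
$usercomments = "<p>Hello <font color='red'>world</font></p>";

// Path 1: keep the text as entered and escape it when displaying it.
// Any HTML the user typed shows up as literal text in the browser.
$escaped = htmlentities($usercomments, ENT_QUOTES, 'UTF-8');

// Path 2: remove tags. The second argument is an allow-list of tags to keep
// (only <p> here, chosen purely for illustration).
$stripped = strip_tags($usercomments, '<p>');

echo $escaped, "\n";
echo $stripped, "\n"; // <p>Hello world</p>
```

With path 1 there is nothing to validate and no warning message to show, which is why it is the simpler of the two.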

Re: preg_match on multiple words?

Posted: Thu Nov 04, 2010 10:40 am
by simonmlewis
Hi
I can't quite believe I never replied to this message of yours.

At the time and even today, I need to find a way of letting them enter whatever they want, but prevent certain terms.

i.e. stopping: HTTP, font, and a few other codes that pass through when pasting from MS Word.
Also it will be useful on another site to stop bad language, email addresses, etc.

preg_match doesn't do multiples from what I can see.

Re: preg_match on multiple words?

Posted: Thu Nov 04, 2010 12:26 pm
by requinix
simonmlewis wrote: preg_match doesn't do multiples from what I can see.
Regular expressions sure can do multiples. It's called "alternation".

#</?(font|span|script|style|...)[^>]*>#i
But still, I don't think this is the best way to do this.
- If you're afraid of people copying stuff from Word because of the stupid HTML it inserts, tell them not to copy stuff from Word.
- If you let them insert whatever they want, how will you know whether they didn't add the "stupid HTML" themselves intentionally?
- If you want to filter out some tags then you're 99% better off thinking of a list of valid tags and removing everything not in the list - not the opposite.
- If you want to remove certain tags then you'll have a hard time. You hear about what happened years ago with MySpace and <script> tags?
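
For reference, here is a sketch of how the alternation pattern above could be dropped into preg_match. The four tag names come from the post; the "..." in the original pattern is simply left out, so the list is illustrative rather than complete. The function name `containsListedTag` and the sample strings are made up for this example.

```php
<?php
// Hypothetical helper: true if $text contains an opening or closing tag
// whose name is one of the alternatives in the group.
function containsListedTag($text)
{
    // "|" separates the alternatives; "#" is used as the delimiter so the
    // "/" in "</?" does not need escaping; the trailing "i" ignores case.
    $pattern = '#</?(font|span|script|style)[^>]*>#i';

    return preg_match($pattern, $text) === 1;
}

$samples = array(
    'Plain text, nothing to see',
    'Some <FONT face="Arial">Word paste</font>',
    'A <p>paragraph</p> is not in the list',
);

foreach ($samples as $text) {
    echo (containsListedTag($text) ? 'blocked' : 'allowed'), ': ', $text, "\n";
}
```

Note that this is a deny-list, so it carries the weaknesses listed above: anything not named in the group passes straight through.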

Re: preg_match on multiple words?

Posted: Thu Nov 04, 2010 12:32 pm
by simonmlewis
Sorry, don't quite understand your code.

I appreciate that it is basically listing the words, but not sure how.

According to a PHP site, Regular Expression Match was killed off. I can't find Regular Expression Alternation in PHP though.

Re: preg_match on multiple words?

Posted: Thu Nov 04, 2010 1:49 pm
by requinix
simonmlewis wrote: Sorry, don't quite understand your code.

I appreciate that it is basically listing the words, but not sure how.
What I gave is the regular expression. Basically just plug that into the preg_match you have in your code.

font|span|script|style|...
If you want to add more HTML tags to it, just put the tag name in there. Each one is separated by a pipe (|).
simonmlewis wrote: According to a PHP site, Regular Expression Match was killed off. I can't find Regular Expression Alternation in PHP though.
Uh... either you're misunderstanding or they're just plain wrong, and - no offense - I would bet it's the former. Where did you see this?
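
If the list of terms is kept in an array, the pipe-separated pattern can also be built at runtime. This is only a sketch: the function name `containsBlockedTerm` and the term list are hypothetical, chosen to mirror the terms mentioned earlier in the thread.

```php
<?php
// Hypothetical helper: true if any of the listed terms occurs in $text.
function containsBlockedTerm($text, array $terms)
{
    // An empty list would produce a pattern that matches everything.
    if (count($terms) === 0) {
        return false;
    }

    // preg_quote escapes regex metacharacters in each term, plus the "/"
    // delimiter given as its second argument.
    $quoted = array_map(function ($term) {
        return preg_quote($term, '/');
    }, $terms);

    // Join the terms with "|" (alternation) inside a non-capturing group.
    $pattern = '/(?:' . implode('|', $quoted) . ')/i';

    return preg_match($pattern, $text) === 1;
}

$blocked = array('http', '<font', '<p>');

var_dump(containsBlockedTerm('Visit HTTP://example.com', $blocked)); // bool(true)
var_dump(containsBlockedTerm('Just a normal comment', $blocked));    // bool(false)
```

Adding another term then means adding one array entry rather than editing the regular expression by hand.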