regex for filtering non-standard characters

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

jazz090
Forum Contributor
Posts: 176
Joined: Sun Apr 12, 2009 3:29 pm
Location: England

regex for filtering non-standard characters

Post by jazz090 »

I have a <textarea> that is meant to accept around 350 words. The audience of this website will be pasting text into it from Word more than 99% of the time, which creates a problem for me because Word's autoformatting converts straight quotes to curly ones (“”). But that's only a small part of the problem, because I'm filtering those out:

Code:

$data = str_replace("‘", "'", $data);
$data = str_replace("’", "'", $data);
$data = str_replace("“", '"', $data);
$data = str_replace("”", '"', $data);
$data = str_replace("–", "-", $data);
But what about the rest? Word has all these symbols, such as alpha, beta, neq, etc., and they just get messed up in MySQL. On one occasion, when I combined such symbols with curly quotes (“”) in the textarea and processed it, the above code was completely ignored and the word enclosed in (“”) was completely mangled. I need some sort of regex to get rid of these unwanted characters. Any regex or wisdom is appreciated. Safe.
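The replacement chain above can also be written as a single lookup table. This is only a minimal sketch: the function name `normalize_word_punctuation` is hypothetical, the em dash and ellipsis entries are assumed additions to the five characters listed in the post, and the input is assumed to be UTF-8. The `\u{...}` escapes (PHP 7 or later) keep the source file plain ASCII, so the code does not depend on the encoding the file is saved in.

```php
<?php
// Hypothetical helper: map common Word "smart" punctuation to ASCII.
// Assumes $data is UTF-8 encoded.
function normalize_word_punctuation(string $data): string
{
    $map = [
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash (assumed addition)
        "\u{2026}" => '...', // horizontal ellipsis (assumed addition)
    ];

    // strtr() with an array replaces every key in one pass over the string.
    return strtr($data, $map);
}

echo normalize_word_punctuation("\u{201C}It\u{2019}s 5\u{2013}10\u{201D}"), "\n";
```

If the page, the PHP source, and the database connection do not all agree on one encoding, the byte sequences being searched for will not match what the browser actually sent, which may explain replacements that appear to be ignored.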
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: regex for filtering non-standard characters

Post by prometheuzz »

Regex is not the way you'd want to go. Have a look at http://nl3.php.net/mysql_real_escape_string instead.
jazz090
Forum Contributor
Posts: 176
Joined: Sun Apr 12, 2009 3:29 pm
Location: England

Re: regex for filtering non-standard characters

Post by jazz090 »

I know about this function. When I say it will get messed up in MySQL, I mean after being escaped by the aforementioned mysql_real_escape_string().
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: regex for filtering non-standard characters

Post by prometheuzz »

jazz090 wrote: I know about this function. When I say it will get messed up in MySQL, I mean after being escaped by the aforementioned mysql_real_escape_string().
Well, then your original post left out quite a bit of information.

Best of luck of course.
jazz090
Forum Contributor
Posts: 176
Joined: Sun Apr 12, 2009 3:29 pm
Location: England

Re: regex for filtering non-standard characters

Post by jazz090 »

A better question would be: how do you use regex to identify non-ASCII characters?
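One possible way to do this, sketched here under the assumption that the input is valid UTF-8, is a negated character class covering the ASCII range. The function names `has_non_ascii` and `find_non_ascii` are hypothetical; only `preg_match()` and `preg_match_all()` are real PHP functions.

```php
<?php
// Hypothetical helper: true if the string contains any byte outside ASCII.
// A byte-level check is enough here, since every byte >= 0x80 is non-ASCII.
function has_non_ascii(string $text): bool
{
    return preg_match('/[^\x00-\x7F]/', $text) === 1;
}

// Hypothetical helper: list the distinct non-ASCII characters in $text.
// The /u modifier makes PCRE read the subject as UTF-8, so each match is
// one whole character rather than a single byte.
function find_non_ascii(string $text): array
{
    $count = preg_match_all('/[^\x00-\x7F]/u', $text, $matches);
    if ($count === false) {
        // With /u, preg_match_all() fails on invalid UTF-8 input.
        return [];
    }
    return array_values(array_unique($matches[0]));
}

var_dump(has_non_ascii('plain text'));
print_r(find_non_ascii("\u{03B1} \u{2260} \u{03B2}, x \u{2260} y"));
```

Listing the offending characters first, rather than deleting them straight away, makes it easier to decide which ones deserve a replacement and which can be dropped.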
GeertDD
Forum Contributor
Posts: 274
Joined: Sun Oct 22, 2006 1:47 am
Location: Belgium

Re: regex for filtering non-standard characters

Post by GeertDD »

jazz090 wrote: A better question would be: how do you use regex to identify non-ASCII characters?
Kohana contains three ASCII-related functions:
  • is_ascii()
  • strip_ascii_ctrl()
  • strip_non_ascii()
View the code here: http://dev.kohanaphp.com/browser/tags/2 ... 8.php#L135
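For readers who cannot follow the link, helpers of this kind might look roughly like the stand-alone sketches below. These are not copied from Kohana: the `sketch_` prefix marks them as hypothetical, and the exact set of control characters removed is an assumption.

```php
<?php
// Stand-alone sketches named after the Kohana helpers; not Kohana's code.

// True when the string contains only bytes in the ASCII range 0x00-0x7F.
function sketch_is_ascii(string $str): bool
{
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}

// Remove ASCII control characters, keeping tab (0x09), line feed (0x0A)
// and carriage return (0x0D). The choice of what to keep is an assumption.
function sketch_strip_ascii_ctrl(string $str): string
{
    return (string) preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]+/', '', $str);
}

// Remove every byte outside the ASCII range.
function sketch_strip_non_ascii(string $str): string
{
    return (string) preg_replace('/[^\x00-\x7F]+/', '', $str);
}

var_dump(sketch_is_ascii('hello'));
echo sketch_strip_non_ascii("caf\u{00E9} \u{201C}quoted\u{201D}"), "\n";
```

Note that stripping all non-ASCII bytes also deletes legitimate text such as accented letters, so it is usually worth mapping known characters to ASCII equivalents before falling back to removal.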