[Solved] Regex to remove invalid Windows filesystem chars

Posted: Sun Jul 23, 2006 4:54 am
by daedalus__
I'm working on an error handling class, which I will be making a topic about later (look for it in theory).

I want to give the option to log errors, if none are fatal, to a file. In order to do this, I need to be able to remove characters that are invalid in the Windows filesystem from the path and filename of the log.

Here is my problem:

I am so bad with regular expressions that I can't even check for word characters without using \w (\W?).

I've been searching Google and the forums for a while but I can't turn anything up.

EDIT: I almost forgot: Perl-compatible, if you could (the preg_ functions).

Posted: Sun Jul 23, 2006 5:33 am
by daedalus__
omfsweetjesus

I got it!!!!!!!!!!!!!!!!!!!

$string = 'this(is_an_invalid)file&name.lame';
echo $string.'<br />';
echo '<p>'.preg_replace('/[^a-zA-Z0-9._]/', '', $string).'</p>';
Outputs:

<p>thisis_an_invalidfilename.lame</p>
!!!!!!!!!!!!!!!!!!!!!

"Remove any character that is not a letter (a-z, A-Z), a digit (0-9), a period, or an underscore."

I searched for almost 40 minutes before I tried to do it myself.
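
Note that the whitelist above strips more than Windows actually requires: parentheses and ampersands are legal in Windows file names. For comparison, here is a minimal sketch of the opposite (blacklist) approach. It assumes the reserved set is `< > : " / \ | ? *` plus the ASCII control characters 0-31, and that it is applied to a single file name rather than a whole path. The function name `strip_windows_invalid_chars` is made up for illustration and does not come from the thread.

```php
<?php
// Hypothetical helper (not from the thread): removes only the characters
// that Windows rejects in a single file name component.
function strip_windows_invalid_chars(string $name): string
{
    // < > : " / \ | ? * plus ASCII control characters 0x00-0x1F.
    // In a single-quoted PHP string, \\\\ reaches PCRE as \\ (one literal backslash).
    return preg_replace('/[<>:"\/\\\\|?*\x00-\x1F]/', '', $name);
}

// Parentheses and ampersands are valid on Windows, so this name is unchanged.
echo strip_windows_invalid_chars('this(is_an_invalid)file&name.lame'), "\n";

// The colon, question mark, and asterisk are removed: errorlog200607.txt
echo strip_windows_invalid_chars('error:log?2006*07.txt'), "\n";
```

Which approach fits better depends on whether the log names also need to be safe on other filesystems or in URLs; the whitelist is the more conservative choice.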

Posted: Sun Jul 23, 2006 5:43 am
by daedalus__
[^a-zA-Z0-9._\/] preserves the forward slash in path names as well.
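
A small sketch of the same idea with a different pattern delimiter, which avoids having to escape the slash at all. The sample path `$path` is invented for illustration.

```php
<?php
// Hypothetical sample path, not from the thread.
$path = 'logs/2006/error(log)&.txt';

// With '#' as the delimiter, '/' needs no backslash inside the pattern.
$clean = preg_replace('#[^a-zA-Z0-9._/]#', '', $path);

echo $clean, "\n"; // logs/2006/errorlog.txt
```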

Posted: Sun Jul 23, 2006 7:27 am
by Chris Corbyn
Daedalus- wrote:[^a-zA-Z0-9._\/] preserves the forward slash in path names as well.

[^a-zA-Z0-9\._\/\-]
Didn't know you could have forward slashes, but you forgot to escape the dot, which is the "any" character ;) I put a dash in there too. So that condenses down to:

[^\w\.\/\-]
Since \w is the same as [a-zA-Z0-9_] ;)

Posted: Sun Jul 23, 2006 8:55 am
by feyd
I hate to keep repeating this, but \w is not the same as a-zA-Z0-9_, it covers far more characters than just those.
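
What \w matches depends on the locale PCRE is using and on whether the pattern runs in UTF-8 mode. The sketch below is only an illustration under stated assumptions: it uses the /u modifier, and it assumes a PHP build whose PCRE has Unicode property support (typical for current builds, but not guaranteed). The sample string is invented.

```php
<?php
// "résumé_2006" written as explicit UTF-8 bytes, so the result does not
// depend on how this source file is saved.
$name = "r\xC3\xA9sum\xC3\xA9_2006";

// The explicit ASCII class rejects the accented letters.
$ascii = preg_match('/^[a-zA-Z0-9_]+$/', $name);

// \w in UTF-8 mode (the /u modifier) accepts them on builds with
// Unicode property support.
$word = preg_match('/^\w+$/u', $name);

var_dump($ascii, $word);
```

If the goal is a strictly ASCII file name, spelling out [a-zA-Z0-9_] is the safer choice.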

Posted: Sun Jul 23, 2006 9:46 am
by Chris Corbyn
feyd wrote:I hate to keep repeating this, but \w is not the same as a-zA-Z0-9_, it covers far more characters than just those.
Does it cover UTF-8 characters too or something? Like accented letters? This is new to me :)

Posted: Sun Jul 23, 2006 9:49 am
by feyd
d11wtq wrote:Does it cover UTF-8 characters too or something? Like accented letters? This is new to me :)
I guess you missed my reply to another of your recommendations to use \w before: viewtopic.php?p=245010#245010

Posted: Sun Jul 23, 2006 10:07 am
by Chris Corbyn
feyd wrote:
d11wtq wrote:Does it cover UTF-8 characters too or something? Like accented letters? This is new to me :)
I guess you missed my reply to another of your recommendations to use \w before: viewtopic.php?p=245010#245010
Yeah sorry I did. Interesting. Thanks :)

Posted: Sun Jul 23, 2006 11:51 am
by daedalus__
I thought that the period didn't have to be escaped since it is inside the brackets (a character class).

It works fine?
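
A quick check of that point: inside a character class, an unescaped dot matches only a literal period, so escaping it is harmless but unnecessary. The variable names below are made up for illustration; the sample string is the one from earlier in the thread.

```php
<?php
// Inside [...] an unescaped dot is a literal period, not "any character".
$matchesDot    = preg_match('/^[.]$/', '.'); // 1
$matchesLetter = preg_match('/^[.]$/', 'a'); // 0

// The escaped and unescaped spellings produce the same result.
$string    = 'this(is_an_invalid)file&name.lame';
$unescaped = preg_replace('/[^a-zA-Z0-9._]/', '', $string);
$escaped   = preg_replace('/[^a-zA-Z0-9\._]/', '', $string);

var_dump($matchesDot, $matchesLetter, $unescaped === $escaped);
```

The hyphen is the character that does need care inside a class: place it first or last, or escape it, so that it is not read as a range.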