PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!
Moderator: General Moderators
ed209
Forum Contributor
Posts: 153 Joined: Thu May 12, 2005 5:06 am
Location: UK
by ed209 » Wed Feb 08, 2006 6:45 am
Hi,
I'm stuck with getting a nice string out the other end of this function. I'm using it to create a file name so I only want alphanumeric characters to be used.
The $string comes from something that a user might input - which may contain ", ; : @ etc. In the place of these I want to use "_".
I have a function that works, but I can't figure out how to remove excessive ______
Code:
<?php
function removeNonAlphaNum($string){
    //$string = stripslashes($string);
    $previous_i = 0;
    $string_length = strlen($string);
    $returned_string = "";
    for( $i = 0 ; $i <= $string_length; $i++){
        $sub_string = substr($string, $previous_i, 1);
        $returned_string .= (ctype_alnum($sub_string)) ? $sub_string : "_";
        $previous_i += 1;
    }
    return $returned_string;
}

echo removeNonAlphaNum("I am a non-alphaNumeric string £232!@£$$...");

// outputs
// I_am_a_non_alphaNumeric_string__232_________

// I want it to output
// I_am_a_non_alphaNumeric_string_232
?>
any ideas?
JayBird
Admin
Posts: 4524 Joined: Wed Aug 13, 2003 7:02 am
Location: York, UK
by JayBird » Wed Feb 08, 2006 6:53 am
not a direct answer to your question, but something like this for removing the characters may have been neater
Code:
$string = "I am a non-alphaNumeric string £232!@£$$...";
$new_string = ereg_replace("[^A-Za-z0-9]", "_", $string);
echo $new_string;
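A note for anyone reading this on a current PHP: the ereg_* functions were deprecated in PHP 5.3 and removed in PHP 7.0, so the snippet above only runs on older versions. Below is a minimal sketch of the same single-character replacement using preg_replace. The helper name replace_non_alnum is hypothetical and not from the thread.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread):
// replaces every byte that is not an ASCII letter or digit with "_".
function replace_non_alnum($string)
{
    return preg_replace('/[^A-Za-z0-9]/', '_', $string);
}

echo replace_non_alnum('I am a non-alphaNumeric string'), "\n";
// prints: I_am_a_non_alphaNumeric_string
```

Like the ereg version, this still produces runs of underscores when several unwanted characters sit next to each other.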
ed209
Forum Contributor
Posts: 153 Joined: Thu May 12, 2005 5:06 am
Location: UK
by ed209 » Wed Feb 08, 2006 7:31 am
Thanks for that, yours is quicker too.
for 1000 executions:
Time : 0.6401 seconds (mine)
Time : 0.0542 seconds (yours)
But I still have the problem of too many '_____' . Is there a way to only ever have one '_' in a row?
Thanks,
ed.
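For anyone wanting to reproduce timings like the ones above, here is a rough sketch of a timing loop. The helper name time_executions and its parameters are assumptions for illustration; the thread does not show how the original figures were measured, and results will vary by machine and PHP version.

```php
<?php
// Hypothetical timing helper (not from the thread): runs $callback
// $iterations times and returns the elapsed wall-clock time in seconds.
function time_executions($callback, $iterations = 1000)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($callback);
    }
    return microtime(true) - $start;
}

$elapsed = time_executions(function () {
    preg_replace('/[^A-Za-z0-9]/', '_', 'I am a non-alphaNumeric string');
});

printf("Time : %.4f seconds\n", $elapsed);
```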
JayBird
Admin
Posts: 4524 Joined: Wed Aug 13, 2003 7:02 am
Location: York, UK
by JayBird » Wed Feb 08, 2006 8:14 am
this nearly works, but it leaves an underscore on the end
Code:
$string = "I am a non-alphaNumeric string £232!@£$$...";
$new_string = ereg_replace("[^A-Za-z0-9]", "_", $string);
$final_string = ereg_replace("[_]+", "_", $new_string);
echo $final_string;
Last edited by
JayBird on Wed Feb 08, 2006 8:14 am, edited 1 time in total.
Benjamin
Site Administrator
Posts: 6935 Joined: Sun May 19, 2002 10:24 pm
by Benjamin » Wed Feb 08, 2006 8:14 am
Code:
function cleanup_filename($filename_to_clean)
{
    // we use dashes because underscores will not wrap in a table!!!!

    // the following array contains everything we want to remove from a color field
    $invalid_in_filename = array("#", "~", "`", "!", "@", "$", "%", "^", "&", "*", "(", ")", "=", "+", "<", ",", ">", "/", "?", "\"", "'", ";", ":", "{", "[", "]", "}", "|", "\\");

    // remove invalid characters
    $replace_with = "";
    $filename_to_clean = str_replace($invalid_in_filename, $replace_with, $filename_to_clean);

    // convert underscores to dashes
    $filename_to_clean = str_replace("_", "-", $filename_to_clean);

    // convert spaces to dashes
    $filename_to_clean = str_replace(" ", "-", $filename_to_clean);

    // get rid of multiple dashes (i.e. "--")
    while (substr_count($filename_to_clean, "--") > 0)
    {
        $filename_to_clean = str_replace("--", "-", $filename_to_clean);
    }

    return $filename_to_clean;
}
JayBird
Admin
Posts: 4524 Joined: Wed Aug 13, 2003 7:02 am
Location: York, UK
by JayBird » Wed Feb 08, 2006 8:17 am
The above function will still leave an erroneous dash on the end of the filename.
Dunno if that is acceptable for your application or not.
Also, the above function lists what isn't allowed in the filename...better to specify what IS allowed IMO
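The "specify what IS allowed" idea could be sketched roughly as follows. The function name cleanup_filename_whitelist is hypothetical, and the behaviour differs from cleanup_filename() above: disallowed characters become dashes instead of being removed, and anything outside A-Z, a-z and 0-9 (including dots) is treated as disallowed.

```php
<?php
// Hypothetical whitelist variant (an assumption, not code from the thread):
// any run of characters that are not ASCII letters or digits becomes a
// single dash, then leading and trailing dashes are trimmed.
function cleanup_filename_whitelist($filename)
{
    $cleaned = preg_replace('/[^A-Za-z0-9]+/', '-', $filename);
    return trim($cleaned, '-');
}

echo cleanup_filename_whitelist('My Page: "Title" (draft)!'), "\n";
// prints: My-Page-Title-draft
```

Because the pattern ends in +, runs are collapsed in the same pass, so no loop over "--" is needed.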
Benjamin
Site Administrator
Posts: 6935 Joined: Sun May 19, 2002 10:24 pm
by Benjamin » Wed Feb 08, 2006 8:23 am
It's just something I had lying around so I figured I would throw it up there.
Getting rid of the last character is easy...
Code:
$trimmed = rtrim($text, "-_");
Not sure if that code is right but it's close to that.
ed209
Forum Contributor
Posts: 153 Joined: Thu May 12, 2005 5:06 am
Location: UK
by ed209 » Wed Feb 08, 2006 8:24 am
thanks for your help, problem solved.
I'll give them both a go. The file name isn't the end of the world, it just needs to resemble the title of the page.
feyd
Neighborhood Spidermoddy
Posts: 31559 Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA
by feyd » Wed Feb 08, 2006 9:07 am
psst. preg_ would run faster!
JayBird
Admin
Posts: 4524 Joined: Wed Aug 13, 2003 7:02 am
Location: York, UK
by JayBird » Wed Feb 08, 2006 9:12 am
agtlewis wrote: It's just something I had laying around so I figured I would throw it up there.
Getting rid of the last character is easy...
Code:
$trimmed = rtrim($text, "-_");
Not sure if that code is right but it's close to that.
It would be a little more than that, because you would only want to remove the last character if the last character was an underscore
Benjamin
Site Administrator
Posts: 6935 Joined: Sun May 19, 2002 10:24 pm
by Benjamin » Wed Feb 08, 2006 9:23 am
That is what rtrim does according to what I understood from the documentation. You supply a list of characters to strip.
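A quick sketch of that behaviour, using made-up sample strings: rtrim() only strips trailing characters while they appear in the supplied list, so a string that does not end in a dash or underscore comes back unchanged. Literal dots are not needed in the list, since ".." there denotes a character range.

```php
<?php
// Sample strings are illustrative only.
// Trailing characters are removed only while they are in the list "-_".
echo rtrim('my_file_', '-_'), "\n";   // prints: my_file
echo rtrim('my_file', '-_'), "\n";    // prints: my_file
echo rtrim('my-file-_-', '-_'), "\n"; // prints: my-file
```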
feyd
Neighborhood Spidermoddy
Posts: 31559 Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA
by feyd » Wed Feb 08, 2006 9:28 am
here's the preg version of pimp's last submission with the cleaning as requested:
Code:
$new_string = rtrim(preg_replace("#[^a-z0-9]+#i", "_", $string), '_');
Christopher
Site Administrator
Posts: 13596 Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US
by Christopher » Wed Feb 08, 2006 12:46 pm
I would recommend the preg "remove characters not in set" method (e.g. preg_replace('/[^a-z0-9]/', '', $mystring) ) of specifying the characters that you want rather than attempting to remove bad characters. The latter method inevitably misses something.
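As a small sketch of that approach: the helper name keep_alnum is hypothetical, and the i modifier is an addition here. Without it, the lowercase-only character class would strip uppercase letters as well.

```php
<?php
// Hypothetical helper (name is an assumption): keeps only ASCII letters
// and digits and drops everything else. The "i" modifier makes the
// a-z range match uppercase letters too.
function keep_alnum($mystring)
{
    return preg_replace('/[^a-z0-9]/i', '', $mystring);
}

echo keep_alnum('Report: Q1/2006 (final).txt'), "\n";
// prints: ReportQ12006finaltxt
```

Note that this removes characters instead of substituting a separator, so words run together; replacing with "_" or "-" as in the earlier posts keeps the result readable as a file name.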