Regex help removing multiple commas
On my forum I have the following few functions that will remove extra ! . or ? and replace them with only 1.
$message_array[$x] = preg_replace("/([\!])+/", "\\1", $message_array[$x]);
$message_array[$x] = preg_replace("/([\?])+/", "\\1", $message_array[$x]);
$message_array[$x] = preg_replace("/([\.])+/", "\\1", $message_array[$x]);
For example the following text:
Hello!!! What.. is your name???? will become Hello! What. is your name?
Now I want to do the same with commas, so I added a 4th line
$message_array[$x] = preg_replace("/([\,])+/", "\\1", $message_array[$x]);
but this does not work as expected. Any extra commas still remain in the text. I wasn't sure whether the comma needed to be escaped, so I tried it without the \ and still no dice. I even tried setting a variable equal to chr(44) and then passing the var, but that didn't work either.
Can someone tell me what I'm doing wrong here? I'm guessing it has something to do with commas being used to separate parameters. Thanks for any help.
Robert Plank
Your regex worked fine for me (without the backslash). By the way if you want to get that all on one line
Code: Select all
<?php
$string = "Hello!!! What.. is,,,,, your name????";
$string = preg_replace("/([!\?,\.])+/", "\\1", $string);
echo $string;
?>
Robert Plank wrote: Your regex worked fine for me (without the backslash). By the way if you want to get that all on one line
Code: Select all
<?php $string = "Hello!!! What.. is,,,,, your name????"; $string = preg_replace("/([!\?,\.])+/", "\\1", $string); echo $string; ?>
You don't need to escape the ? and the . as they are not PCRE metacharacters within a character class:
/([!?,.])+/
will work fine.
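To illustrate the point about escaping, here is a minimal sketch (the variable names are made up for this example) that runs the fully escaped and the unescaped character class over the same sample string. Inside [...] the characters ! ? , . are all literal, so both patterns should behave identically.

```php
<?php
// Inside a character class, ! ? , . are literal; the backslashes are optional.
$escaped   = '/([\!\?\,\.])+/';
$unescaped = '/([!?,.])+/';

$sample = 'Hello!!! What.. is,,,,, your name????';

$fromEscaped   = preg_replace($escaped, '$1', $sample);
$fromUnescaped = preg_replace($unescaped, '$1', $sample);

var_dump($fromEscaped === $fromUnescaped); // bool(true)
echo $fromUnescaped, "\n";                 // Hello! What. is, your name?
```

If the two results ever differ, the cause is probably somewhere other than the escaping.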
Sorry to bug again but I just noticed something. It will only remove multiple commas if there also are ! . or ? in the text. If there are only commas then it doesn't remove the extra ones. If I try the same thing with just ! . or ? it will remove them as expected. It's certainly strange behavior.
I'll show some examples of what I tried and what the result was:
Example 1 (Bad Result)
Input: test,,,
Result: test,,,
Example 2
Input: test...
Result: test.
Example 3
Input: test!!!
Result: test!
Example 4
Input: test???
Result: test?
Example 5
Input: test,,, test???
Result: test, test?
Example 6 (Bad Result)
Input: test,,, test,,,
Result: test,,, test,,,
Example 7 (Bad Result)
Input: test test,,,
Result: test test,,,
Example 8
Input: test test!! test,,,
Result: test test! test,
Example 9
Input: ...test test,,,
Result: .test test,
Example 10
Input: test! test. test? test,,,
Result: test! test. test? test,
I'm at a loss here as to why, when there are only commas, that it doesn't remove the extras. I tried switching the order that each gets replaced, but that didn't change anything.
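Since the pattern itself appears to be fine, one way to narrow this down might be to check what bytes the message really contains at the point where the replacement runs. This is only a diagnostic sketch, not a confirmed explanation: the idea that the forum code may have converted commas to something else (for example an HTML entity) before this step is an assumption, and $message below is a stand-in for $message_array[$x].

```php
<?php
// Stand-in for $message_array[$x]; replace with the real value when debugging.
$message = 'test,,,';

// A literal comma is byte 0x2c. If no 2c shows up in the hex dump,
// the regex never sees a comma to collapse.
$hexPlain   = bin2hex($message);
$countPlain = substr_count($message, ',');

// Hypothetical case: commas already stored as the entity &#44;
$encoded      = 'test&#44;&#44;&#44;';
$countEncoded = substr_count($encoded, ',');

echo $hexPlain, "\n";     // 746573742c2c2c
echo $countPlain, "\n";   // 3
echo $countEncoded, "\n"; // 0
```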
Robert Plank
They all worked for me.
Code: Select all
<?php
function stripPunctuation($string) {
    return preg_replace("/([!?,.])+/", "\\1", $string);
}
$input = array(
    'test,,,',
    'test...',
    'test!!!',
    'test???',
    'test,,, test???',
    'test,,, test,,,',
    'test test,,,',
    'test test!! test,,,',
    '...test test,,,',
    'test! test. test? test,,,'
);
$output = array_map("stripPunctuation", $input);
echo "<xmp>";
print_r($output);
echo "</xmp>";
?>
My output:
Code: Select all
Array
(
[0] => test,
[1] => test.
[2] => test!
[3] => test?
[4] => test, test?
[5] => test, test,
[6] => test test,
[7] => test test! test,
[8] => .test test,
[9] => test! test. test? test,
)
Seems to work for me:
Code: Select all
$ php -r 'echo preg_replace("/([!?,.])+/", "\\1", "test,,, test,,,");'
test, test,
$ php -r 'echo preg_replace("/([!?,.])+/", "\\1", "test test,,,");'
test test,
$ php -v
PHP 5.1.2 (cli) (built: Jan 11 2006 16:40:00)
Copyright (c) 1997-2006 The PHP Group
Zend Engine v2.1.0, Copyright (c) 1998-2006 Zend Technologies
And on php4 also:
Code: Select all
$ php -r 'echo preg_replace("/([!?,.])+/", "\\1", "test,,, test,,,");'
test, test,
$ php -r 'echo preg_replace("/([!?,.])+/", "\\1", "test test,,,");'
test test,
$ php -v
PHP 4.4.2-pl2-gentoo (cli) (built: Jun 15 2006 04:45:06)
Copyright (c) 1997-2006 The PHP Group
Zend Engine v1.3.0, Copyright (c) 1998-2004 Zend Technologies
Robert Plank wrote:
Code: Select all
"\\1"
This might work but it is wrong. \1 should be used for back references only. In this context you should be using $1.
bokehman wrote: This might work but it is wrong.
Pretty strong wording, considering the manual shows examples using this style and it is more cross-language compatible.
The cautionary statement at the beginning of the page is for more than 9 captures, when the $number variable syntax is preferable.
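For what it's worth, the case where the two notations really differ in preg_replace() is when a reference is immediately followed by a literal digit. A small sketch of that case, using a made-up pattern and input rather than anything from this thread:

```php
<?php
// Goal: keep the captured letter and append a literal digit 1.
// Writing \11 or $11 would be read as a reference to group 11,
// so the braces are used to mark where the group number ends.
// Single quotes keep PHP from trying to interpolate ${1}.
$result = preg_replace('/(a)/', '${1}1', 'abc');

echo $result, "\n"; // a1bc
```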
bokehman wrote:
sweatje wrote: it is more cross-language compatible.
That's not true. For example Apache's regex engine does not allow it.
My preferred Windows editor TextPad does:
textpad help wrote: \0 to \9 Substitute the text matching tagged expression 0 through 9.
as does sed
sed help wrote: \1 \2 ...\9 backreference, matches i-th memorized \(..\)
as does vim
vim docs wrote: 3.5 Grouping and Backreferences
You can group parts of the pattern expression enclosing them with "\(" and "\)" and refer to them inside the replacement pattern by their special number \1, \2 ... \9.
Many regex engines implement the concept of numeric backreferences. Not all implementations have variables, let alone automatically bind variables to the grouping results.
sweatje wrote: Many regex engines implement the concept of numeric backreferences.
You are confusing back references with replacements. I said \1 should only be used for back references. I didn't say it shouldn't be used for back references. The context to which I was referring was replacement, which is very different. Apache uses \1 for back references and $1 for replacement. Perl too!
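To make the distinction concrete for PHP specifically, here is a rough sketch (sample strings and variable names are invented for illustration). In the replacement string, preg_replace() accepts both \1 and $1. A back reference inside the pattern is a different thing: it matches a repeat of what the group captured, which changes how mixed runs such as ?! are handled.

```php
<?php
$text = 'test,,, test???';

// Replacement references: both notations give the same result in preg_replace().
$withBackslash = preg_replace('/([!?,.])+/', '\\1', $text);
$withDollar    = preg_replace('/([!?,.])+/', '$1', $text);

// A repeated group keeps only its last iteration, so a mixed run
// collapses to its final character.
$mixedRun = preg_replace('/([!?,.])+/', '$1', 'Really?!');

// Back reference in the pattern: \1+ only matches repeats of the same
// character, so a mixed run is left untouched.
$sameCharOnly = preg_replace('/([!?,.])\1+/', '$1', 'Really?!');

echo $withBackslash, "\n"; // test, test?
echo $withDollar, "\n";    // test, test?
echo $mixedRun, "\n";      // Really!
echo $sameCharOnly, "\n";  // Really?!
```

Which behavior is wanted for mixed runs is a design choice; the thread's original pattern uses the first form.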