
Regular expression (huh?)

Posted: Thu Oct 13, 2005 7:49 pm
by Bill H
The term "regular expression" has always rather bemused me, as I see nothing regular about it at all. The whole concept is something that I have simply never been able to get my head wrapped around at all. Fortunately, one can write code without it. Maybe the code would be better with it, but....

Anyway, I am adapting a product that seems to require a preg_replace and I'm not sure why. It uses this JavaScript:

Code:

<script language="JavaScript">
<!--
function WriteBack(form_name,field_name,text)
{
     opener.document[form_name][field_name].value = text;
}
-->
</script>
which doesn't work unless the text is first translated with:

Code:

$str = preg_replace("/('|\"|\\\\|\r|\n)/", "\\\\\$1", $str);
But that removes all of the newlines from the text, which is a bummer. Since the whole function could be written in Sanskrit, I have no idea how to deal with the situation.

In the first parameter the forward slashes are delimiters, I got that much, but what is all the rest? And what is the replacement string? Is $1 some kind of constant? Because I see it nowhere in the code. Will the js not work if there are newlines?

I will take all the help I can get, here.

Posted: Thu Oct 13, 2005 8:18 pm
by Burrito
Moved to RegEx.

Posted: Thu Oct 13, 2005 9:57 pm
by programmermatt
http://www.regular-expressions.info/
http://us2.php.net/manual/en/reference. ... syntax.php

Code:

preg_replace("/('|\"|\\\\|\r|\n)/", "\\\\\$1", $str);
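(...) - captures the text matched inside the parentheses
' matches that character literally
\" matches a " (the " would otherwise end the PHP string, so \" is used; \ is the escape character)
| - OR
\\\\ in the PHP string reaches the regex as \\, which matches one literal \ (\ is the escape character in both PHP strings and regexes, so it has to be doubled twice)
\r is a carriage return (the line ending on classic Mac OS)
\n is a line feed (the line ending on Unix; Windows uses \r\n)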

As written, a \r\n pair (used by some OSes) is matched as two separate characters; they really should include a match for \r\n as a single unit.

So, it is looking for all matches in $str that are a ' or " or \\ or newline and replacing it with \\[Value capture]
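
To make the "[Value capture]" part concrete, here is a minimal sketch of the same search-and-replace written in JavaScript, whose replace method also uses $1 to mean "whatever the first pair of parentheses captured" (so $1 is not a constant defined anywhere in the code). The function name escapeSpecials and the sample string are made up for illustration, and the pattern is what the PHP string above should reach the regex engine as.

```javascript
// Illustrative JavaScript analogue of the PHP preg_replace call above.
// This is a sketch for explanation, not the product's actual code.
function escapeSpecials(text) {
  // Matches ', ", backslash, carriage return, or line feed, one character
  // at a time. The parentheses capture whichever character matched.
  const pattern = /('|"|\\|\r|\n)/g;
  // "\\$1" is a literal backslash followed by the captured character.
  return text.replace(pattern, "\\$1");
}

// Hypothetical sample input containing a quote, double quotes, and a line feed.
const sample = "It's a \"test\"\nsecond line";
const escaped = escapeSpecials(sample);

// JSON.stringify makes the inserted backslashes and the line feed visible.
console.log(JSON.stringify(escaped));
```

Every matched character is kept and simply gets a backslash in front of it, including the raw line feed.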

Posted: Fri Oct 14, 2005 9:09 am
by Bill H
The second link is one that I had already read. It's written in some language that looks like English but isn't.

The first link appears to be quite a jewel. I haven't had time yet to dig into it, but I think it is actually going to lead to me having some understanding of regular expressions. Thanks.

I knew that "|" was OR, but wasn't sure that applied in this setting. I knew about the escape, and I think I actually sort of vaguely understood the search part. Sort of.
...replacing it with \\[Value capture]
I beg your pardon?

(When you get into regex people sometimes start out in English, but they usually devolve back into that regex language that looks like English but isn't.)

Posted: Fri Oct 14, 2005 9:32 am
by Bill H
Upon further testing of the expression, it is not removing ' or " or \ from the strings, but it is removing newlines.

I'm not certain whether the newlines are \n or \r\n, but they are created in an HTML textarea by hitting the <enter> key, and the server is Unix. My computer is Win98se.

I still haven't been able to find out what $1 is.
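
A likely explanation, assuming the escaped text ends up inside a quoted JavaScript string literal on the page (the thread does not show that part): in a string literal, a backslash followed by a real line break is treated as a line continuation, so the break contributes nothing to the string. The sketch below demonstrates that, along with one possible alternative that emits the two-character sequences \r and \n instead. The helpers evaluateLiteral and escapeForJsString are hypothetical names, not part of the product.

```javascript
// Evaluates body as the contents of a single-quoted JavaScript string literal.
// Demonstration only; never evaluate untrusted input this way.
function evaluateLiteral(body) {
  return new Function("return '" + body + "';")();
}

// Hypothetical alternative: write line breaks as the escape sequences \r and \n
// rather than as a backslash followed by the raw control character.
function escapeForJsString(text) {
  return text
    .replace(/(['"\\])/g, "\\$1") // prefix quotes and backslashes with a backslash
    .replace(/\r/g, "\\r")        // carriage return becomes the characters \ and r
    .replace(/\n/g, "\\n");       // line feed becomes the characters \ and n
}

// Hypothetical textarea content with a Windows-style line break.
const original = "first line\r\nsecond 'line'";

// The approach from the thread: a backslash before every matched character.
const threadStyle = original.replace(/('|"|\\|\r|\n)/g, "\\$1");

const viaThreadStyle = evaluateLiteral(threadStyle);
const viaEscapes = evaluateLiteral(escapeForJsString(original));

console.log(viaThreadStyle === original); // false: the line break is gone
console.log(viaEscapes === original);     // true: the line break survives
```

If this matches what the page actually does, the quotes and backslashes come through intact while the newlines are swallowed, which is consistent with the behaviour described above.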

Posted: Fri Oct 14, 2005 10:01 am
by John Cartwright
Try googling for Regex Coach.. it is a fantastic program that tells you exactly what your expression is doing

Posted: Fri Oct 14, 2005 10:08 am
by foobar
Jcart wrote: Try googling for Regex Coach.. it is a fantastic program that tells you exactly what your expression is doing
I recommend this piece of software as well; I use it all the time. Regexes are awesome and very powerful, but a pain in the backside to use at times. Regex Coach tells you exactly what your regex is doing, plus an abundance of other useful info.

Posted: Fri Oct 14, 2005 2:49 pm
by Bill H
Thanks, I will try that.