Auto-escape ampersands [solved]

Posted: Thu May 20, 2004 3:28 am
by vigge89
I'm looking for a RegExp pattern which converts all ampersands ( & ) to their entity " &amp; ". The reason I want to use RegExp is that if I use a plain str_replace() to replace every & with &amp;, other entities which start with & become invalid, for example:
If I replace & with &amp; in this string: " sometext &nbsp; someothertext ",
it turns into: " sometext &amp;nbsp; someothertext "

So I'm looking for a regexp pattern which finds every & that stands alone and doesn't have a ; after it. Are there any regexp gurus out there who can solve this?

Posted: Thu May 20, 2004 3:36 am
by feyd

Code: Select all

$string = "this &nbsp; is a test & show of regex";
echo preg_replace("/(\W)&(\W)/", "\\1&amp;\\2", $string);
this &nbsp; is a test &amp; show of regex

Posted: Thu May 20, 2004 3:39 am
by vigge89
great, thanks! :D

Edit: after trying this:

Code: Select all

<?php
echo preg_replace("/(\W)\&(\W)/", "\\1&amp;\\2", "<a href='http://domain.com/index.php?pagetype=2&article=30262&sectionid=1585' target='_blank'>");
?>
I get a blank output, any ideas?

Posted: Thu May 20, 2004 5:06 am
by dave420
Try this:

Code: Select all

preg_replace("/&(?![A-Za-z0-9#]{3,7};)/", "&amp;", $text);
It uses a negative look-ahead assertion (the (?!...) part), which says "replace any ampersand not followed by between 3 and 7 letters/numbers/#-signs and a semi-colon". That means it'll match bare & signs, but not the & that starts an HTML entity (you can fiddle with the 3,7 limits, but the largest entity I found had 6 chars, so with the hash that is 7, so it should cover almost anything).

I tried it with a string full of entities, and even ampersands within words, and it worked fine. That's no guarantee of course, as I might have missed something blatantly obvious :)
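
For anyone who wants to check the look-ahead pattern against the strings from this thread, here is a minimal sketch. The function name escape_bare_ampersands is made up for illustration; the pattern itself is the one posted above, unchanged. The last case is an observation about the 3,7 limits rather than something tested in the original post: a two-letter entity such as &lt; is shorter than the 3-character minimum, so its ampersand gets escaped as well.

```php
<?php
// Hypothetical helper name, used only for this illustration.
function escape_bare_ampersands(string $text): string
{
    // Replace "&" unless it is followed by 3 to 7 letters, digits or "#"
    // and then ";", i.e. unless it already looks like the start of an entity.
    return preg_replace('/&(?![A-Za-z0-9#]{3,7};)/', '&amp;', $text);
}

// Query string from the earlier post: both ampersands are bare.
echo escape_bare_ampersands("index.php?pagetype=2&article=30262&sectionid=1585"), "\n";
// index.php?pagetype=2&amp;article=30262&amp;sectionid=1585

// An existing entity is left alone, the lone ampersand is escaped.
echo escape_bare_ampersands("a &nbsp; b & c"), "\n";
// a &nbsp; b &amp; c

// Two-letter entities fall below the 3-character minimum.
echo escape_bare_ampersands("1 &lt; 2"), "\n";
// 1 &amp;lt; 2
```

If the input can contain &lt; or &gt;, lowering the minimum from 3 to 2 is one possible tweak, at the cost of making it slightly more likely that a short query parameter followed by a semi-colon is mistaken for an entity.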

Posted: Thu May 20, 2004 5:44 am
by vigge89
great, it works now, thanks! :D