Replace quotes with curly quotes

Posted: Wed Jun 06, 2007 5:24 am
by someberry
I am trying to replace quotes with nice curly quotes; however, I have run into a slight hiccup.

I am parsing whole articles of text which contain HTML; this is where the problem lies. I want to replace quotes, but leave quotes in tags alone. If the following text is what I was parsing:

Code:

<span class="strong">"Hello"</span> he said, "My name is someberry.".
I would only want to match:

Code:

"Hello"
"My name is someberry."
I tried using the expression:

Code:

'#(?<!=)"(.*?)"(?!>)#i'
However, it seems to ignore the last negative lookahead.

Anyone have any ideas why, or a better expression I could use as an alternative? :-)

P.S. I know the current expression isn't foolproof; however, it doesn't need to be.
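
Editor's note: a minimal sketch of what the expression above matches on the sample line, in case it helps to see the pairing. As far as I can tell the lookahead is not being ignored. The lookbehind only rejects a quote that directly follows `=`, so the closing quote of the attribute is free to act as an opening quote, and every pair after it drifts by one. The variable names here are only for illustration.

```php
<?php
// Sample line from the post above.
$code = '<span class="strong">"Hello"</span> he said, "My name is someberry.".';

// The original expression, unchanged.
$pattern = '#(?<!=)"(.*?)"(?!>)#i';

// Collect every full match to see which quotes get paired up.
$count = preg_match_all($pattern, $code, $matches);

foreach ($matches[0] as $match) {
    echo $match, "\n";
}
// Prints:
// ">"
// "</span> he said, "
```

The first match starts at the quote that closes `class="strong"` and ends at the quote that opens `"Hello"`, which is why the replacement appears to land in the wrong places.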

Posted: Wed Jun 06, 2007 6:23 am
by stereofrog
For simple cases the following would be sufficient:

Code:

$re = '~ " [^"<>]+ " (?= [^<>]* ($ | <) )~x';
However, I don't think regexp is the right tool for processing HTML. Consider using a parser, e.g. HTML_SAX.
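
Editor's note: a hedged sketch of how this expression might be plugged into `preg_replace`. The capture group around the quoted text and the `(?: ... )` non-capturing group are my additions so the text can be reused in the replacement; the named entities `&ldquo;` and `&rdquo;` are one choice of output. The lookahead only accepts a closing quote if the next angle bracket after it is `<` (or there is none), which is what keeps attribute values out.

```php
<?php
// Sample line from the first post.
$code = '<span class="strong">"Hello"</span> he said, "My name is someberry.".';

// Same idea as the expression above, with a capture group added (an assumption,
// not part of the original post) so $1 holds the quoted text.
$re = '~ " ([^"<>]+) " (?= [^<>]* (?: $ | < ) )~x';

// Wrap the captured text in named curly-quote entities.
$result = preg_replace($re, '&ldquo;$1&rdquo;', $code);

echo $result, "\n";
// Prints:
// <span class="strong">&ldquo;Hello&rdquo;</span> he said, &ldquo;My name is someberry.&rdquo;.
```

This still only covers simple cases: a quoted phrase that contains a tag, or an attribute value that contains `>`, will not be handled.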

Posted: Wed Jun 06, 2007 6:44 am
by someberry
stereofrog wrote:For simple cases the following would be sufficient:

Code:

$re = '~ " [^"<>]+ " (?= [^<>]* ($ | <) )~x';
However, I don't think regexp is the right tool for processing HTML. Consider using a parser, e.g. HTML_SAX.
Hmm, I just tried it, and it is not exactly what I was looking for. I would like to do this:

Code:

preg_replace(
            '#(?<!=)"(.*?)"(?!>)#i',
            '“\\1”',
            $code);
Edit: I made quite a big typo; the added word in bold is what I meant to say.

Posted: Wed Jun 06, 2007 7:43 am
by feyd
Make sure to use HTML entities, not the character equivalents.

Posted: Wed Jun 06, 2007 7:54 am
by someberry
feyd wrote:Make sure to use HTML entities, not the character equivalents.
Of course; however, it would seem phpBB parses them into plain-text characters. :roll:

Posted: Wed Jun 06, 2007 7:57 am
by feyd
Only if you use the numeric form. The named form is what I was referring to.

&ldquo; and &rdquo;

Posted: Wed Jun 06, 2007 8:05 am
by someberry
feyd wrote:Only if you use the numeric form. The named form is what I was referring to.

&ldquo; and &rdquo;
Either way, the OP's question is still unanswered :)

Posted: Wed Jun 06, 2007 8:24 am
by stereofrog
I'd suggest the OP read the answers he's got and try to adapt them on his own. If the OP is waiting for the complete code, that is not going to happen from my side. ;)
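
Editor's note: one possible way to adapt the answers in this thread without a full parser, offered as a sketch rather than a finished solution. It splits the input into tags and text with `preg_split` and replaces quotes only in the text pieces. The function name `smarten_quotes` is hypothetical, and the tag pattern assumes no `>` appears inside attribute values.

```php
<?php
// Hypothetical helper, not from the thread: curl quotes only outside tags.
function smarten_quotes(string $html): string
{
    // Split on tags; the capture group keeps each tag in the result,
    // so pieces alternate text, tag, text, tag, ...
    $pieces = preg_split('~(<[^>]*>)~', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($pieces as $i => $piece) {
        // Even indexes are text between tags; odd indexes are the tags.
        if ($i % 2 === 0) {
            $pieces[$i] = preg_replace('~"([^"]*)"~', '&ldquo;$1&rdquo;', $piece);
        }
    }

    return implode('', $pieces);
}

echo smarten_quotes('<span class="strong">"Hello"</span> he said, "My name is someberry.".'), "\n";
// Prints:
// <span class="strong">&ldquo;Hello&rdquo;</span> he said, &ldquo;My name is someberry.&rdquo;.
```

Because each text piece is handled on its own, a quotation that opens before a tag and closes after it will not be paired, and the contents of script or style elements are treated as ordinary text. For input like that, a real parser remains the safer route.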