replace <p> by \\n\\n
Posted: Mon May 10, 2010 3:18 am
by lclqt12
Hi all,
I have a string: $str = "abc<p> cde</p>, <p>fasd";
The expected result is: "abc\n\n cde, \n\nfasd"
How can I do that (using a regular expression)?
Re: replace <p> by \\n\\n
Posted: Mon May 10, 2010 3:25 am
by garygay
hi,
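// str_replace() takes plain strings, not regex patterns, so no /.../ delimiters;
// double quotes are needed for \n to be a real newline
$str = str_replace('<p>', "\n\n", $str);
$str = str_replace('</p>', '', $str);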
Re: replace <p> by \\n\\n
Posted: Mon May 10, 2010 4:19 am
by lclqt12
Thank you for your help.
However, in my text the tag is not always a plain <p>. It could be <p font ....> or <p margin ... >
So I wonder how I can replace every <p ....> with \n\n
Re: replace <p> by \\n\\n
Posted: Mon May 10, 2010 9:20 am
by AbraCadaver
Code: Select all
$str = preg_replace('#</?p[^>]*>#', "\n\n", $str);
Re: replace <p> by \\n\\n
Posted: Mon May 10, 2010 9:44 am
by ridgerunner
AbraCadaver, your regex has a problem: It replaces the end of paragraph tag </p> with \n\n (only the opening <p> should have this substitution). Also, you should probably specify the "i" ignore-case modifier.
The opening and closing tags need to be handled separately like so:
Code: Select all
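// replace <p ...> opening tags with double linefeeds
// (double quotes so \n is a real newline; \b keeps <pre> etc. from matching)
$str = preg_replace('/<p\b[^>]*+>/i', "\n\n", $str);
// strip </p> closing tags (case-insensitively, to match the /i above)
$str = str_ireplace('</p>', '', $str);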
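To make the difference concrete, here is a minimal sketch that runs both approaches on the sample string from the original question. The variable names ($input, $onePattern, $twoSteps) are only illustrative, and the replacement is written in double quotes so that \n is an actual newline:

```php
<?php
// Sample input from the original question.
$input = 'abc<p> cde</p>, <p>fasd';

// Single pattern: opening AND closing tags both become "\n\n".
$onePattern = preg_replace('#</?p[^>]*>#', "\n\n", $input);

// Two steps: only opening tags become "\n\n"; closing tags are dropped.
$twoSteps = preg_replace('/<p\b[^>]*+>/i', "\n\n", $input);
$twoSteps = str_ireplace('</p>', '', $twoSteps);

var_dump($onePattern === "abc\n\n cde\n\n, \n\nfasd"); // bool(true): extra blank line after "cde"
var_dump($twoSteps === "abc\n\n cde, \n\nfasd");       // bool(true): the requested result
```

The single pattern leaves an extra blank line where each </p> was, which is why the closing tags are handled in a separate step.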
If you're curious, here is a more complex (but commented) regex which does it in one step:
Code: Select all
$str = preg_replace('%
    <p\b[^>]*>        # match an opening paragraph tag with attributes
    (                 # begin group 1 to capture paragraph contents
      [^<]*+          # consume everything up to next < tag start char
      (?:             # begin unrolling the loop...
        (?!</?p\b)    # at a position that is not a paragraph tag
        <             # match the beginning of a non-p tag
        [^<]*+        # consume everything up to next < tag start char
      )*+             # repeat the loop as many times as required
    )                 # end group 1 capturing paragraph contents
    (?=</?p\b|$)      # stop matching on <p*>, </p> or end of string
    (?:</p>)?         # if there is a closing </p> match and discard it
    %ix', "\n\n$1", $str);
Note that neither of these solutions gets rid of other embedded HTML tags that may be between the <p> tags.
For example, in "abc<p> <em>cde</em></p>, <p>fasd" the <em> and </em> tags are left in the result.
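If the other tags should go as well, one possible approach (not something proposed in this thread) is to convert the opening <p> tags first and then pass the result through PHP's built-in strip_tags(). The helper name paragraphsToText below is hypothetical, and this is only a sketch: strip_tags() is not a full HTML parser, so badly formed markup may need something more robust.

```php
<?php
// Hypothetical helper (name is an assumption): paragraphs to blank lines, then drop all tags.
function paragraphsToText(string $html): string
{
    // Opening <p ...> tags become a blank line; fall back to the input if the regex fails.
    $text = preg_replace('/<p\b[^>]*+>/i', "\n\n", $html) ?? $html;
    // strip_tags() removes whatever tags remain, including </p> and <em>.
    return strip_tags($text);
}

var_dump(paragraphsToText('abc<p> <em>cde</em></p>, <p>fasd') === "abc\n\n cde, \n\nfasd"); // bool(true)
```

Doing the <p> replacement before strip_tags() matters: calling strip_tags() first would remove the <p> tags too, leaving nothing to turn into linefeeds.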
Re: replace <p> by \\n\\n
Posted: Mon May 10, 2010 10:01 am
by AbraCadaver
Ahh yes, I misread the original post and thought the closing tag should be replaced as well. I didn't notice the second opening tag.