replace <p> by \\n\\n


lclqt12
Forum Newbie
Posts: 3
Joined: Thu May 06, 2010 11:25 pm


Post by lclqt12 »

Hi all,

I have some text: $str = "abc<p> cde</p>, <p>fasd";
The expected result is: "abc\n\n cde, \n\nfasd"

How can I do that (using a regular expression)?
garygay
Forum Newbie
Posts: 5
Joined: Thu Mar 04, 2010 4:03 am

Re: replace <p> by \\n\\n

Post by garygay »

hi,

// str_replace() takes plain strings, not regex patterns; double quotes give real newlines
$str = str_replace('<p>', "\n\n", $str);
$str = str_replace('</p>', '', $str);
lclqt12
Forum Newbie
Posts: 3
Joined: Thu May 06, 2010 11:25 pm

Re: replace <p> by \\n\\n

Post by lclqt12 »

Thank you for your support.
But as I said, in my text, <p> is not always a plain <p>. It could be <p font ....> or <p margin ... >.
Therefore, I wonder how I could replace every <p ....> with \n\n.
AbraCadaver
DevNet Master
Posts: 2572
Joined: Mon Feb 24, 2003 10:12 am
Location: The Republic of Texas

Re: replace <p> by \\n\\n

Post by AbraCadaver »

$str = preg_replace('#</?p[^>]*>#', "\n\n", $str);
ridgerunner
Forum Contributor
Posts: 214
Joined: Sun Jul 05, 2009 10:39 pm
Location: SLC, UT

Re: replace <p> by \\n\\n

Post by ridgerunner »

AbraCadaver, your regex has a problem: It replaces the end of paragraph tag </p> with \n\n (only the opening <p> should have this substitution). Also, you should probably specify the "i" ignore-case modifier.

The opening and closing tags need to be handled separately like so:

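// replace <p ...> opening tags with double linefeeds
// (double quotes so PHP emits real newlines; \b so <pre>, <param> etc. are not matched)
$str = preg_replace('/<p\b[^>]*+>/i', "\n\n", $str);
// strip </p> closing tags (case-insensitively, to match the "i" modifier above)
$str = str_ireplace('</p>', '', $str);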
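As a quick check of the two-step approach, here is a minimal sketch run against the sample string from the original post. The helper name replace_p_tags is hypothetical and not from the thread; it assumes real newline characters are wanted (hence the double-quoted "\n\n"), that tags such as <pre> should be left alone (hence the \b), and that </P> in any case should be stripped.

```php
<?php
// Hypothetical helper (not from the thread): two-step <p> replacement.
function replace_p_tags(string $str): string
{
    // Opening <p ...> tags (any attributes, any case) become two newlines.
    // \b keeps tags like <pre> or <param> from matching.
    $str = preg_replace('/<p\b[^>]*+>/i', "\n\n", $str);
    // Closing </p> tags are simply removed, regardless of case.
    return str_ireplace('</p>', '', $str);
}

$str = "abc<p> cde</p>, <p>fasd";
var_dump(replace_p_tags($str) === "abc\n\n cde, \n\nfasd"); // bool(true)
```

Wrapping the two calls in one function is only a convenience for testing; the two statements work just as well inline.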
If you're curious, here is a more complex (but commented) regex which does it in one step:

$str = preg_replace('%
    <p\b[^>]*>      # match an opening paragraph tag with attributes
    (               # begin group 1 to capture paragraph contents
      [^<]*+        # consume everything up to next < tag start char
      (?:           # begin unrolling the loop...
        (?!</?p\b)  # at a position that is not a paragraph tag
        <           # match the beginning of a non-p tag
        [^<]*+      # consume everything up to next < tag start char
      )*+           # repeat the loop as many times as required
    )               # end group 1 capturing paragraph contents
    (?=</?p\b|$)    # stop matching on <p*>, </p> or end of string
    (?:</p>)?       # if there is a closing </p> match and discard it
    %ix', "\n\n\$1", $str);
Note that neither of these solutions gets rid of other HTML tags that may be embedded between the <p> tags.
For example, in "abc<p> <em>cde</em></p>, <p>fasd" the <em> tags are left in place.
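If those remaining tags should go as well, one option is to run PHP's built-in strip_tags() after the paragraph replacement. This is only a sketch, assuming all other markup can simply be discarded; the helper name p_to_text is hypothetical and not from the thread.

```php
<?php
// Hypothetical helper (not from the thread): convert <p> tags, then drop all other tags.
function p_to_text(string $str): string
{
    // Opening <p ...> tags become two newlines.
    $str = preg_replace('/<p\b[^>]*+>/i', "\n\n", $str);
    // strip_tags() removes every remaining tag, including </p> and <em>.
    return strip_tags($str);
}

$str = "abc<p> <em>cde</em></p>, <p>fasd";
var_dump(p_to_text($str) === "abc\n\n cde, \n\nfasd"); // bool(true)
```

Keep in mind that strip_tags() is not an HTML parser, so text containing a stray literal < may lose more than intended.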
AbraCadaver
DevNet Master
Posts: 2572
Joined: Mon Feb 24, 2003 10:12 am
Location: The Republic of Texas

Re: replace <p> by \\n\\n

Post by AbraCadaver »

Ahh yes, I misread the original post and thought the closing tag should be replaced as well. I didn't notice the second opening tag.