Finding the right regex for the preg_replace() function

Maxx
Forum Newbie
Posts: 1
Joined: Wed Aug 15, 2007 9:50 pm

Post by Maxx »

feyd | Please use the code and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read: Posting Code in the Forums (http://forums.devnetwork.net/viewtopic.php?t=21171) to learn how to do it too.


I have an article database in which each article has a format like this (example only):

[syntax="html"]<div><p>Worldwide statistics show that colon cancer kills more than 600,000 people every year.  Now, a new study confirms that diet can play a powerful role in treatment.  VOA's Melinda Smith has details.</p>
<p>Fifty-five year old John Coughlin looks like the picture of health.  And he thought he was -- until a routine colonoscopy revealed he was in stage three of colon cancer.</p>
<p>Stage three means that tumor cells have spread to other organs and lymph nodes near the colon. "I went through six weeks of concurrent radiation and chemotherapy.  In December of that year I had major surgery to remove the lower part of my colon and that was followed by six months of weekly chemotherapy.</p>
<p>Someone Writethis</p></div>[/syntax]
Each article starts with a <div> tag, and each paragraph is wrapped in a <p> tag.
The last <p> tag contains the author of the article.
I want to replace <p>Someone Writethis</p> with <p>Author: Someone Writethis</p>.

With my limited knowledge, I wrote my code like this:

Code: Select all

$body=preg_replace('/<p>(.*?)<\/p><\/div>/','<p>Author: $1</p></div>',$body);
but the result is not what I want:

Code: Select all

<div><p>Author: Worldwide statistics show that colon cancer kills more than 600,000 people every year.  Now, a new study confirms that diet can play a powerful role in treatment.  VOA's Melinda Smith has details.</p>
<p>Fifty-five year old John Coughlin looks like the picture of health.  And he thought he was -- until a routine colonoscopy revealed he was in stage three of colon cancer.</p>
<p>Stage three means that tumor cells have spread to other organs and lymph nodes near the colon. "I went through six weeks of concurrent radiation and chemotherapy.  In December of that year I had major surgery to remove the lower part of my colon and that was followed by six months of weekly chemotherapy.</p>
<p>Someone Writethis</p></div>
It adds "Author:" to the first <p> tag instead of the last one.

How do I write a correct regular expression for this case?



Forum Rules (http://forums.devnetwork.net/viewtopic.php?t=30037), Section 1.1: "1. Select the correct board for your query. Take some time to read the guidelines in the sticky topic."
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

You don't need a regex if that's all you want to do.

Code: Select all

$body = str_replace('Someone Writethis', 'Author: Someone Writethis', $body);
Or, if the author name changes with every entry, then you would need a regex.

Code: Select all

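// Capture the paragraph right before </div>. [^<]+ keeps the match inside a single <p>
// (this assumes the author paragraph contains no nested tags).
preg_match_all('#<p>([^<]+)</p></div>#', $body, $matches);

// Your results will be in $matches, ready for you to loop through or to use with preg_replace_callback().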
echo '<pre>';
print_r($matches);
echo '</pre>';
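
For the preg_replace_callback() route mentioned above, here is a minimal sketch. The sample $body is made up for illustration, and the pattern uses a negated character class ([^<]+) rather than .+? so the match cannot start at an earlier paragraph; that substitution is my assumption, not something the post states.

```php
<?php
// Hypothetical single-line article body, for illustration only.
$body = '<div><p>First paragraph.</p><p>Someone Writethis</p></div>';

// The callback receives the match array:
// $m[0] is the whole match, $m[1] is the captured author name.
$result = preg_replace_callback(
    '#<p>([^<]+)</p></div>#',
    function (array $m): string {
        return '<p>Author: ' . trim($m[1]) . '</p></div>';
    },
    $body
);

echo $result, "\n";
```

A callback is only worth it if you need to process the captured name (trimming, escaping, lookups). For a plain prefix, preg_replace() with $1 does the same job.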
csh
Forum Newbie
Posts: 3
Joined: Mon Sep 10, 2007 12:25 pm

Post by csh »

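The reason it's changing the first one is not really a greedy/non-greedy problem: .*? is already non-greedy. The regex engine always takes the leftmost possible match, so it starts at the first <p>, and the lazy .*? simply keeps expanding until it reaches </p></div>. I think if you change your code to this:

$body=preg_replace('/<p>([^<]*)<\/p><\/div>/','<p>Author: $1</p></div>',$body);

Since [^<]* cannot run past another tag, only the last paragraph can match. The U modifier would not fix the problem: it inverts greediness, so .*? would become greedy.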
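
To see how these patterns actually behave, here is a small sketch you can run. The sample $body is hypothetical and deliberately on one line, because without the s modifier the dot does not match newlines. The variable names are mine.

```php
<?php
// Hypothetical article body on a single line, for illustration only.
$body = '<div><p>First paragraph.</p><p>Second paragraph.</p><p>Someone Writethis</p></div>';
$replacement = '<p>Author: $1</p></div>';

// Original pattern: the match starts at the first <p> (leftmost match wins),
// and the lazy .*? expands until </p></div> follows.
$lazy = preg_replace('/<p>(.*?)<\/p><\/div>/', $replacement, $body);

// With the U modifier, greediness is inverted, so .*? becomes greedy.
// The match still starts at the first <p>.
$ungreedy = preg_replace('/<p>(.*?)<\/p><\/div>/U', $replacement, $body);

// Negated character class: [^<]* cannot cross a tag,
// so only the paragraph directly before </div> can match.
$fixed = preg_replace('/<p>([^<]*)<\/p><\/div>/', $replacement, $body);

echo $lazy, "\n";
echo $ungreedy, "\n";
echo $fixed, "\n";
```

On this input the first two calls put "Author:" in the first paragraph, and only the third puts it in the last one. Note that [^<]* assumes the author paragraph has no inline tags such as <b> or <a>.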