Need help w/ regular expressions
Posted: Sat Feb 21, 2009 7:56 am
Hi,
I'm having problems understanding how to create regular expressions. I need to convert a portion of a newsletter that is retrieved from a MySQL table to plain text for a mailing. This example is for one item in this portion of the newsletter:
<span style="font-weight: bold;">New: Must See </span>
<br style="font-weight: bold;" /><span style="font-weight: bold;">DVD Movie</span><br style="font-weight: bold;" />
<a style="font-weight: bold;" href="http://www.somesite.com/view_trailer.asp" target="_self">View Movie Trailer</a>
<br style="font-weight: bold;" /><a style="font-weight: bold;" href="http://www.somesite.php" target="_self">Purchase for Just $15</a>
Most of the time the items contained in this portion of the newsletter are similar to the formatting used in the example.
I need to be able to keep each item's text and links together so that the plain text will look something like:
New: Must See DVD Movie
View Movie Trailer http://www.somesite.com/view_trailer.asp
Purchase for Just $15 http://www.somesite.php
I found a regular expression for the links (which works but I don't understand it):
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
which returns the following in an array, but by itself it's not very usable: the only text I can retrieve is $matches[3], and I really need to get the text from the <span> tags too:
[2] => Array
(
[0] => <a style="font-weight: bold;" href="http://www.somesite.com/view_trailer.asp" target="_self">View Movie Trailer</a>
[1] => "
[2] => http://www.somesite.com/view_trailer.asp
[3] => View Movie Trailer
)
[3] => Array
(
[0] => <a style="font-weight: bold;" href="http://www.somesite.com/products/dvd_new.php" target="_self">Purchase for Just $15</a>
[1] => "
[2] => http://www.somesite.com/products/dvd_new.php
[3] => Purchase for Just $15
)
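For context, here is a minimal sketch of how a pattern like this is usually run to get per-link arrays of that shape. The post does not show the actual call, so the `/` delimiters, the `siU` modifiers, and the `PREG_SET_ORDER` flag are assumptions. Note that `U` inverts greediness, which is why the pattern's `??` and `*?` read backwards from normal.

```php
<?php
// Sample markup copied from the post (nowdoc, so "$15" is not interpolated).
$html = <<<'HTML'
<span style="font-weight: bold;">New: Must See </span>
<br style="font-weight: bold;" /><span style="font-weight: bold;">DVD Movie</span><br style="font-weight: bold;" />
<a style="font-weight: bold;" href="http://www.somesite.com/view_trailer.asp" target="_self">View Movie Trailer</a>
<br style="font-weight: bold;" /><a style="font-weight: bold;" href="http://www.somesite.php" target="_self">Purchase for Just $15</a>
HTML;

// The pattern from the post, unchanged:
//   <a\s[^>]*href=   an opening <a tag, any attributes, up to href=
//   (\"??)           group 1: an optional opening double quote
//   ([^\" >]*?)      group 2: the URL (no quote, space or ">")
//   \\1              the same quote (or nothing) that group 1 matched
//   [^>]*>           the rest of the opening tag
//   (.*)<\/a>        group 3: the link text, up to the closing </a>
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";

// Assumed call: s = "." matches newlines, i = case-insensitive,
// U = swap greedy and lazy quantifiers.
preg_match_all("/$regexp/siU", $html, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER, each $m is one link: [0] whole tag, [2] URL, [3] text.
foreach ($matches as $m) {
    echo trim(strip_tags($m[3])) . ' ' . $m[2] . "\n";
}
```

Run against the sample item, this finds the two links only; the `<span>` text is not matched by the pattern at all.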
Is this even doable with regular expressions, or is there a better way?
Any and all help will be greatly appreciated,
Joseph
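One possible approach, offered as a sketch rather than a definitive answer: instead of capturing the pieces into an array, rewrite each link in place as "text URL" and then strip the remaining tags, so the `<span>` text survives. The function name `newsletterToText` is hypothetical, and the pattern assumes every `href` value is double-quoted, as in the sample item. Markup that departs from that shape would be better handled by a real HTML parser.

```php
<?php
// Hypothetical helper; the name and the steps are illustrative only.
function newsletterToText($html)
{
    // 1. Flatten all whitespace so source line breaks do not leak into the output.
    $html = preg_replace('/\s+/', ' ', $html);

    // 2. Replace each link with "text URL" on its own line.
    //    Assumes double-quoted href values, as in the sample.
    $html = preg_replace_callback(
        '/<a\s[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/si',
        function ($m) {
            return "\n" . trim(strip_tags($m[2])) . ' ' . $m[1] . "\n";
        },
        $html
    );

    // 3. Drop the remaining tags (<span>, <br>) and decode HTML entities.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // 4. Collapse runs of spaces on each line and discard empty lines.
    $lines = array();
    foreach (explode("\n", $text) as $line) {
        $line = trim(preg_replace('/ +/', ' ', $line));
        if ($line !== '') {
            $lines[] = $line;
        }
    }
    return implode("\n", $lines);
}

// Usage with the sample item from the post.
$sample = <<<'HTML'
<span style="font-weight: bold;">New: Must See </span>
<br style="font-weight: bold;" /><span style="font-weight: bold;">DVD Movie</span><br style="font-weight: bold;" />
<a style="font-weight: bold;" href="http://www.somesite.com/view_trailer.asp" target="_self">View Movie Trailer</a>
<br style="font-weight: bold;" /><a style="font-weight: bold;" href="http://www.somesite.php" target="_self">Purchase for Just $15</a>
HTML;

echo newsletterToText($sample), "\n";
```

For the sample item this prints three lines: the combined span text, then each link's text followed by its URL. Because `<br>` tags are simply dropped, text separated only by `<br>` ends up on one line, which matches the wanted "New: Must See DVD Movie" but may not suit every item.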