Find <img> tags in a string. [SOLVED]
Posted: Tue May 03, 2005 7:00 pm
by Stryks
This probably isn't as straightforward as the subject suggests, but I'm really not sure how else to describe it.
I have an email generating script which allows you to insert images in standard HTML format. I want to embed the images, however, so that the user doesn't get the images-blocked warning, and to do this I need to convert the <img> tags.
So, let's say I insert an image using:
Code: Select all
<img src='http://www.mysite.com/email/full_logo.gif'>
I want that to be switched out for:
Code: Select all
<img alt="Mailer" src="cid:attach-1">
Then I need to pull out the image src so that I can explode it and take the filename to specify the image's actual location on the server.
This would need to loop through all instances of an <img> tag, and the attach number will need to increment each time.
Any tips on how to achieve this? Cheers
Posted: Tue May 03, 2005 8:20 pm
by Skara
Code: Select all
$foo = preg_replace("/<img src='\S+'>/i", '<img alt="Mailer" src="cid:attach-1">', $original);
that should work.

Posted: Tue May 03, 2005 8:32 pm
by Stryks
Regex just kills me. One day I'm going to have to sit down and figure out either how it works, or why the sadists who made it wanted to torture me so badly.
I can see what you are doing though, and it should work OK for the actual replace, but I have a lot of work to do before that, I think.
Firstly, I need to get a list of the image tags that were presented.
Code: Select all
$string = "Here <img src='http://www.mysite.com/email/full_logo.gif'> there <img src='http://www.mysite.com/email/full_logo.gif'> everywhere";
$result = preg_match_all("/<img(.*?)>/", $string, $matches);
if ($result) {
for($i=0; $i<count($matches); $i++) {
echo $matches[0][$i] . " (image $i)<br>";
}
} else {
echo "not found";
}
The above seems to return that information. So, now that I have a list of the tags, I should be able to replace them with the new info as per your example.
Then I just need to break down the results and embed the files found.
Will post back when I have made up the replace routine.
Posted: Tue May 03, 2005 8:39 pm
by Skara
ah, REs are a bugger at first, but are pretty fun once you learn how to use them.

Posted: Tue May 03, 2005 8:49 pm
by Stryks
Well ... ok ... forgive me if I've over simplified matters here, but this seems to work (although it does odd things if you view it in the browser)
Code: Select all
$result = preg_match_all("/<img(.*?)>/", $string, $matches);
if ($result) {
for($i=0; $i<count($matches); $i++) {
$string = preg_replace("\"" . $matches[0][$i] . "\"", "<img alt='Mailer' src='cid:attach-$i'>", $string);
}
}
echo $string;
As I said, it seems to work, but honestly, I don't really know what I'm doing.
Now all I want to do is grab the src info and explode it. My problem here is that I won't know if the user has used double or single quotes. Any tips on how to grab the src with either option?
Cheers
Posted: Tue May 03, 2005 8:52 pm
by John Cartwright
Upon insertion into the database, or wherever you are storing your string, replace all single quotes with double quotes. Keep it consistent.
Posted: Tue May 03, 2005 9:05 pm
by Skara
for single vs double quotes, simply replace all single quotes with double first, or vice versa.
Code: Select all
$string = str_replace('"',"'",$string);
you goofed your code though, as you simply replace $string each time.
either:
Code: Select all
$result = preg_match_all("/<img(.*?)>/", $string, $matches);
if ($result) {
for($i=0; $i<count($matches); $i++) {
$string = preg_replace("\"" . $matches[0][$i] . "\"", "<img alt='Mailer' src='cid:attach-$i'>", $string);
echo $string;
}
}
or:
Code: Select all
$result = preg_match_all("/<img(.*?)>/", $string, $matches);
if ($result) {
for($i=0; $i<count($matches); $i++) {
$string[$i] = preg_replace("\"" . $matches[0][$i] . "\"", "<img alt='Mailer' src='cid:attach-$i'>", $string);
}
}
foreach ($string as $one) {
echo $one;
}
now then, let's see...
First, you'll want the single to double, then you'll want the src info... If you knew they weren't going to use an alt param, you could simply do this:
Code: Select all
$result = preg_match_all("/<img src=['\"](\S+?)['\"][ ]?[\/]?>/i", $string, $matches);
// I added the "[ ]?[\/]?" in for the xhtml img tags. (hope I got that right.)
// in other words, "src=''>", "src='' />", "src='' >", or "src=''/>"
// The brackets are just for readability
// Since we're doing the src here, the ['\"] checks for either ' or "
// the \S is for non-whitespace characters
// the + means there has to be at least one character there
// the i at the end makes it case insensitive
if ($result) {
for($i=0; $i<count($matches); $i++) {
$string = "<img alt='Mailer' src='cid:attach-{$i}'>";
echo $string;
}
}
Ok.. I really don't know why you have a preg_replace in the for loop. Besides that, you've got it wrong. perl REs need to be enclosed in / or #. In other words, a valid version of your pRE would be:
Code: Select all
$string = preg_replace("/\"{$matches[0][$i]}\"/i", "text", $string);
But all that RE does is replace the one text with the second. And the second contains nothing of the first. Therefore, it does nothing. ^^;
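On the delimiter question: PCRE patterns in PHP accept any non-alphanumeric, non-backslash, non-whitespace character as a delimiter, so the choice of character matters less than escaping whatever goes between the delimiters. When a matched tag is reused as a pattern, preg_quote() takes care of that. A minimal sketch, assuming a two-image sample string like the one earlier in the thread (the URLs and variable names are only for illustration):

```php
<?php
// Sample body with two different image tags (hypothetical URLs).
$string = "Here <img src='http://www.mysite.com/email/a.gif'> there "
        . "<img src='http://www.mysite.com/email/b.gif'> everywhere";

preg_match_all('/<img(.*?)>/i', $string, $matches);

// $matches[0] holds the full tags; loop over that list, not over $matches itself.
foreach ($matches[0] as $i => $tag) {
    // preg_quote() escapes regex metacharacters such as "." and "?", plus the
    // delimiter named in the second argument, so the tag is matched literally.
    $pattern = '/' . preg_quote($tag, '/') . '/';
    // The limit of 1 replaces only the first remaining occurrence.
    $string  = preg_replace($pattern, "<img alt='Mailer' src='cid:attach-$i'>", $string, 1);
}

echo $string;
```

Since the tag is a literal string at this point, str_replace() would do the same job with no escaping at all.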
Posted: Tue May 03, 2005 9:24 pm
by Stryks
I'm not really sure what you mean by some of your previous post.
Firstly, the code I posted doesn't actually make two different versions of $string, but modifies the two different sections of the code. If you run it you should get something like the following in the browser.
Code: Select all
Here <img alt='Mailer' src='cid:attach-0'> there <img alt='Mailer' src='cid:attach-1'> everywhere
This is pretty much what I was hoping to achieve, as the new img tags are really just pointers to the embedded images.
Echoing the $string each pass gives something more like:
Code: Select all
Here <img alt='Mailer' src='cid:attach-0'> there <img src='http://www.mysite.com/email/full_logo.gif'> everywhereHere <img alt='Mailer' src='cid:attach-0'> there <img alt='Mailer' src='cid:attach-1'> everywhere
But yeah, if I have that string in order, the output is handled. Now I just need to embed the images, so I would loop through the $matches and for each, I want to pull the information between src=" and the closing ". If I can grab this then I can explode on / and grab the last one to get the filename. From there I can just tack it onto the filesystem location for that file and hand it across to the function which embeds the image.
Does this make sense?
So basically, now that I'm replacing the single quotes for doubles, what would I use to preg_match the content of each src value?
Thanks for the help this far. It's really appreciated.
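One way to read the src value without normalising the quotes first is a backreference: capture the opening quote and require the same character to close it. A minimal sketch, assuming a current PHP version (7.1 or later for the type declarations) and tags taken from $matches[0]; the function name img_src_filename is made up for illustration:

```php
<?php
/**
 * Return the file name from the src attribute of one <img> tag,
 * or null when the tag has no quoted src. (Hypothetical helper.)
 */
function img_src_filename(string $tag): ?string
{
    // (["\']) captures the opening quote; \1 requires the same quote to close.
    if (!preg_match('/\bsrc\s*=\s*(["\'])(.*?)\1/i', $tag, $m)) {
        return null;
    }
    // Keep only the path part of the URL, then take its last segment.
    $path = parse_url($m[2], PHP_URL_PATH);
    return is_string($path) ? basename($path) : null;
}

echo img_src_filename("<img src='http://www.mysite.com/email/full_logo.gif'>"), "\n";
echo img_src_filename('<img border="0" src="http://www.mysite.com/email/full_logo.gif" width="10">'), "\n";
```

Both calls print full_logo.gif, whichever quote style and attribute order the tag uses.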
Posted: Tue May 03, 2005 9:35 pm
by Stryks
I was clear as mud on that one ... Let me try again.
The image embedder can only embed files that are local, not at a web URL. So I want to take each file which is linked, take the filename, and then rewrite the location of the file to point to the server location for that file.
The images which are linked are restricted to a specific location, so all I need is the filename, as I know the file exists if it is linked. The purpose of the URL style of calling the image is so that I can easily give a web preview of html based emails before sending.
So, I've managed to get a list of the images linked in the $string. These are stored in $matches.
So, for each $matches, I want to replace the original <img> tag with another one which is just a placeholder to the original, uniquely named with its array position in $matches ($i).
Then, I can go through and embed the images once I have manipulated the src address, tagging them as they are embedded with the matching unique name given to the replaced URL.
This way, the images are embedded while replacing the <img> tags in the $string, and then $string is sent to the mail body, and the embedded images are linked.
I hope this helps explain why I'm appearing to lose the original <img> tags, as well as my various other weird approaches.
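The plan described in this post (swap each tag for a numbered placeholder while remembering the original src) can also be done in a single pass with preg_replace_callback(). A rough sketch, assuming a current PHP version with closures, which were not available when this thread was written; the function name and the shape of the returned array are made up for illustration:

```php
<?php
/**
 * Replace every <img> tag in $html with a cid placeholder and collect
 * the original src values, keyed by placeholder name. (Hypothetical helper.)
 *
 * @return array{0: string, 1: array<string, string>} [new html, cid => src]
 */
function swap_images_for_cids(string $html): array
{
    $sources = [];
    $count   = 0;

    $html = preg_replace_callback(
        '/<img\b[^>]*>/i',
        function (array $m) use (&$sources, &$count): string {
            // Pull the src out of the matched tag; leave tags without one untouched.
            if (!preg_match('/\bsrc\s*=\s*(["\'])(.*?)\1/i', $m[0], $src)) {
                return $m[0];
            }
            $count++;
            $cid = 'attach-' . $count;
            $sources[$cid] = $src[2];
            return '<img alt="Mailer" src="cid:' . $cid . '">';
        },
        $html
    );

    return [$html, $sources];
}

[$body, $sources] = swap_images_for_cids(
    "Here <img src='http://www.mysite.com/email/a.gif'> there <img src=\"http://www.mysite.com/email/b.gif\"> everywhere"
);
echo $body, "\n";
print_r($sources);
```

The $sources array can then be walked separately to work out each local file path and hand it to the embedder under the matching cid name.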
Posted: Tue May 03, 2005 9:44 pm
by Skara
oooh, oh oh. I completely missed the first $string in preg_match_all(). XD
Yeah, ok. So you'd need something like...
Code: Select all
$result = preg_match_all("/<img src=['\"](\S+?)['\"][ ]?[\/]>/i", $string, $matches);
if ($result) {
for($i=0; $i<count($matches); $i++) {
$string = "<img alt='Mailer' src='cid:attach-{$i}'>";
}
}
(If that first RE works,) That should be all you need. If you expect alt params, then this (might) work:
Code: Select all
"/<img (?=alt=['\"]\S+?['\"] )?src=['\"](\S+?)['\"](?= alt=['\"]\S+?['\"])?[ ]?[\/]>/i"
Posted: Wed May 04, 2005 2:05 am
by Stryks
I don't know if I'm just doing a really bad job of explaining this or if I'm just missing the point, but you seem to be doing the opposite of what I am trying to do.
That last snippet appears to grab the links out of the original body and convert it to the header I would like to insert.
What I wanted to do was to change the link in the original to the new header and keep a copy of the original image call so that I can process it separately.
So, I start out with a body text along the lines of:
Code: Select all
$string = 'Hello <font size=\'4\'>there</font>, <p>';
$string .= '<i>Your</i> personal photograph to this message - sent to %email%.<p>';
$string .= '<img src="http://www.mysite.com/email/full_logo.gif">';
$string .= "Sincerely, <br>";
$string .= "manager";
Then I find the image tags:
Code: Select all
$result = preg_match_all("/<img(.*?)>/", $string, $matches);
if ($result) {
//take next action
}
Now I want to find the exact strings I found before and replace them with the new headers. If I could do that then I could get the original URLs and then pass them through the embedder.
But for some reason, I can't even get it to find any more than 2 image tags in a longer string.
Optimally I would run a whole webpage through this, but it seems to stop after a certain number of characters. Is there some kind of input limit I don't know about here?
This is INSANE
Posted: Wed May 04, 2005 7:08 am
by Stryks
It seems that I can't even do something which for all intents and purposes should be quite easy.
Consider the following.
Code: Select all
$body = '
<table width="560" border="0" align="center" cellpadding="0" cellspacing="0" class="thickborder">
<tr>
<td colspan="3"><img src="http://www.absolutelygorgeous.com.au/email/full_logo.gif"></td>
</tr>
<tr>
<td colspan="3"><img src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="1" height="35" border="0"></td>
</tr>
<tr>
<td width="40"><img src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="40" height="1" border="0"></td>
<td><span class="borderedImage"></span><img src="http://www.absolutelygorgeous.com.au/link_images/mum.gif" width="500" height="228"></td>
<td width="40"><img src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="40" height="1" border="0"></td>
</tr>
<tr>
<td colspan="3"><img src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="1" height="20" border="0"></td>
</tr>
<tr>
<td></td>
<td class="stdText">
<blockquote>
<p>
Spoil your mum this mothers day as we don\'t mum really wants a blender or a vacuum cleaner for Mother\'s Day.<br><br>
What she really wants is something pretty, something special, something luxurious - something she wouldn\'t buy for herself. So show your mum how much she means to you, with a gift from Adore Beauty and Accessories.<br><br>
There are plenty of fabulous ideas in the gift guide - for under $30, under $50, or pitch in with some siblings and get her this fabulous Spencer and Rutherford bag. Then she won\'t mind so much when you borrow it.
</p>
<p>
<img border="0" src="http://www.absolutelygorgeous.com.au/catalog/images/Raunchy-Rosie.jpg" width="105" height="82">
<img border="0" src="http://www.absolutelygorgeous.com.au/catalog/images/Kinky-Kate.jpg" width="126" height="127">
<img border="0" src="http://www.absolutelygorgeous.com.au/catalog/images/Molly-Minx.jpg" width="87" height="85">
</p>
</blockquote>
</td>
<td></td>
</tr>
</table>
';
$result = preg_match_all("/<img(.*?)>/si", $body, $matches);
if ($result) {
for($i=0; $i<count($matches); $i++) {
$path_result = preg_match("/src=['\"](\S+?)['\"]/i", $matches[0][$i], $full_path);
$URL = explode("/", $full_path[1]);
echo $URL[count($URL)-1] . "<br>";
}
}
I set $body to hold a clip out of a page, and I run that through the above routine. For the first two images, it outputs as expected. Then it just skips the rest.
I have been looking and testing for hours and I just can't seem to get it to find any more than 2.
It's important that I don't search for '<img src', as I don't really have any control over what order the user enters the parameters in. In fact, under certain circumstances I need them to carry through, but that's not really the problem just now. To begin with I'd be stoked just to get it to find all of the image tags and not just the first two.
Any help would be great. Cheers.
Posted: Wed May 04, 2005 1:24 pm
by Skara
when in doubt, print_r($matches);
Which gives:
Code: Select all
Array
(
[0] => Array
(
[0] => <img src="http://www.absolutelygorgeous.com.au/email/full_logo.gif">
[1] => <img src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="1" height="35" border="0">
[2] => <img src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="40" height="1" border="0">
[3] => <img src="http://www.absolutelygorgeous.com.au/link_images/mum.gif" width="500" height="228">
[4] => <img src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="40" height="1" border="0">
[5] => <img src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="1" height="20" border="0">
[6] => <img border="0" src="http://www.absolutelygorgeous.com.au/catalog/images/Raunchy-Rosie.jpg" width="105" height="82">
[7] => <img border="0" src="http://www.absolutelygorgeous.com.au/catalog/images/Kinky-Kate.jpg" width="126" height="127">
[8] => <img border="0" src="http://www.absolutelygorgeous.com.au/catalog/images/Molly-Minx.jpg" width="87" height="85">
)
[1] => Array
(
[0] => src="http://www.absolutelygorgeous.com.au/email/full_logo.gif"
[1] => src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="1" height="35" border="0"
[2] => src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="40" height="1" border="0"
[3] => src="http://www.absolutelygorgeous.com.au/link_images/mum.gif" width="500" height="228"
[4] => src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="40" height="1" border="0"
[5] => src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="1" height="20" border="0"
[6] => border="0" src="http://www.absolutelygorgeous.com.au/catalog/images/Raunchy-Rosie.jpg" width="105" height="82"
[7] => border="0" src="http://www.absolutelygorgeous.com.au/catalog/images/Kinky-Kate.jpg" width="126" height="127"
[8] => border="0" src="http://www.absolutelygorgeous.com.au/catalog/images/Molly-Minx.jpg" width="87" height="85"
)
)
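You're counting the big array. count($matches) is 2 (the list of full matches plus the list for the one capture group), not the number of images found. Change this: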
Code: Select all
for($i=0; $i<count($matches); $i++) {
to this:
Code: Select all
for($i=0; $i<count($matches[1]); $i++) {
or better yet:
*shrugs* I just like to use foreach when I can.
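A foreach version of the loop, which sidesteps the count() question entirely, might look like the following. It is a sketch of the idea rather than code taken from the post; the shortened $body and the $filenames array are made up for illustration:

```php
<?php
// Shortened sample body with three images (hypothetical markup).
$body = '<td><img src="http://www.absolutelygorgeous.com.au/email/full_logo.gif"></td>'
      . '<td><img src="http://www.absolutelygorgeous.com.au/email/pixel_trans.gif" width="1"></td>'
      . '<p><img border="0" src="http://www.absolutelygorgeous.com.au/link_images/mum.gif"></p>';

$filenames = [];
if (preg_match_all("/<img(.*?)>/si", $body, $matches)) {
    // count($matches) is 2 here (full matches + one capture group),
    // while count($matches[0]) is the number of tags actually found.
    foreach ($matches[0] as $tag) {
        if (preg_match("/src=['\"](\S+?)['\"]/i", $tag, $full_path)) {
            $URL = explode("/", $full_path[1]);
            $filenames[] = $URL[count($URL) - 1];
        }
    }
}

echo implode("<br>", $filenames);
```

Iterating over $matches[0] directly means there is no index to get wrong, at the cost of losing $i unless the `$i => $tag` form is used.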
Also, you could change:
Code: Select all
$result = preg_match_all("/<img(.*?)>/si", $body, $matches);
if ($result) {
to simply:
Code: Select all
if (preg_match_all("/<img(.*?)>/si", $body, $matches)) {
Posted: Wed May 04, 2005 4:22 pm
by Stryks
See .. now I knew it would be something I was missing. Sometimes I just can't see the answer for looking.
Thanks a lot for the replies. Been a massive help.
Cheers

Posted: Wed May 04, 2005 11:52 pm
by Stryks
Just in case this is useful to someone someday, this is what I finally arrived at.
Code: Select all
// Process the body
if (preg_match_all("/<img(.*?)>/si", $body, $matches)) {
for($i=0; $i<count($matches[0]); $i++) {
// find and retain all important parameters for current link
if(preg_match_all("/(height=['\"]|width=['\"]|class=['\"])(\S+?)['\"]/i", $matches[0][$i], $parameters)) {
$parameters = " " . implode(" ", $parameters[0]);
} else {
$parameters = "";
}
// find and process the src - link each unique image once and set reference
if(preg_match("/src=['\"](\S+?)['\"]/i", $matches[0][$i], $full_path)) {
$prev_used = false;
// Check each src against the previous ones to make sure the same image isn't linked twice
for($x=0; $x<$i; $x++) {
preg_match("/src=['\"](\S+?)['\"]/i", $matches[0][$x], $test_var);
if ($full_path[1] == $test_var[1]) {
$prev_used = true;
break;
}
}
// Link all unique entries. Non unique ones are linked to the previously found link.
if (!$prev_used) {
$URL = explode("/", $full_path[1]);
$mail->AddEmbeddedImage("/absolut/email/" . $URL[count($URL)-1], "my-attach-" . $x, $URL[count($URL)-1]);
}
}
// replace the original img with the new tags, including any other tags found
$body = str_replace($matches[0][$i], '<img alt="PHPMailer" src="cid:my-attach-' . $x . '"' . $parameters . '>', $body);
}
}
Again, thanks for all the help.
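For anyone adapting this, the inner loop that re-scans earlier tags could be swapped for a lookup array keyed by src. A minimal sketch of just that bookkeeping, with the mailer call left out; the function name and the returned arrays are made up for illustration, and the actual embedding would still go through something like the AddEmbeddedImage() call in the code above:

```php
<?php
/**
 * Rewrite <img> tags to cid references, giving repeated images the same cid.
 * Returns [new body, list of cid => file name to embed]. (Hypothetical helper.)
 */
function rewrite_images(string $body): array
{
    $cidBySrc = [];   // src URL => cid already assigned
    $toEmbed  = [];   // cid => file name, one entry per unique image

    if (!preg_match_all('/<img\b[^>]*>/i', $body, $matches)) {
        return [$body, $toEmbed];
    }

    // array_unique() avoids processing byte-identical tags twice.
    foreach (array_unique($matches[0]) as $tag) {
        if (!preg_match('/\bsrc\s*=\s*(["\'])(.*?)\1/i', $tag, $src)) {
            continue; // no usable src, leave the tag alone
        }
        if (!isset($cidBySrc[$src[2]])) {
            $cid = 'my-attach-' . count($cidBySrc);
            $cidBySrc[$src[2]] = $cid;
            $toEmbed[$cid] = basename($src[2]);
        }
        // Keep width/height/class, as the code above does.
        preg_match_all('/\b(?:height|width|class)=(["\'])\S+?\1/i', $tag, $keep);
        $extra = $keep[0] ? ' ' . implode(' ', $keep[0]) : '';
        $new = '<img alt="PHPMailer" src="cid:' . $cidBySrc[$src[2]] . '"' . $extra . '>';
        $body = str_replace($tag, $new, $body);
    }

    return [$body, $toEmbed];
}

[$out, $embed] = rewrite_images(
    '<img src="http://x.example/email/a.gif" width="40">'
    . '<img src="http://x.example/email/a.gif" width="1">'
    . "<img src='http://x.example/email/b.jpg'>"
);
echo $out, "\n";
print_r($embed);
```

Each entry in $embed would then be embedded once, and every tag that shares a src points at the same cid.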