search for img tags, getting their contents and replacing it

Posted: Sun Nov 08, 2009 2:57 am
by smartie_on_computer
Hey, my first post here so here it goes...

I'm an experienced programmer, but sadly I can't get my head around regex.

I've built my site on a CMS that uses TinyMCE as the editor, and when pages are loaded with images that have been resized, they look rather pixelated:

Code:

<img style="float: right;" src="images/uploads/DSC00159.JPG" alt="LCD Screen" height="243" width="182">
So I pulled out an old ImageScale script I made a few years ago, and I want to turn the above into this, to make the images cleaner and faster to load:

Code:

<img style="float: right;" src="imagescale.php?img=images/uploads/DSC00159.JPG&w=182&h=243" alt="LCD Screen">
If you want to have a look at my site, feel free:
http://www.davevaughan.com/roman (yeah, it's hosted on my brother's site atm)

Any help would be appreciated

Cheers
Roman

Re: search for img tags, getting their contents and replacing it

Posted: Sun Nov 08, 2009 9:40 pm
by ridgerunner
Here is a regex that should do the trick:

Code:

$text = preg_replace('%
    (<img[^>]+?src\s*=\s*")  # capture (1) the <img tag up to the src = URL
    ([-./\w]+?\.jpe?g)       # capture (2) relative URL (no query ? or fragment #)
    ("[^>]*?)                # capture (3) everything up to the height attribute
    \s*height\s*=\s*"?       # match and discard height attribute name =
    (\d+)                    # capture (4) height attribute value
    "?([^>]*?)               # capture (5) everything up to the width attribute
    \s*width\s*=\s*"?        # match and discard width attribute name =
    (\d+)                    # capture (6) width attribute value
    "?([^>]*?)               # capture (7) everything up to the end of tag
    /?>                      # match end of tag (optional self closing /)
    %ix', 
    '$1imagescale.php?img=$2?w=$6&h=$4$3$5$7 />', $text);
Notes:
  • The URL filename must end in .JPG or .JPEG
  • The src, height and width attributes must occur in order
  • Other attributes can be placed anywhere and are preserved (but moved)
  • The height and width attribute values can occur with or without quotes
  • The source IMG tag may or may not be self-closing i.e. ends with: "/>"
  • The resulting target IMG tag will always be self-closing />
  • The src, height and width attributes may have spaces around the equals signs
The above regex takes these examples:

Code:

<img style="float: right;" src="images/uploads/DSC00159.JPG" alt="LCD Screen" height="243" width="182">
<img style="float: right;" src="images/uploads/DSC00159.JPG" height="243" alt="LCD Screen" width="182">
<img alt="LCD Screen" style="float: right;" src="images/uploads/DSC00159.JPG" height="243" width="182">
<img src="images/uploads/DSC00159.JPG" height="243" width="182">
<img src="images/uploads/DSC00159.JPG" height=243 width=182>
<img src = "images/uploads/DSC00159.JPG" height = "243" width = "182">
<img src = "images/uploads/DSC00159.JPG" height = 243 width = 182>
<img src = "images/uploads/DSC00159.JPG" height = 243 width = 182  />
and produces these results:

Code:

<img style="float: right;" src="imagescale.php?img=images/uploads/DSC00159.JPG?w=182&h=243" alt="LCD Screen" />
<img style="float: right;" src="imagescale.php?img=images/uploads/DSC00159.JPG?w=182&h=243" alt="LCD Screen" />
<img alt="LCD Screen" style="float: right;" src="imagescale.php?img=images/uploads/DSC00159.JPG?w=182&h=243" />
<img src="imagescale.php?img=images/uploads/DSC00159.JPG?w=182&h=243" />
<img src="imagescale.php?img=images/uploads/DSC00159.JPG?w=182&h=243" />
<img src = "imagescale.php?img=images/uploads/DSC00159.JPG?w=182&h=243" />
<img src = "imagescale.php?img=images/uploads/DSC00159.JPG?w=182&h=243" />
<img src = "imagescale.php?img=images/uploads/DSC00159.JPG?w=182&h=243"   />
Hope this helps!

Re: search for img tags, getting their contents and replacing it

Posted: Sun Nov 08, 2009 10:11 pm
by smartie_on_computer
Thanks for this, but it doesn't work well enough :(

My news page will have content that looks like this:

Code:

<table width="100%" border="0" align="center" cellpadding="0" cellspacing="0" class="layout">
  <tr>
    <td align="left"><h1 class="header">Finally</h1><h4>By Admin, on Sun, 8 Nov 2009 4:03 pm</h4>
<div class="content"><p>Alright!'<img style="float: right;" src="images/uploads/DSC00159.JPG" alt="" width="182" height="243" /></p>  <p>Finally after a few days of programming, i've been working on a content management system that allows me to edit any content on this site where ever i am. so now im able to upload images, post or edit my projects and even add new pages to the site easily.</p>  <p>So as a treat, i've added a few pictures of my latest project and even a youtube video. <a id="plink" href="projects.php?p=4">check it out!</a></p></div><br />
 
<h1 class="header">Welcome!</h1><h4>By Admin, on Sat, 24 Oct 2009 5:05 pm</h4>
<div class="content"><p>Welcome, currently this site is still under construction but please feel free to look around.</p>  <p>Cheers<br />Roman</p></div><br />
</td></tr>
</table>
When I run that through your regex, it simply doesn't pick up the image :(

Oh, and by the way, I changed

Code:

$1imagescale.php?img=$2?w=$6&h=$4$3$5$7 />
to

Code:

$1imagescale.php?img=$2&w=$6&h=$4$3$5$7 />
because after the image path was passed, there was a '?' instead of an '&'.
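
One way to avoid the '?' versus '&' mix-up is to let PHP assemble the query string. This is only a minimal sketch: it assumes the script takes the img, w and h parameters shown above, and the helper name imagescale_url is made up for illustration.

```php
<?php
// Hypothetical helper (not part of the original script): builds the
// imagescale.php URL so that "?" appears once and "&" separates the rest.
function imagescale_url(string $img, int $width, int $height): string
{
    // http_build_query() joins the pairs with the given separator and
    // URL-encodes the values, so "/" in the path becomes "%2F"
    // (PHP decodes it again when the script reads $_GET['img']).
    $query = http_build_query(['img' => $img, 'w' => $width, 'h' => $height], '', '&');

    return 'imagescale.php?' . $query;
}

echo imagescale_url('images/uploads/DSC00159.JPG', 182, 243), "\n";
```

If the URL is written into an HTML attribute, passing it through htmlspecialchars() first turns the '&' into '&amp;', which is what strict HTML expects.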

Cheers
Roman

Edit: oops, left some admin stuff in it.

Re: search for img tags, getting their contents and replacing it

Posted: Mon Nov 09, 2009 9:52 am
by ridgerunner
As stated in the notes, my first regex only works when the attributes appear in the order src, height, width! The example that you just cited has a different order: src, width, height (so naturally, the regex does not match). For this case you can run this modified version:

Code:

$text = preg_replace('%
    (<img[^>]+?src\s*=\s*")  # capture (1) the <img tag up to the src = URL
    ([-./\w]+?\.jpe?g)       # capture (2) relative URL (no query ? or fragment #)
    ("[^>]*?)                # capture (3) everything up to the width attribute
    \s*width\s*=\s*"?        # match and discard width attribute name =
    (\d+)                    # capture (4) width attribute value
    "?([^>]*?)               # capture (5) everything up to the height attribute
    \s*height\s*=\s*"?       # match and discard height attribute name =
    (\d+)                    # capture (6) height attribute value
    "?([^>]*?)               # capture (7) everything up to the end of tag
    /?>                      # match end of tag (optional self closing /)
    %ix', 
    '$1imagescale.php?img=$2&w=$4&h=$6$3$5$7 />', $text);
This one handles the case where the attribute order is src-width-height. If you apply these two regex replacements in series, one after the other, all your IMG tags should be fixed, assuming of course that both the height and width attributes always appear after the src attribute! (Note that the three attributes src, height and width can appear in six possible orders, and a regex could be crafted to handle each of the six cases.)
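
Instead of writing one regex per attribute order, the attributes can be looked up one at a time inside a callback. The following is only a sketch of that alternative, not the approach used in the two regexes above. The function name rewrite_img_tags is made up, and imagescale.php with its img, w and h parameters is assumed to behave as described earlier in the thread.

```php
<?php
// Sketch only: rewrites <img> tags regardless of attribute order.
// rewrite_img_tags() is a made-up name; imagescale.php and its img/w/h
// parameters come from the posts above.
function rewrite_img_tags(string $html): string
{
    $result = preg_replace_callback('~<img\b[^>]*>~i', function (array $m): string {
        $tag = $m[0];
        // A numeric attribute value, quoted or not, e.g. ="182" or =182.
        // The lookahead rejects values such as "100%".
        $num = '\s*=\s*"?(\d+)"?(?=[\s/>])';

        // Look each attribute up on its own, so the order does not matter.
        if (!preg_match('~(\ssrc\s*=\s*")([-./\w]+?\.jpe?g)"~i', $tag, $src)
            || !preg_match('~\swidth' . $num . '~i', $tag, $w)
            || !preg_match('~\sheight' . $num . '~i', $tag, $h)) {
            return $tag; // not a tag this sketch can handle: leave it alone
        }

        // Remove width and height, then point src at the scaling script.
        $tag = preg_replace('~\s+(?:width|height)' . $num . '~i', '', $tag);
        $url = 'imagescale.php?img=' . $src[2] . '&w=' . $w[1] . '&h=' . $h[1];

        return str_replace($src[0], $src[1] . $url . '"', $tag);
    }, $html);

    return $result ?? $html; // fall back to the input if PCRE reports an error
}

echo rewrite_img_tags('<p><img style="float: right;" src="images/uploads/DSC00159.JPG" alt="" width="182" height="243" /></p>'), "\n";
```

Unlike the regexes above, this keeps the remaining attributes in place and leaves the end of the tag as it was. Tags without a relative .jpg/.jpeg src or without numeric width and height are returned unchanged.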

Re: search for img tags, getting their contents and replacing it

Posted: Mon Nov 09, 2009 12:50 pm
by smartie_on_computer
Edit: oops, double post.

Re: search for img tags, getting their contents and replacing it

Posted: Mon Nov 09, 2009 12:52 pm
by smartie_on_computer
Wow, thank you so much!

It works so well and does exactly what I wanted it to do.
You can now see it working on my site: http://www.davevaughan.com/roman/

Cheers!
Roman

P.S. I put a nice thank you note on my site for you :)