Regex help

Posted: Tue May 25, 2004 1:58 am
by maqmus
So far I can understand (I don't know regex) the pattern:

Code: Select all

preg_match('#="?((/?[a-z0-9_-]+/?)+\.(html|gif|jpg|swf))"?#i',$line,$matches)
searches for a match that begins with an equals sign and a double quote, then the file name + suffix.

It works for <img src="somefile.jpg">, but how could I write it so the match begins after a slash and then takes the file name + suffix, for this case:

<img src="http://www.mydomain/subfolder/somefile.jpg">
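
To make the difference between the two cases concrete, here is a minimal check of the pattern quoted above against both tags. It is only a sketch: the variable names are made up for illustration, and the pattern is written in single quotes so the double quotes inside it need no escaping.

```php
<?php
// Pattern from the post above, single-quoted so the inner " needs no escaping.
$pattern = '#="?((/?[a-z0-9_-]+/?)+\.(html|gif|jpg|swf))"?#i';

// Case 1: bare file name. preg_match() returns 1 and group 1 holds the name.
$found_plain = preg_match($pattern, '<img src="somefile.jpg">', $plain);

// Case 2: full URL. preg_match() returns 0 (no match at all).
$found_url = preg_match($pattern, '<img src="http://www.mydomain/subfolder/somefile.jpg">', $url);

var_dump($found_plain, $plain[1], $found_url);
```

The second call finds nothing because the class [a-z0-9_-] allows neither the ":" after "http" nor the "." inside "www.mydomain", so the match cannot get as far as the extension.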

Posted: Tue May 25, 2004 2:07 am
by feyd

Code: Select all

preg_match('#="?((.*?)+\.(html|gif|jpg|swf))"?#i',$line,$matches);
finds

Code: Select all

Array
(
    [0] => ="http://www.mydomain/subfolder/somefile.jpg"
    [1] => http://www.mydomain/subfolder/somefile.jpg
    [2] =>
    [3] => jpg
)

Posted: Tue May 25, 2004 10:38 am
by maqmus
Ok, but I mean to get just "somefile.jpg".

Posted: Tue May 25, 2004 11:26 am
by feyd
okay.. screwing around, since I knew there was a regex that'd do this:

Code: Select all

<?php

$lines = array(
'<img src="http://www.blah.bom/dir/dir/dir/somefile.jpg">',
'<a href="/dir/dir/somefile.gif">',
'<script src=somefile.html><img src="http://some/path/joy_to_the_finding.swf"></script>'
);
foreach($lines as $line)
{
	preg_match_all('#[^=\\\\/"\']*?\.(html|gif|jpe?g|swf)#',$line,$matches);
	echo '<pre>'.print_r($matches,true).'</pre>'."\n";
}

?>
output

Code: Select all

Array
(
    [0] => Array
        (
            [0] => somefile.jpg
        )

    [1] => Array
        (
            [0] => jpg
        )

)

Array
(
    [0] => Array
        (
            [0] => somefile.gif
        )

    [1] => Array
        (
            [0] => gif
        )

)

Array
(
    [0] => Array
        (
            [0] => somefile.html
            [1] => joy_to_the_finding.swf
        )

    [1] => Array
        (
            [0] => html
            [1] => swf
        )

)
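
If only the file names are wanted, the call can be wrapped in a small helper. This is a sketch, not part of the original post: the function name extract_file_names() is hypothetical, and the trailing "i" flag is an added assumption so that upper-case extensions match too.

```php
<?php
// Hypothetical helper (name invented for illustration): returns only the
// file names found in one line of markup.
function extract_file_names(string $line): array
{
    // Same character class as in the post: a name may not contain
    // =, backslash, slash, or either quote character.
    // The "i" flag is an addition so that .JPG or .Gif also match.
    preg_match_all('#[^=\\\\/"\']*?\.(html|gif|jpe?g|swf)#i', $line, $matches);

    // $matches[0] holds the full matches, $matches[1] only the extensions.
    return $matches[0];
}

$names = extract_file_names('<img src="http://www.blah.bom/dir/dir/dir/somefile.jpg">');
print_r($names);
```

Returning $matches[0] drops the extension-only sub-array, which is handy when the caller just loops over the names.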

Posted: Tue May 25, 2004 11:46 am
by maqmus
Thanks Feyd. Finally it's working right!

Posted: Tue May 25, 2004 11:49 am
by tim
great regex feyd.

might want to add .pdf, seems they are getting popular nowadays

Posted: Tue May 25, 2004 11:52 am
by feyd
true, .pdf is a nice format, when encoded right ;)

might also add .ps or rather .e?ps, possibly .svg since that's coming down the pipe so to speak.
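
A rough sketch of what the widened extension list could look like, assuming the same character class as the earlier pattern. The sample line and variable names are made up for illustration.

```php
<?php
// Sketch only: extension list widened with the formats suggested above
// (.pdf, .ps / .eps via e?ps, and .svg).
$pattern = '#[^=\\\\/"\']*?\.(html|gif|jpe?g|swf|pdf|e?ps|svg)#';

// Invented sample input covering the three new extensions.
$line = '<a href="/docs/manual.pdf"><img src="img/logo.svg"><a href="fig.eps">';

preg_match_all($pattern, $line, $matches);
print_r($matches[0]);
```

One caveat: nothing anchors the end of the extension, so a name like photo.psd would be reported as photo.ps. Adding \b after the closing parenthesis is one way to tighten that.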

Posted: Tue May 25, 2004 11:56 am
by maqmus
Thanks Feyd. Finally it's working right!

Posted: Tue May 25, 2004 12:01 pm
by feyd
might potentially need to change

Code: Select all

preg_match_all('#[^=\\\\/"\']*?\.(html|gif|jpe?g|swf)#',$line,$matches);
// to
preg_match_all('#[^\s=\\\\/"\']*?\.(html|gif|jpe?g|swf)#',$line,$matches);
so it strips out leading spaces too.. although you could just trim() the resultant strings as well. :)
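
To compare the two options side by side, here is a small sketch. The input line with a space after the equals sign is an invented example, and the variable names are illustrative.

```php
<?php
// Invented input: note the space between "=" and the file name.
$line = '<img src= somefile.jpg>';

// Original class: whitespace is allowed, so the leading space is captured.
preg_match_all('#[^=\\\\/"\']*?\.(html|gif|jpe?g|swf)#', $line, $loose);

// With \s added to the class, whitespace can no longer be part of a match.
preg_match_all('#[^\s=\\\\/"\']*?\.(html|gif|jpe?g|swf)#', $line, $strict);

// Alternative: keep the original pattern and trim() each result afterwards.
$trimmed = array_map('trim', $loose[0]);

var_dump($loose[0], $strict[0], $trimmed);
```

The two approaches are not quite equivalent. For a name containing a space, such as "my file.jpg", the \s version returns only "file.jpg", while the trim() version keeps the whole name.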