Regex help

maqmus
Forum Commoner
Posts: 30
Joined: Mon Mar 08, 2004 1:10 pm

Regex help

Post by maqmus »

So far I can understand (I don't know regex) the pattern:

Code:

preg_match('#="?((/?[a-z0-9_-]+/?)+\.(html|gif|jpg|swf))"?#i', $line, $matches)
searches for a match that begins with an equal sign and an optional double quote, followed by the file name and suffix.

It works for <img src="somefile.jpg">, but how could I write it to match beginning at the last slash and then the file name + suffix, for this case:

<img src="http://www.mydomain/subfolder/somefile.jpg">
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Code:

preg_match('#="?((.*?)+\.(html|gif|jpg|swf))"?#i', $line, $matches);
finds

Code:

Array
(
    [0] => ="http://www.mydomain/subfolder/somefile.jpg"
    [1] => http://www.mydomain/subfolder/somefile.jpg
    [2] => 
    [3] => jpg
)

Post by maqmus »

Ok, but I mean to get just "somefile.jpg".

Post by feyd »

okay.. screwing around, since I knew there was a regex that'd do this:

Code:

<?php

$lines = array(
'<img src="http://www.blah.bom/dir/dir/dir/somefile.jpg">',
'<a href="/dir/dir/somefile.gif">',
'<script src=somefile.html><img src="http://some/path/joy_to_the_finding.swf"></script>'
);
foreach($lines as $line)
{
	preg_match_all('#[^=\\\\/"\']*?\.(html|gif|jpe?g|swf)#', $line, $matches);
	echo '<pre>'.print_r($matches,true).'</pre>'."\n";
}

?>
output

Code:

Array
(
    [0] => Array
        (
            [0] => somefile.jpg
        )

    [1] => Array
        (
            [0] => jpg
        )

)

Array
(
    [0] => Array
        (
            [0] => somefile.gif
        )

    [1] => Array
        (
            [0] => gif
        )

)

Array
(
    [0] => Array
        (
            [0] => somefile.html
            [1] => joy_to_the_finding.swf
        )

    [1] => Array
        (
            [0] => html
            [1] => swf
        )

)
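
For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name find_file_names() is hypothetical and not from the thread, and the pattern is written as a single-quoted PHP string so the quotes and the backslash inside the character class need no extra escaping beyond \\\\ and \'.

```php
<?php

// Hypothetical helper (not from the thread): returns every file name
// with a wanted suffix that appears in $line.
function find_file_names(string $line): array
{
    // [^=\\/"']*?  lazily consumes characters that are not =, \, /, " or '
    // \.           a literal dot
    // (...)        one of the wanted suffixes
    preg_match_all('#[^=\\\\/"\']*?\.(html|gif|jpe?g|swf)#', $line, $matches);

    // $matches[0] holds the full matches, $matches[1] the suffixes.
    return $matches[0];
}

print_r(find_file_names('<img src="http://www.mydomain/subfolder/somefile.jpg">'));
// Array ( [0] => somefile.jpg )
```

Because the character class excludes slashes, a match can never span a path separator, which is why only the last path segment is returned. One caveat: nothing anchors the end of the suffix, so a host name such as www.gifts.example would also match as www.gif.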

Post by maqmus »

Thanks Feyd. Finally it's working right!
tim
DevNet Resident
Posts: 1165
Joined: Thu Feb 12, 2004 7:19 pm
Location: ohio

Post by tim »

great regex feyd.

might want to add

.pdf, seems they are getting popular now-a-days

Post by feyd »

true, .pdf is a nice format, when encoded right ;)

might also add .ps or rather .e?ps, possibly .svg since that's coming down the pipe so to speak.
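
A rough sketch of what the extended alternation might look like with those suffixes added. The sample $line and its file names are made up for illustration; only the pdf, e?ps and svg alternatives come from the suggestions above.

```php
<?php

// Pattern from earlier in the thread, with pdf, e?ps and svg appended
// to the list of suffixes (e?ps covers both .ps and .eps).
$pattern = '#[^=\\\\/"\']*?\.(html|gif|jpe?g|swf|pdf|e?ps|svg)#';

// Made-up input for illustration only.
$line = '<a href="/docs/manual.pdf"><img src="img/logo.svg"><a href=/print/page.eps>';

preg_match_all($pattern, $line, $matches);
print_r($matches[0]);
// Array ( [0] => manual.pdf [1] => logo.svg [2] => page.eps )
```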

Post by feyd »

might potentially need to change

Code:

preg_match_all('#[^=\\\\/"\']*?\.(html|gif|jpe?g|swf)#', $line, $matches);
// to
preg_match_all('#[^\s=\\\\/"\']*?\.(html|gif|jpe?g|swf)#', $line, $matches);
so it strips out leading spaces too.. although you could just trim() the resultant strings as well. :)
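
To make the difference between the two options concrete, here is a small comparison. The variable names and the sample tags are made up for illustration. It suggests the two approaches are not quite equivalent: adding \s to the character class also cuts file names that contain a space, while trim() leaves them whole.

```php
<?php

// Variable names and sample input below are illustrative assumptions.
$without_s = '#[^=\\\\/"\']*?\.(html|gif|jpe?g|swf)#';
$with_s    = '#[^\s=\\\\/"\']*?\.(html|gif|jpe?g|swf)#';

// Case 1: a space between "=" and the file name.
$line = '<img src= somefile.jpg>';
preg_match_all($without_s, $line, $a);
preg_match_all($with_s, $line, $b);
var_dump($a[0][0]);        // string(13) " somefile.jpg"  (leading space kept)
var_dump(trim($a[0][0]));  // string(12) "somefile.jpg"
var_dump($b[0][0]);        // string(12) "somefile.jpg"

// Case 2: a file name that itself contains a space.
$spaced = '<img src="my file.jpg">';
preg_match_all($without_s, $spaced, $c);
preg_match_all($with_s, $spaced, $d);
var_dump(trim($c[0][0]));  // string(11) "my file.jpg"
var_dump($d[0][0]);        // string(8) "file.jpg"
```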