Need regex for globbing function

Posted: Sun Apr 16, 2006 10:03 pm
by alex.barylski
So I've written a recursive globbing function...

Code: Select all

'/.+/'
The above is my regex to match EVERY file or folder on the system, recursively searching directories, etc...

Here's the problem...

If I wanted to narrow that search down, to say PHP and GIF files, how would I do that?

Here is what I have right now:

Code: Select all

'/.+\.(php|gif)?/'
If you're a regex hack, you'll see the problem...

It finds files like:
- test.php.dat

Even though .dat is NOT what I'm looking for :)
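
A quick way to see why `test.php.dat` slips through, sketched on the assumption that the pattern is handed straight to preg_match(): with no `$` anchor the pattern is free to match just the `test.` part of the name, and the trailing `?` makes the extension group optional anyway. The sample names and the anchored alternative are made up for illustration.

```php
<?php
// Hypothetical sample names, for illustration only.
$names = array('index.php', 'logo.gif', 'test.php.dat', 'notes.txt');

$unanchored = '/.+\.(php|gif)?/';  // pattern from the post above
$anchored   = '/\.(php|gif)$/';    // extension must sit at the very end

$results = array();
foreach ($names as $name) {
    $results[$name] = array(
        'unanchored' => preg_match($unanchored, $name) === 1,
        'anchored'   => preg_match($anchored, $name) === 1,
    );
}

foreach ($results as $name => $r) {
    printf("%-14s unanchored=%s anchored=%s\n", $name,
        var_export($r['unanchored'], true), var_export($r['anchored'], true));
}
```

The unanchored pattern accepts every name in the list, while the anchored one accepts only `index.php` and `logo.gif`. It also rejects plain folder names, so on its own it does not solve the recursion requirement.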

Also, in order for the recursion to work, it needs to match folders as well, which are unlikely to have PHP extensions...

So I need a regex which will match any directory name or filename but also limit results to only certain file types...aka extensions!!!

Any ideas???

Cheers :)

Posted: Sun Apr 16, 2006 10:08 pm
by feyd

Code: Select all

/^.+?(?:\.(?:php|gif))?$/
:?:

Posted: Mon Apr 17, 2006 6:42 am
by Chris Corbyn
I reckon 20% of our posts in this forum relate to pattern greediness....

Posted: Mon Apr 17, 2006 8:45 pm
by alex.barylski
I tried that regex Feyd and it didn't work :(

The globbing function I use only needs to pattern match against the name of a directory or file, not a full path...

???

I have very little experience of regex, so I can't even begin to think what might be wrong with it...

The problem is, no matter what extensions I add or remove...it seems EVERY file/folder is getting pulled???

Cheers :)

Posted: Mon Apr 17, 2006 8:57 pm
by feyd
Where are you using it? More code maybe?

Try removing the last question mark in my regex.
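
To illustrate the suggestion, here is a minimal sketch comparing the pattern with and without its final `?`. With the `?`, the extension group is optional, so `^.+?$` alone is enough to match any non-empty name; without it, the name has to end in one of the listed extensions. The sample names are invented for the example.

```php
<?php
$optional = '/^.+?(?:\.(?:php|gif))?$/'; // extension group is optional
$required = '/^.+?(?:\.(?:php|gif))$/';  // last question mark removed

// Hypothetical names, for illustration only.
$names = array('index.php', 'test.php.dat', 'images');

$matches = array();
foreach ($names as $name) {
    $matches[$name] = array(
        'optional' => preg_match($optional, $name) === 1,
        'required' => preg_match($required, $name) === 1,
    );
}
print_r($matches);
```

Note that the stricter form also rejects a plain directory name such as `images`, which is the recursion concern raised in the first post.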

Posted: Mon Apr 17, 2006 9:21 pm
by alex.barylski
feyd wrote:Where are you using it? More code maybe?

Try removing the last question mark in my regex.

Code: Select all

function _file_glob($path, $re_pattern)
{
  $arr_names = array(); // Array of file/folder paths which meet glob criteria
  
  if(is_dir($path)){ 
    if($dh = opendir($path)){ 

      clearstatcache();
      while(($name = readdir($dh)) !== false){
        if($name != '.' && $name != '..'){ 
        
          //
          // Check file/folder name against a regex pattern
          //echo $name.'<br>';
         if(preg_match($re_pattern, $name)){
           // ...
         }
I am aware of glob() but my function goes way above and beyond the capabilities of glob() thus the custom function.

I'm positive the problem lies with regex...

What I have concluded is that I need a function which:
1) Matches a directory name or file name (minus extension)
2) And optionally matches an extension(s) list in brackets (php|gif) by starting at the END of the string and counting backwards (if possible?)

This way ANY file or folder is matched ALWAYS and optionally matching extensions...

S*ite...I just realized maybe the problem lies within my function....because I don't distinguish between file/folders in my code before matching...a path is a path is a path... :P you know what I'm saying?

So using the regex that I've described above...it would return folders named 'myimages.gif' as well as files with GIF extensions... :(

Ok, so a re-write is in order :)

Having been exposed to my snippet of code above, can you think of an effective way of solving this problem?

I could check $path.'/'.$name for its type I suppose and use a different preg_match pattern (for file or folder) but then wouldn't that require me passing in two different patterns?

I want to avoid calling the function twice or passing separate patterns if possible... :?
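
One way to keep a single pattern and a single call, offered as a sketch under assumptions rather than as the function from the post: always recurse into directories, and apply the pattern only to entries that are files. The name `file_glob_sketch()` and its return shape are hypothetical.

```php
<?php
// Hypothetical helper: returns the paths of files under $path whose names
// match $re_pattern. Directories are always descended into and are never
// tested against the pattern.
function file_glob_sketch($path, $re_pattern)
{
    $found = array();

    if (!is_dir($path) || !($dh = opendir($path))) {
        return $found;
    }

    while (($name = readdir($dh)) !== false) {
        if ($name === '.' || $name === '..') {
            continue;
        }

        $full = $path . '/' . $name;

        if (is_dir($full)) {
            // Folder: recurse regardless of its name.
            $found = array_merge($found, file_glob_sketch($full, $re_pattern));
        } elseif (preg_match($re_pattern, $name)) {
            // File: keep it only if the name matches.
            $found[] = $full;
        }
    }

    closedir($dh);
    return $found;
}
```

With this split, a folder named `myimages.gif` is searched but not returned, and a pattern such as `/\.(?:php|gif)$/i` only ever has to describe file names.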

Cheers :)

Posted: Tue Apr 18, 2006 6:18 pm
by alex.barylski
Ok, so I've narrowed down my problem even more...

What would work, if this is possible in regex...

Is a regex, which wasn't greedy BUT matched ANY valid name for a file or folder

However, optionally matched against a list of extensions, but only if the greedy modifier was included inside the regex...

The optional matching should work by starting at the END of the string and working backwards until the extension and period are located...

Code: Select all

/.+ (.(bmp|png|jpg|jpeg))/g
Note the g modifier

I would add that modifier to the regex dynamically before execution...thus the regex would then match only FILE names...and not folders, but if I remove the g modifier, it will match both file and folder names because NO file extension check is done...

Any ideas??? :)

Posted: Tue Apr 18, 2006 8:13 pm
by feyd
There's no "g" modifier in PHP. You'll get a warning if you use it. To match end of string this works:

Code: Select all

#\.(?:png|bmp|gif|jpe?g)$#is
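
As a rough illustration of how that pattern behaves, the sketch below filters a list of names with preg_grep(). The sample names are hypothetical; only the pattern comes from the post.

```php
<?php
$pattern = '#\.(?:png|bmp|gif|jpe?g)$#is';

// Hypothetical names, for illustration only.
$names = array('photo.JPG', 'scan.jpeg', 'logo.gif', 'test.gif.dat', 'images');

// preg_grep() keeps the original keys, so re-index the result.
$images = array_values(preg_grep($pattern, $names));
print_r($images);
```

Only `photo.JPG`, `scan.jpeg` and `logo.gif` survive the filter. The `i` modifier makes the match case-insensitive; the `s` modifier only changes what an unescaped `.` matches, and this pattern has none, so it has no effect here.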