A regex to match JS regexes

paulhan
Forum Newbie
Posts: 3
Joined: Tue Sep 04, 2007 12:17 pm

Post by paulhan »

Hi,
I'd really appreciate anybody's help right now, I'm at my wit's end. I'm building a function to minify JavaScript, and I'm getting caught on a regex that matches JS regex literals. What I've come up with so far is

Code:

if (preg_match("|(/.*?(?<!\\\\)/)|",$data, $var2, PREG_OFFSET_CAPTURE, $x)){
              $out .= $var2[0][0];//Add match to the output  //
              $x += (strlen($var2[0][0]))-1;//Advance x
            }
This finds almost all of them, except cases like /\\/, where an escaped backslash is followed directly by the forward slash that closes the regex. I did try

Code:

"|(/.*?(?<![^\\\\]\\\\)/)|"
but I get "lookbehind not a fixed length", even though this did seem to have the desired effect in RegexBuddy. I don't need to find the modifiers, just the regex literal. Can anyone help?

Thanks in Advance,

Paul.
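
To make the failure concrete, here is a minimal sketch of what the lookbehind-based pattern above does on two inputs. The helper name firstMatchLookbehind and the sample JavaScript strings are made up for illustration; only the pattern itself comes from the post.

```php
<?php
// Hypothetical helper (not from the thread): return the first match found
// by the lookbehind-based pattern quoted in the post above.
function firstMatchLookbehind(string $js): ?string
{
    // Same pattern, written single-quoted: a slash, a lazy run of characters,
    // then a closing slash that must not be preceded by a backslash.
    $pattern = '|(/.*?(?<!\\\\)/)|';
    return preg_match($pattern, $js, $m) ? $m[0] : null;
}

// JS source: a = /x\/y/;
// The escaped slash is skipped correctly, so the match is the whole literal.
$escapedSlash = firstMatchLookbehind('a = /x\\/y/;');

// JS source: a = /\\/; b = c/d;
// The lookbehind sees a backslash before the real closing slash and rejects
// it, so the match runs on to the next slash that has no backslash before it.
$escapedBackslash = firstMatchLookbehind('a = /\\\\/; b = c/d;');

var_dump($escapedSlash, $escapedBackslash);
```

A single-character lookbehind cannot tell an escaping backslash from an escaped one, which is why the second case overshoots.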
stereofrog
Forum Contributor
Posts: 386
Joined: Mon Dec 04, 2006 6:10 am

Post by stereofrog »

I'd suggest

Code:

~/(?>\\\\.|.)+/[a-z]*~

Post by paulhan »

Hi Michael,
Thank you so much for replying. The regex is really beautiful, and it catches everything that it should. Weirdly, PHP's interpretation of PCRE differs a bit from RegexBuddy's, and the following cases crop up when I use preg_match.

On this one, the match starts where it should, but it runs all the way up to the division operator and preserves the two spaces:

(parseFloat( elem.filter.match(/opacity=([^)]*)/)[1] ) / 100).toString() : "";

On this one, preg_match behaves properly, but RegexBuddy captures everything between the first and last forward slash. The string comes out compressed exactly as it should:
msie: /msie/.test(b) && !/opera/.test(b),

On this one, preg_match chews up the forward slash, but RegexBuddy doesn't match it at all:
return ((-Math.cos(p*Math.PI)/2) + 0.5) * diff + firstNum;

I have tried adapting your regex, but everything I do either changes nothing or makes things worse. I'm so close, and if I can get it to work on a highly optimized piece of code like jQuery, it will work anywhere. It's an open source project.

Thanks again,

Paul.


/(?>\\.|.)+/[a-z]*

Post by stereofrog »

Oh, should be non-greedy of course

Code:

~/(?>\\\\.|.)+?/[a-z]*~
Who's Michael? ;)
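
For anyone comparing the two versions, here is a small sketch of the difference between the greedy and the non-greedy pattern. The allMatches helper is hypothetical; the msie line is the one from Paul's post, and the last input is an invented example with an escaped backslash before the closing slash.

```php
<?php
// Hypothetical helper for illustration: collect every match of a pattern.
function allMatches(string $pattern, string $js): array
{
    preg_match_all($pattern, $js, $m);
    return $m[0];
}

$line = 'msie: /msie/.test(b) && !/opera/.test(b),';

// Greedy: the group repeats as far as it can, then backs off to the LAST
// slash on the line, so both literals end up inside one match.
$greedy = allMatches('~/(?>\\\\.|.)+/[a-z]*~', $line);

// Non-greedy: the group stops at the FIRST slash that is not part of an
// escape pair, so each literal is matched separately.
$lazy = allMatches('~/(?>\\\\.|.)+?/[a-z]*~', $line);

// JS source: a = /\\/; b = c/d;
// The atomic group consumes the two backslashes as one pair, so the slash
// after them is recognised as the terminator.
$escaped = allMatches('~/(?>\\\\.|.)+?/[a-z]*~', 'a = /\\\\/; b = c/d;');

print_r($greedy);
print_r($lazy);
print_r($escaped);
```

Because the alternation consumes a backslash together with the character after it, no lookbehind is needed at all, which sidesteps the fixed-length restriction from the first post.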

That did it

Post by paulhan »

feyd | Please use the forum's code tags and [syntax="..."] tags where appropriate when posting code. Your post has been edited to reflect how we'd like it posted. Please read: Posting Code in the Forums (http://forums.devnetwork.net/viewtopic.php?t=21171) to learn how to do it too.


Hi StereoFrog (sorry about Michael)
That did it, once I'd refactored my code. It seemed the more I took out of it, the better it worked. Anyway, here's my JavaScript minifier in 20 lines of code. It works perfectly on jQuery, a highly optimized piece of code, but I still have to test it on other files before I can be sure.

Code:

function compress1($data1){
    $out=""; $chr="";$nxt="";$pre="";$str="";
    $mcomms = array("!/\*.*?\*/!s", "|\t+|m", "|((?<![\"=:\\\\])//.*$)|m");
    $data = preg_replace($mcomms, " ", $data1);//Get rid of all the easy stuff
    for ($x=0, $dl=strlen($data);$x < $dl; $x++){//Now one char at a time
      $chr = $data[$x]; //Get current char in stream ([] offsets; the $data{$x} form no longer works in PHP 8)
      $ord = ord($chr); //Get its ordinal
      if ($x < $dl-1){$nxt = $data[$x+1];}//Get next char in stream
      if($ord == 32 || $ord==10){//Space, and newline. Only significant where letters are on both sides
        if (preg_match("/[a-z\$_]/i", $pre) && preg_match("/[a-z\$_]/i", $nxt)){//If the two characters on either side are letters
          $out .= " ";//Add a space
        }
      }else{
        if (($chr == '"' || $chr == "'") && preg_match("~(['\"].*?(?<!\\\\)['\"])~",$data, $var1, PREG_OFFSET_CAPTURE, $x)){//If it's definitely a string
          $str = $var1[0][0];
        }elseif ($chr == "/" && preg_match("~[,\\|\\[\\(=]~", $pre) && preg_match("~(/(?>\\\\.|.)+?/[a-z]*)~",$data, $var2, PREG_OFFSET_CAPTURE, $x)){//Should mean it is definitely a regex
          $str = $var2[0][0];
        }
        if ($str){//One of the above matched
          $out .= $str;//Add match to the output
          $x += (strlen($str))-1;//Advance x
          $str = "";
        }else{
          $out .= $chr;//Just add the char to the output
        }
      }
      $pre = ($out)?$out[strlen($out)-1]:"";//Get previous char
    }
    echo $out;//Output result
  }
Thanks again,

Paul
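
The part of the function that separates a regex literal from a division is the check on the previous character. Here is a rough sketch of that check in isolation, run against the problem lines from earlier in the thread. Both helper names are hypothetical, and previousSignificantChar only approximates what $pre holds in compress1() by skipping spaces and newlines.

```php
<?php
// Hypothetical helper: the previous-character test used in compress1().
function precededByRegexContext(string $pre): bool
{
    // Same character class as in the function above: , | [ ( =
    return preg_match('~[,\\|\\[\\(=]~', $pre) === 1;
}

// Hypothetical helper: nearest character to the left of $slashPos that is
// not a space or newline (an approximation of $pre, which is the last
// character already written to the output).
function previousSignificantChar(string $js, int $slashPos): string
{
    for ($i = $slashPos - 1; $i >= 0; $i--) {
        if ($js[$i] !== ' ' && $js[$i] !== "\n") {
            return $js[$i];
        }
    }
    return '';
}

$opacity = 'match(/opacity=([^)]*)/)[1] ) / 100';
$cosine  = '(-Math.cos(p*Math.PI)/2)';
$negated = '!/opera/.test(b)';

// Slash after "(": treated as the start of a regex literal.
$regexStart = precededByRegexContext(previousSignificantChar($opacity, strpos($opacity, '/')));

// Slash after ")": treated as division in both of these lines.
$division1 = precededByRegexContext(previousSignificantChar($opacity, strrpos($opacity, '/')));
$division2 = precededByRegexContext(previousSignificantChar($cosine, strpos($cosine, '/')));

// Slash after "!": not in the character class, so this literal is not
// recognised as a regex and is copied through one character at a time.
$afterBang = precededByRegexContext(previousSignificantChar($negated, strpos($negated, '/')));

var_dump($regexStart, $division1, $division2, $afterBang);
```

The last case suggests the character class may need more entries, such as ! and :, before the minifier is safe on literals that contain spaces.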

