conditional regex with Parens in the test string

dmccmd
Forum Newbie
Posts: 1
Joined: Mon Nov 16, 2009 10:22 am

Post by dmccmd »

Hello,
I am trying to parse a string that contains an IP address or DNS name, with or without a port. I have the code to get the IP address / DNS name conditionally, but my issue is that after the port there could be a string containing a paren "(".

My regex is (?([a-zA-Z0-9.-]+\/+[a-zA-Z0-9.-])(.*)\/(.*)|(.*)())\sby\saccess-group\s\"(.*)\"

My sample strings are:
mail.store.tn/456 (type 8, code 0) by access-group "OUT_IN" [0x0, 0x0]
mail.business.com (type 0, code 0) by access-group "ptv_outside" [0xb6baabe9, 0x0]
111.222.0.44 by access-group "inside" [0x0, 0x0]

What I need to capture in the groups are the IP / DNS names, the ports if they exist, and whatever is inside the double quotes. My problem is that with the above regular expression the "(type X, code 0)" in the first two test strings is attached to the port / DNS name, and I want it removed. I have tried changing the conditional "yes" branch to (.*)\/(.\d) to capture only the digits after the slash, but that fails (I am using Expresso to test my regex).

Any help would be greatly appreciated.
Thanks.
ridgerunner
Forum Contributor
Posts: 214
Joined: Sun Jul 05, 2009 10:39 pm
Location: SLC, UT

Re: conditional regex with Parens in the test string

Post by ridgerunner »

No need for any conditional. Try this one:

<?php
$text = 'mail.store.tn/456 (type 8, code 0) by access-group "OUT_IN" [0x0, 0x0]';
// Commented (free-spacing) version of the pattern, kept for reference.
$re_long = '%
^                 # anchor to start of line (m modifier)
([a-zA-Z0-9.-]+)  # capture host/domain in group 1
(?:/([0-9]+))?    # capture optional port after "/" in group 2
(?:[^\s]*)?       # match and discard any other non-space characters
[^"\r\n]*         # match and discard up to quoted string
"([^"]*)"         # capture quoted string in group 3
%mx';
// Same pattern on one line; the "/" before the port is escaped because "/" is the delimiter.
$re_short = '/^([a-zA-Z0-9.-]+)(?:\/([0-9]+))?(?:[^\s]*)?[^"\r\n]*"([^"]*)"/m';

if (preg_match($re_short, $text, $matches)) {
    // $matches[2] is an empty string when the line has no port.
    echo ("Domain =\"{$matches[1]}\", ");
    echo ("Port =\"{$matches[2]}\", ");
    echo ("Quoted string =\"{$matches[3]}\"\r\n");
} else {
    echo ('No match!');
}
?>
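
As a rough sketch of how a non-conditional pattern like this behaves on all three sample strings from the question, the snippet below wraps it in a small function. It assumes the port is separated from the host by "/", as in the samples, and that the quoted access-group name is the first double-quoted string on the line. The function name parse_log_line and the array keys are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper (not from the original posts): extracts the host, the
// optional port, and the quoted access-group name from one log line.
function parse_log_line(string $line): ?array
{
    $re = '%
        ^                 # anchor to start of string
        ([a-zA-Z0-9.-]+)  # group 1: host name or IP address
        (?:/([0-9]+))?    # group 2: optional port after "/" (assumed separator)
        [^"\r\n]*         # skip everything up to the quoted string
        "([^"]*)"         # group 3: contents of the quoted string
    %x';

    if (!preg_match($re, $line, $m)) {
        return null; // line does not have the expected shape
    }

    return [
        'host'  => $m[1],
        // preg_match reports an unmatched middle group as an empty string
        'port'  => $m[2] !== '' ? $m[2] : null,
        'group' => $m[3],
    ];
}

$samples = [
    'mail.store.tn/456 (type 8, code 0) by access-group "OUT_IN" [0x0, 0x0]',
    'mail.business.com (type 0, code 0) by access-group "ptv_outside" [0xb6baabe9, 0x0]',
    '111.222.0.44 by access-group "inside" [0x0, 0x0]',
];

foreach ($samples as $sample) {
    $r = parse_log_line($sample);
    printf("%s | %s | %s\n", $r['host'], $r['port'] ?? '(none)', $r['group']);
}
```

Because the "(type X, code 0)" text is consumed by the skip-ahead character class rather than by a capturing group, it never ends up attached to the host or port.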