
back reference as a named capture

Posted: Sat Mar 14, 2009 2:29 pm
by SidewinderX

Code: Select all

<?php
 
$string = "123456:string";
preg_match("/^(.*?):(.*?)$/", $string, $matches);
 
print_r($matches);
 
?>
$matches will obviously contain:

Code: Select all

Array
(
    [0] => 123456:string
    [1] => 123456
    [2] => string
)


My desired result is simply:

Code: Select all

Array
(
    [string] => 123456
)


After I get the matches I can easily do this with some array functions, but I would rather not go through the extra steps if I can do it all with preg_match. So I have two questions.

1. If I use a named capture, can I avoid the duplicate results?

Code: Select all

/^(?P<name>.*?):(.*?)$/
RESULTS IN:
Array
(
    [0] => 123456:string
    [name] => 123456
    [1] => 123456
    [2] => string
)
I WOULD LIKE:
Array
(
    [0] => 123456:string
    [name] => 123456
    [2] => string
)
2. Is it possible to use a back reference as the name of a named capture, and if so, how? The following is wrong but demonstrates what I am after:

Code: Select all

preg_match("/^(?P<$1>.*?):(.*?)$/", $string, $matches);
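For question 1, as far as I know preg_match has no flag that suppresses the numeric copy of a named group, so the duplicate has to be removed after the match. Below is a minimal sketch, not a built-in feature: the helper name match_without_named_duplicates is hypothetical, and it relies on the ordering shown in the output above, where each named key is immediately followed by its numeric twin.

```php
<?php
// Hypothetical helper: keeps index 0 and the named groups, but drops the
// numeric duplicate that follows each named group in $matches.
function match_without_named_duplicates($pattern, $subject)
{
    if (!preg_match($pattern, $subject, $matches)) {
        return array(); // no match (or a regex error)
    }

    $result   = array();
    $skipNext = false;
    foreach ($matches as $key => $value) {
        if ($skipNext) {
            // This numeric entry repeats the named entry just before it.
            $skipNext = false;
            continue;
        }
        $result[$key] = $value;
        // Assumption: a string key is always followed by its numeric copy.
        $skipNext = is_string($key);
    }
    return $result;
}
```

Called with '/^(?P<name>.*?):(.*?)$/' and "123456:string", this should leave only the keys 0, name and 2.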

Re: back reference as a named capture

Posted: Sun Mar 15, 2009 5:04 am
by GeertDD
1. Here are three regexes that match only the desired part, e.g. "123456". See the comments in the code.

Code: Select all

// Your original regex which matches too much, and is the slowest of all four.
preg_match('/^(.*?):(.*?)$/', $string, $matches); // 0,37s (for 100,000 loops)
 
// The key is to get rid of all needless capturing parentheses
preg_match('/^.*?(?=:.*?$)/', $string, $matches); // 0,29s
 
// Optimization #1: using a possessively quantified negated character class instead of a lazy dot
preg_match('/^[^:]*+(?=:.*?$)/', $string, $matches); // 0,24s
 
// Optimization #2: stop bothering about whatever comes after the colon
preg_match('/^[^:]*+(?=:)/', $string, $matches); // 0,20s
2. Backreferences can't be used as names for captures, but you don't need them anymore now, do you?
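Since the capture name cannot come from the subject itself, the [string] => 123456 result from the first post still needs one step after the match. A minimal sketch, assuming the input always has the form value:name; the helper name capture_as_key is made up for illustration, and the pattern borrows the negated character class from the regexes above:

```php
<?php
// Hypothetical helper: turns "value:name" into array(name => value).
function capture_as_key($subject)
{
    // Group 1: everything before the first colon (possessive, no backtracking).
    // Group 2: the rest of the line, used as the array key.
    if (!preg_match('/^([^:]*+):(.*)$/', $subject, $m)) {
        return array(); // no colon in the subject
    }
    return array($m[2] => $m[1]);
}
```

For "123456:string" this returns an array with the single entry [string] => 123456. Note that PHP casts purely numeric keys to integers, so a subject like "abc:42" would end up under the integer key 42.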