I had a good read of the manuals etc., and the domain name example there is not sufficient for this. Basically, I am trying to find
everything between the quotes that follow an href=.
This seems to work:
Code:
$pattern = "/(?<=href=\")[^\"]*(?=\")/";
// int preg_match_all(string pattern, string subject, array matches)
$success = preg_match_all($pattern, $string, $matches);
$matches = $matches[0];
So what's going on here?
Code:
$pattern =
"/ // start of expression
(?<= // look behind for a string that starts with ..
href=\" // .. href= then an opening double quote (escaped)
) // close 'start with' directive
[^\"] // match any character except a double quote
* // any number of times
(?= // look ahead for an ending of ..
\" // .. a closing double quote (escaped)
) // close 'ending with' directive
/" // close expression
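The breakdown above is annotated for reading and would not run as written, because the // comments sit inside the string. As a rough sketch, the same pattern can carry its comments if it is written with the x (extended) modifier, where whitespace is ignored and # starts a comment. The $html sample string and the $m variable are made up for illustration.

```php
<?php
// Hypothetical sample input, not taken from the original post.
$html = '<a href="index.php">Home</a>';

// Single-quoted PHP string, so the double quotes need no escaping.
// The x modifier lets the pattern span several lines with # comments.
$pattern = '/
    (?<=href=")   # lookbehind: the match must be preceded by href="
    [^"]*         # any run of characters that are not a double quote
    (?=")         # lookahead: the match must be followed by a double quote
/x';

preg_match_all($pattern, $html, $m);
print_r($m[0]); // Array ( [0] => index.php )
```

The comments must not contain the delimiter character (here /) or a single quote, since either would end the pattern or the PHP string early.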
preg_match_all finds every match of the pattern in the given string and stores them in the array $matches.
$matches will hold an array of arrays (how it is laid out depends on the flags passed), but only $matches[0] holds what we want, so reassign it to $matches.
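To make the shape of the result concrete, here is a small sketch run against a made-up $string (the sample HTML and the $links name are assumptions for illustration). Because the pattern has no capturing groups, $matches ends up with a single inner array at index 0.

```php
<?php
// Hypothetical sample input with one style sheet link and one anchor.
$string = '<link href="style.css" rel="stylesheet"><a href="index.php">Home</a>';

$pattern = "/(?<=href=\")[^\"]*(?=\")/";

// Returns the number of full-pattern matches found.
$success = preg_match_all($pattern, $string, $matches);

// $matches[0] holds the full-pattern matches; there are no other
// entries here because the pattern contains no capturing groups.
$links = $matches[0];

var_dump($success);  // int(2)
print_r($links);     // Array ( [0] => style.css [1] => index.php )
```

Note that style.css is picked up as well, which is the first disadvantage listed below.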
Disadvantages:
Matches all href attributes, including those on style sheet links.
Only matches attributes enclosed in double quotes. The pattern can be changed to match single quotes instead, but as written it does not handle both.
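One possible way around both disadvantages, offered as a sketch rather than a definitive fix, is to drop the lookarounds, anchor the pattern on an opening a tag, capture the opening quote in a group, and require the same quote to close the value with a backreference. The sample $html and the $hrefs name are made up for illustration, and a regex like this will still be fooled by unusual markup (unquoted values, a > inside an attribute), so a real HTML parser is safer for untrusted input.

```php
<?php
// Hypothetical sample input: a style sheet link plus two anchors,
// one with double quotes and one with single quotes.
$html = '<link rel="stylesheet" href="style.css">'
      . '<a href="page.html">x</a> <a class=\'c\' href=\'other.php\'>y</a>';

// <a\b          an opening a tag (the \b keeps <abbr etc. from matching)
// [^>]*?        any other attributes inside the same tag
// \bhref=       the href attribute
// (["\'])       group 1: the opening quote, single or double
// (.*?)         group 2: the attribute value
// \1            the same quote character that opened the value
$pattern = '~<a\b[^>]*?\bhref=(["\'])(.*?)\1~is';

preg_match_all($pattern, $html, $m);

// Group 2 holds the values; $m[0] would hold the whole matched text.
$hrefs = $m[2];
print_r($hrefs); // Array ( [0] => page.html [1] => other.php )
```

Here the wanted values live in $m[2] rather than $m[0], since the quotes and the tag prefix are now part of the overall match.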
Hope this will be of help to someone, or can be improved upon.
Thanks.