a link
- shiznatix
- DevNet Master
- Posts: 2745
- Joined: Tue Dec 28, 2004 5:57 pm
- Location: Tallinn, Estonia
I need to find the first occurrence of the phrase GET HTML, which is going to be a link, and then get the URL that GET HTML links to. Errr, I'm absolutely awful at regex, so some help please.
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
Code:
$data = get_content_from_somewhere();
preg_match('@<a\s+[^>]*?\bhref="([^"]+)"[^>]*?>GET HTML</a>@is', $data, $matches);
print_r($matches);
@ ... @ -- Delimiters
<a\s+ -- Find the start of an <a> tag followed by at least one whitespace character
[^>]*? -- Allow some other attributes (javascript etc) to come before "href"
\b -- Word boundary (the edge of the word "href")
href=" -- Just plain string (the href part)
([^"]+) -- Any string of characters other than double quotes -- extracted (the link itself)
"[^>]*?> -- The closing quote, then any characters other than ">" zero or more times (other attributes which may or may not be there), then the closing ">"
The rest should be obvious.
The modifiers "is" mean case insensitive and ignore whitespace
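To see the breakdown above in action, here is a minimal sketch run against made-up input. The sample HTML and the $link variable are assumptions for illustration only; in the thread the data comes from get_content_from_somewhere().

```php
<?php
// Hypothetical sample input, not from the original post.
$data = '<p><a class="nav" href="/home">Home</a> '
      . '<a onclick="track()" href="http://example.com/page.html" target="_blank">GET HTML</a> '
      . '<a href="/other">get html</a></p>';

// Same pattern as posted above.
$pattern = '@<a\s+[^>]*?\bhref="([^"]+)"[^>]*?>GET HTML</a>@is';

if (preg_match($pattern, $data, $matches)) {
    // $matches[0] is the whole matched <a ...>GET HTML</a> tag,
    // $matches[1] is the captured href value.
    $link = $matches[1];
    echo $link, "\n";
} else {
    $link = null;
    echo "No GET HTML link found\n";
}
```

preg_match() stops at the first match, so only the first GET HTML link is returned even though the later "get html" link would also match with the i modifier.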
d11wtq wrote: The modifiers "is" mean case insensitive and ignore whitespace
Actually, the s modifier means that . matches a newline. It is usually used in conjunction with m for multiline regex processing.
Perhaps you were thinking of x, which makes the regex ignore whitespace in the pattern and allows for comments in it?
PCRE Pattern Modifiers
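A small sketch of the difference between the two modifiers, using made-up input (the $html string and the variable names are assumptions for illustration). Note that the pattern posted earlier contains no ".", so the s modifier does not change what it matches.

```php
<?php
// Hypothetical input: the link text is split across two lines.
$html = "<a href=\"/x\">GET\nHTML</a>";

// Without s, "." does not match the newline between GET and HTML.
$without_s = preg_match('@>GET.HTML<@i', $html);

// With s, "." matches the newline as well.
$with_s = preg_match('@>GET.HTML<@is', $html);

// With x, whitespace in the pattern is ignored and "#" starts a comment,
// so the regex can be laid out and annotated line by line.
$with_x = preg_match('@
    <a\s+            # start of the tag
    [^>]*? \bhref="  # any other attributes, then href
    ([^"]+)"         # capture the link itself
@ix', $html, $m);

var_dump($without_s, $with_s, $with_x, $m[1]);
```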
- shiznatix
OK, sigh, that worked perfectly, thanks. I tried using some of that stuff from the "breakdown" to match the first link in a textarea box, to no avail. I know this is garbage, but this is the kind of stuff I've been trying.
you guessed it, no luck
Code:
preg_match('@<a\s+[^>]*?\bhref="([^"]+)"[^>]*?>[^>]*?</a>[^>]*?</textarea>@is', $incoming, $matches);
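One possible approach, not taken from the thread: do it in two steps, first grabbing the contents of the textarea and then looking for the first link inside that text only. This is a rough sketch under assumptions: the sample $incoming value is made up, and it assumes the textarea holds raw (not entity-encoded) HTML.

```php
<?php
// Hypothetical input standing in for $incoming from the post above.
$incoming = '<a href="/outside">outside</a>'
          . '<textarea name="code" rows="4">'
          . 'Some text <a href="http://example.com/first">first</a> '
          . 'and <a href="http://example.com/second">second</a>'
          . '</textarea>';

$first_link = null;

// Step 1: capture everything between <textarea ...> and </textarea>.
// The s modifier matters here because "." must be able to cross newlines.
if (preg_match('@<textarea\b[^>]*>(.*?)</textarea>@is', $incoming, $area)) {
    // Step 2: find the first href inside the captured text only.
    if (preg_match('@<a\s+[^>]*?\bhref="([^"]+)"@i', $area[1], $matches)) {
        $first_link = $matches[1];
    }
}

echo $first_link, "\n";
```

If the textarea contents turn out to be entity-encoded (&lt;a href=...), running html_entity_decode() on $area[1] before step 2 should help. The single-pattern attempt above probably struggles because [^>]*? cannot cross the ">" of any tag that sits between the link and </textarea>.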