-
deepblue_2006
- Forum Newbie
- Posts: 1
- Joined: Sat Aug 23, 2008 3:23 pm
Hi.
I need to extract the address (the href value) from hyperlinks, but only from
hyperlinks that have the class="link" attribute.
Sample HTML source:
Code:
<a href="www.sample1.com" class="link" onmousedown="return click();">sample text 1</a>
<a href="www.sample2.com" onmousedown="return click();">sample text 2</a>
<a href="www.sample3.com" class="link" onmousedown="return click();">sample text 3</a>
<a href="www.sample4.com" onmousedown="return click();">sample text 4</a>
The result should look like this:
Code:
http://www.sample1.com
http://www.sample3.com
Can you help me?
Best regards.
-
prometheuzz
- Forum Regular
- Posts: 779
- Joined: Fri Apr 04, 2008 5:51 am
Code:
<?php
$tests = array(
    '<a href="www.sample1.com" class="link" onmousedown="return click();">sample text 1</a>',
    '<a href="www.sample2.com" onmousedown="return click();">sample text 2</a>',
    '<a href="www.sample3.com" onmousedown="return click();" class="link">sample text 3</a>',
    '<a href="www.sample4.com" onmousedown="return click();">sample text 4</a>',
    '<a class="link" href="www.sample5.com" onmousedown="return click();">sample text 5</a>'
);
foreach ($tests as $t) {
    // The lookahead (?=.*?class="link") requires class="link" somewhere in the
    // string; the rest of the pattern then captures the href value in group 1.
    if (preg_match('/^(?=.*?class="link").*?href="([^"]++)"/', $t, $matches)) {
        echo "$matches[1]\n";
    }
}
?>
-
GeertDD
- Forum Contributor
- Posts: 274
- Joined: Sun Oct 22, 2006 1:47 am
- Location: Belgium
I'd watch out for multiple .*? segments in one regex. Here is why:
http://regexadvice.com/blogs/regex_jedi ... 2100_.aspx
Sometimes it is better to do things in multiple steps:
- Match all the opening <a> elements and put them in an array.
- Loop through them looking for the class="link".
- If that class is found then extract the href.
I know this may sound a bit more complicated, but I would not be surprised if it turned out to be faster anyway.
Note that depending on the source/context you're parsing, the second step could be dealt with by a strpos() or stripos() function.
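The three steps above could be sketched roughly like this. This is only a minimal sketch, not tested against real-world markup: the function name extractLinkHrefs is made up for illustration, and it assumes PHP 7+, double-quoted attributes, and no literal ">" inside attribute values.

```php
<?php
// Hypothetical helper (name invented for this example): returns the href
// values of all <a> tags that contain the literal text class="link".
function extractLinkHrefs(string $html): array
{
    $hrefs = array();

    // Step 1: collect every opening <a ...> tag.
    // Assumes no attribute value contains a literal ">".
    preg_match_all('/<a\s[^>]*>/i', $html, $tags);

    foreach ($tags[0] as $tag) {
        // Step 2: plain case-insensitive substring check, no regex needed.
        if (stripos($tag, 'class="link"') === false) {
            continue;
        }
        // Step 3: extract the href from the tags that passed the check.
        if (preg_match('/\shref="([^"]+)"/i', $tag, $m)) {
            $hrefs[] = $m[1];
        }
    }
    return $hrefs;
}

$html = <<<'HTML'
<a href="www.sample1.com" class="link" onmousedown="return click();">sample text 1</a>
<a href="www.sample2.com" onmousedown="return click();">sample text 2</a>
<a href="www.sample3.com" onmousedown="return click();" class="link">sample text 3</a>
<a class="link" href="www.sample5.com" onmousedown="return click();">sample text 5</a>
HTML;

// Prints www.sample1.com, www.sample3.com and www.sample5.com, one per line.
echo implode("\n", extractLinkHrefs($html)), "\n";
```

Because step 2 is a plain substring check, it will miss tags such as class="link active" and would also accept an attribute like data-class="link"; if the source can contain those, a stricter check on the class attribute is needed.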