
catch the hyperlinks by class attribute

Posted: Sat Aug 23, 2008 3:27 pm
by deepblue_2006
Hi.
I need to extract the address (the href value) from hyperlinks, but only from hyperlinks that have the class="link" attribute.

Sample HTML source:

Code:

<a href="www.sample1.com" class="link" onmousedown="return click();">sample text 1</a>
<a href="www.sample2.com" onmousedown="return click();">sample text 2</a>
<a href="www.sample3.com" class="link" onmousedown="return click();">sample text 3</a>
<a href="www.sample4.com" onmousedown="return click();">sample text 4</a>
The result must look like this:

Code:

http://www.sample1.com
http://www.sample3.com
Can you help me?
Best regards.

Re: catch the hyperlinks by class attribute

Posted: Sat Aug 23, 2008 4:03 pm
by prometheuzz

Code:

<?php
$tests = array(
  '<a href="www.sample1.com" class="link" onmousedown="return click();">sample text 1</a>',
  '<a href="www.sample2.com" onmousedown="return click();">sample text 2</a>',
  '<a href="www.sample3.com" onmousedown="return click();" class="link">sample text 3</a>',
  '<a href="www.sample4.com" onmousedown="return click();">sample text 4</a>',
  '<a class="link" href="www.sample5.com" onmousedown="return click();">sample text 5</a>'
);
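foreach($tests as $t) {
  // The lookahead (?=.*?class="link") only succeeds if class="link" occurs
  // somewhere in the string; the rest of the pattern then captures the href value.
  if(preg_match('/^(?=.*?class="link").*?href="([^"]++)"/', $t, $matches)) {
    echo "$matches[1]\n";
  }
}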
?>

Re: catch the hyperlinks by class attribute

Posted: Sun Aug 24, 2008 2:45 am
by GeertDD
I'd be careful with multiple .*? segments in one regex. Here is why: http://regexadvice.com/blogs/regex_jedi ... 2100_.aspx

Sometimes it is better to do things in multiple steps:
  • Match all the opening <a> elements and put them in an array.
  • Loop through them looking for the class="link".
  • If that class is found then extract the href.
I know this may sound a bit more complicated, but I would not be surprised if it still turned out to be faster.

Note that depending on the source/context you're parsing, the second step could be dealt with by a strpos() or stripos() function.
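
A rough sketch of the three steps above, using stripos() for the second step. The function name extractLinksByClass() is made up for this example, and the code assumes the attribute is written exactly as class="link" with double quotes, that no attribute value contains a ">" character, and that the hrefs carry no scheme (so "http://" is prepended on output to match the requested result). It has not been benchmarked against the single-regex version.

```php
<?php
// Hypothetical helper (not from the thread): returns the href of every
// <a> tag whose opening tag contains class="<name>".
function extractLinksByClass(string $html, string $class = 'link'): array
{
    $links = array();

    // Step 1: collect all opening <a ...> tags.
    // Assumes no attribute value contains a ">" character.
    preg_match_all('/<a\s[^>]*>/i', $html, $tags);

    foreach ($tags[0] as $tag) {
        // Step 2: plain substring test instead of a regex.
        // Assumes the attribute is written exactly as class="link".
        if (stripos($tag, 'class="' . $class . '"') === false) {
            continue;
        }
        // Step 3: extract the href value from this one tag only.
        if (preg_match('/\shref="([^"]+)"/i', $tag, $m)) {
            $links[] = $m[1];
        }
    }
    return $links;
}

$html = <<<'HTML'
<a href="www.sample1.com" class="link" onmousedown="return click();">sample text 1</a>
<a href="www.sample2.com" onmousedown="return click();">sample text 2</a>
<a href="www.sample3.com" onmousedown="return click();" class="link">sample text 3</a>
<a class="link" href="www.sample5.com" onmousedown="return click();">sample text 5</a>
HTML;

foreach (extractLinksByClass($html) as $href) {
    // Assumption: the hrefs have no scheme, so add one for display.
    echo "http://$href\n";
}
```

Because each regex only ever looks at a single tag, there is no chain of .*? segments that can backtrack across the whole document. The exact-substring check would miss variations such as class='link' or class="link active", so it may need loosening depending on the source being parsed.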