which is valid but your regex won't match it because of the space before the closing bracket. So allow for it in your regex by adding \s*.
#2 By default, a dot matches any character except newlines. Your HTML contains newlines, though. Add the s modifier to the regex to make the dot match newlines as well.
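To make both points concrete, here is a minimal sketch. The sample HTML and both patterns are made up for illustration; they are not the regex from the original question.

```php
<?php
// Hypothetical sample: a space before the closing bracket and a newline in the link text.
$html = "<a href=\"index.php?task=view\" >First\nlink</a>";

// No \s* before > and no s modifier: the space alone already prevents a match.
$strict = '~<a href="([^"]*)">(.*?)</a>~';

// \s* allows whitespace before >, and the s modifier lets the dot match newlines.
$loose = '~<a href="([^"]*)"\s*>(.*?)</a>~s';

$strictResult = preg_match($strict, $html);     // 0
$looseResult  = preg_match($loose, $html, $m);  // 1, $m[2] holds "First\nlink"
```

Dropping only the s modifier from the second pattern makes it fail again on this sample, because the dot then stops at the newline.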
nirali35 wrote:Thanks... it did worked... but it just parsed only the first link.
First of all, preg_match() will only match once and then quit. You figured that out yourself already and used preg_match_all() instead. Good.
Secondly, look closely at the regex. It will match the whole string at once because the <a> tag is surrounded by .*? and .*+ which will consume all text before the first link and all text after it. See?
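A small sketch of the effect, assuming a made-up two-link sample and simplified patterns (not the ones from the question):

```php
<?php
// Hypothetical sample with two links.
$html = '<p><a href="a.php">A</a> and <a href="b.php">B</a></p>';

// Wrapped in .*? and .*+ : the first match swallows the whole string,
// so preg_match_all() has nothing left to search for a second link.
$wrapped = '~.*?<a href="([^"]*)"\s*>.*+~s';
$wrappedCount = preg_match_all($wrapped, $html, $m1);  // 1, $m1[1] is ['a.php']

// Without the wrappers, every link is matched on its own.
$plain = '~<a href="([^"]*)"\s*>~';
$plainCount = preg_match_all($plain, $html, $m2);      // 2, $m2[1] is ['a.php', 'b.php']
```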
GeertDD wrote:...
Secondly, look closely at the regex. It will match the whole string at once ...
Ha, I keep forgetting that preg_match(...) does not have to match the entire string! Java's String.matches(...) is the cause of this (and my own amnesia, of course)!
; )
nirali35 wrote:Getting closer my friend. Thank you very much
But two things:
1. I don't want all the links on the page.
2. This expression doesn't give us a title!
1. Could you clarify the conditions for which links to match and which not?
2. Fair enough. I guess I removed a bit too much earlier on. This regex will match the link title/text again, in $matches[2]. It is recommended to trim() these values, though.
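For readers without the original regex at hand, a rough sketch of the idea could look like this. The pattern and the sample HTML are assumptions for illustration only, with the URL captured in group 1 and the link text in group 2:

```php
<?php
// Hypothetical sample: link text padded with spaces and newlines.
$html = '<a href="a.php"> First </a><a href="b.php">
  Second
</a>';

// Group 1: href value, group 2: link text. The s modifier lets the text span lines.
$pattern = '~<a\s[^>]*href="([^"]*)"[^>]*>(.*?)</a>~is';
preg_match_all($pattern, $html, $matches);

$urls   = $matches[1];                     // ['a.php', 'b.php']
$titles = array_map('trim', $matches[2]);  // ['First', 'Second']
```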
Some things are better handled by other PHP functions. Using strpos() in this case does not seem like bad practice to me. It allows you to use a simpler, faster regex to match all the links.
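A minimal sketch of the strpos() approach, assuming the URLs have already been collected into an array by a simple link regex. The URL list is invented, and requiring both substrings is an assumption; swap && for || to accept either one.

```php
<?php
// Hypothetical list of URLs, e.g. taken from $matches[1] of a generic link regex.
$urls = [
    'index.php?task=view&itemid=123',
    'index.php?itemid=123&task=view',
    'index.php?task=edit&itemid=123',
    'about.php',
];

// Keep only URLs containing both substrings, in any order.
// Note the strict !== false: strpos() returns 0 for a hit at the start.
$wanted = array_values(array_filter($urls, function ($url) {
    return strpos($url, 'task=view') !== false
        && strpos($url, 'itemid=123') !== false;
}));
```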
You could implement it in the regex itself as well, of course. Below is a modified regex that will only match links that contain either "task=view" or "itemid=123". That is what you want, right?
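As a sketch only (this is not the regex originally posted, and the sample HTML is made up), an "either substring" filter built into the pattern might look like this:

```php
<?php
// Hypothetical sample: two links that qualify and one that does not.
$html = '<a href="index.php?task=view&amp;id=1">One</a>'
      . '<a href="index.php?itemid=123">Two</a>'
      . '<a href="about.php">Three</a>';

// The href value must contain task=view or itemid=123 somewhere between the quotes.
$pattern = '~<a\s[^>]*href="([^"]*(?:task=view|itemid=123)[^"]*)"[^>]*>(.*?)</a>~is';
preg_match_all($pattern, $html, $matches);

$urls   = $matches[1];                     // the first two hrefs
$titles = array_map('trim', $matches[2]);  // ['One', 'Two']
```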
You've been given a couple of demos on how to match URLs and titles. You have not asked anything about these regexes, so I assume everything is clear to you. ; )
So, why don't you give this one a try yourself? If you get stuck, you can post back here, ok?
GeertDD wrote:...
Below is a modified regex that will only match links that contain either "task=view" or "itemid=123". That is what you want, right?
...
I believe he's only interested in URLs that contain both substrings, but with all of the examples given to him/her, surely s/he is able to adjust it to his/her needs!
; )
prometheuzz wrote:I believe he's only interested in URLs that contain both substrings, but with all of the examples given to him/her, surely s/he is able to adjust it to his/her needs!
; )
Yeah, you are probably right. One more argument to go for the strpos() check instead of trying to make the regex do all the work. Why? Because you don't know for sure which part comes first within the query string: task or itemid? The regex would have to check both possibilities. Let me just quickly try to cook something up (not tested), just as an example of how ugly it gets.
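One possible shape of such a regex, as a hedged sketch rather than a recommendation. The sample HTML is invented, and the pattern simply spells out both orders of the two substrings:

```php
<?php
// Hypothetical sample: both orders of the query string, plus one link missing itemid.
$html = '<a href="index.php?task=view&amp;itemid=123">A</a>'
      . '<a href="index.php?itemid=123&amp;task=view">B</a>'
      . '<a href="index.php?task=view">C</a>';

// Both substrings are required, so each possible order needs its own alternative.
$pattern = '~<a\s[^>]*href="('
         . '[^"]*(?:task=view[^"]*itemid=123|itemid=123[^"]*task=view)[^"]*'
         . ')"[^>]*>(.*?)</a>~is';
preg_match_all($pattern, $html, $matches);

$titles = array_map('trim', $matches[2]);  // ['A', 'B']
```

With a third required substring the alternation would need all six orderings, which is where a few strpos() calls stay far more readable.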