Good morning, everyone.
I want to get all the <a> tags on an HTML page, including all the characters between <a> and </a>.
Could anyone show some demo code using regex or the DOM, please?
Sorry, this is my first time scraping a page.
Thank you very much. Your code will be very helpful to me, since I am only just learning to scrape HTML pages.
God bless.
I want to get <a> tags, how?
Re: I want to get <a> tags, how?
You could use a DOM class (e.g., DOMDocument) and then search by tag name.
Then there's always regular expressions. Might be better, hard to say.
preg_match_all('#<a\s.*?</a>#is', $text, $matches);
print_r($matches[0]);
- prometheuzz
Re: I want to get <a> tags, how?
tasairis wrote:
You could use a DOM class (e.g., DOMDocument) and then search by tag name.
Then there's always regular expressions. Might be better, hard to say.
preg_match_all('#<a\s.*?</a>#is', $text, $matches);
print_r($matches[0]);

Parsing (X)HTML should be done with an HTML parser. When it runs into improperly formed HTML, a regex might cause the entire file to be parsed incorrectly, while a true HTML parser can (and probably will) recover from those mistakes.
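
For anyone who wants to try the parser route suggested in this thread, here is a rough sketch using DOMDocument and getElementsByTagName. The helper name extractAnchors and the sample markup are made up for illustration; nobody in the thread posted this code, so treat it as one possible way to do it rather than the definitive one.

```php
<?php
// Hypothetical helper (not from the thread): returns each <a> element,
// including its tags and everything between them, as an HTML string.
function extractAnchors(string $html): array
{
    // loadHTML() rejects an empty string, so bail out early.
    if ($html === '') {
        return [];
    }

    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of printing them;
    // malformed markup is still parsed as well as libxml can manage.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $anchors = [];
    foreach ($doc->getElementsByTagName('a') as $node) {
        // saveHTML($node) serializes just this element and its children.
        $anchors[] = $doc->saveHTML($node);
    }
    return $anchors;
}

// Sample input, invented for illustration.
$text = '<p><a href="/one">First <b>link</b></a> and <a>bare</a></p>';
print_r(extractAnchors($text));
```

One difference worth noting: the regex posted earlier requires whitespace right after `<a`, so it skips an attribute-less `<a>` like the second one in the sample, while the DOM version picks up both. If you only need the link text or the URL, `$node->textContent` and `$node->getAttribute('href')` should do it in place of saveHTML().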