scraping data

qadeer_ahmad
Forum Newbie
Posts: 11
Joined: Thu Aug 14, 2008 3:17 am

scraping data

Post by qadeer_ahmad »

Hi all developers,
Most of our projects involve scraping data. We use different approaches, most commonly cURL with preg_match, but all of them depend on the HTML source, so if someone changes the HTML structure, the code stops working.
Is there a perfect solution to this problem?

Thanks
Darhazer
DevNet Resident
Posts: 1011
Joined: Thu May 14, 2009 3:00 pm
Location: HellCity, Bulgaria

Re: scraping data

Post by Darhazer »

Yes, and it's called Human :mrgreen:
qadeer_ahmad
Forum Newbie
Posts: 11
Joined: Thu Aug 14, 2008 3:17 am

Re: scraping data

Post by qadeer_ahmad »

:(
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: scraping data

Post by josh »

Hand-craft a resilient regex. Or use highly "exclusive" pattern matching instead of parsing the tags (for example, a phone number anywhere is a phone number; it doesn't matter what kind of HTML tags it is wrapped in).

Hint: Come up with example pages that illustrate the scenarios you are worried will break your code. Then test your code against those example pages until your software proves it is robust enough not to break on them anymore. Then test some more.
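
A minimal sketch of both suggestions, written in Python purely for illustration (the same pattern should carry over to preg_match). Everything named here is an assumption and not from the thread: the phone format (North American style, ten digits), the helper extract_phones, and the three made-up fixture pages. The markup is thrown away first and only the text is matched, so a layout change alone should not break the extraction.

```python
import re
from html import unescape

# Assumed format: 555-123-4567, 555.123.4567 or (555) 123-4567.
# Lookarounds are used instead of \b so a number glued to a word still matches.
PHONE_RE = re.compile(r"(?<!\d)\(?\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}(?!\d)")
TAG_RE = re.compile(r"<[^>]+>")


def extract_phones(html):
    """Return every phone number found in the page text, as digits only."""
    # Drop all tags, then decode entities such as &amp; or &nbsp;.
    text = unescape(TAG_RE.sub("", html))
    return [re.sub(r"\D", "", match) for match in PHONE_RE.findall(text)]


# Hypothetical example pages: the same number in three different layouts.
FIXTURES = [
    "<p>Call us: <b>555-123-4567</b></p>",
    "<table><tr><td>Phone</td><td>(555) 123-4567</td></tr></table>",
    '<div class="contact"><span>555</span>.<span>123</span>.<span>4567</span></div>',
]

for page in FIXTURES:
    print(extract_phones(page))
```

Each new layout that breaks the scraper in production can be added to the fixture list, so the set of example pages grows with the scenarios actually seen.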
qadeer_ahmad
Forum Newbie
Posts: 11
Joined: Thu Aug 14, 2008 3:17 am

Re: scraping data

Post by qadeer_ahmad »

Yes, that could work. Another idea is to follow the [man]DOM[/man] to navigate to a specific point.
We could create a separate function for each tag, with each function handling all the possible conditions. That could become a library for handling such things.
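
A rough sketch of the DOM-style idea, again in Python for illustration only and using just the standard library's html.parser. The class TextByClass, the helper find_text, and the "price" class name are hypothetical. The assumption is that the target element keeps a stable class attribute even when the tags around it change, which will not hold for every site.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TextByClass(HTMLParser):
    """Collect the text of every element whose class list contains `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.depth = 0      # greater than 0 while inside a matching element
        self.results = []
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1          # nested tag inside the match
        elif self.wanted in classes:
            self.depth = 1           # entering a matching element
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:          # leaving the matching element
            self.results.append("".join(self._buf).strip())

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)


def find_text(html, wanted):
    parser = TextByClass(wanted)
    parser.feed(html)
    return parser.results


old_layout = '<div><span class="price">19.99</span></div>'
new_layout = '<table><tr><td><b class="price big"><i>19.99</i></b></td></tr></table>'
print(find_text(old_layout, "price"), find_text(new_layout, "price"))
```

The lookup keys on an attribute and ignores tag names and nesting, so it survives a move from a div to a table cell. If the site renames the class, this breaks just as a regex would.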