Scraping HTML, yuk

$var
Forum Contributor
Posts: 317
Joined: Thu Aug 18, 2005 8:30 pm
Location: Toronto

Scraping HTML, yuk

Post by $var »

Howdy, I have been tasked with my first HTML scrape and am a bit foggy about how to do it.
In short, I need to extract one of three tables, plus part of the content within a table.

Here is what I'm working with. I need to keep everything below the white 'Canada...' headline:
http://64.246.64.33/merge/tsnform.aspx? ... index.aspx

I tried and failed at using strpos to grab the table.

How would YOU do this?

As always, I am humbled by the wit and skill of DevNet.
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Re: Scraping HTML, yuk

Post by John Cartwright »

I would not, because it is against their Terms of Use.

A quick excerpt of the relevant terms of use:
You may not transmit or send messages, inquiries, scripts, "spiders," automated query programs, web crawlers, robotic programs, robots, or other similar devices to the Website or its associated server, or otherwise use or access, electronically or manually, this Website or its associated server, along or with others, in any manner which: (i) "scrapes," copies, collects, stores, transmits or reproduces any Materials or data displayed on the Website;
$var
Forum Contributor
Posts: 317
Joined: Thu Aug 18, 2005 8:30 pm
Location: Toronto

Re: Scraping HTML, yuk

Post by $var »

Hmm... well, I know we're partners and scrape other feeds. But I wouldn't want anyone to get in trouble for assisting me with this one.
Thanks anyway.
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Re: Scraping HTML, yuk

Post by John Cartwright »

$var wrote: Hmm... well, I know we're partners and scrape other feeds. But I wouldn't want anyone to get in trouble for assisting me with this one.
Thanks anyway.
If that is true, why don't they give you an RSS feed or something designed for this kind of thing?
$var
Forum Contributor
Posts: 317
Joined: Thu Aug 18, 2005 8:30 pm
Location: Toronto

Re: Scraping HTML, yuk

Post by $var »

My only guess is that the provider is very far behind in their technology and doesn't have simple feeds available for this type of syndication.
Problem solved anyhow, no scraping involved.