Scraping HTML, yuk


Post by $var » Mon Feb 08, 2010 5:36 pm

Howdy, I have been tasked with my first HTML scrape and am a bit foggy about how to do it.
In short, I need to pull out 1 of 3 tables, plus part of the content within a table.

Here is what I'm working with. I need to keep everything below the white 'Canada...' headline:
http://64.246.64.33/merge/tsnform.aspx? ... index.aspx

I tried and failed at using strpos to grab the table.

How would YOU do this?

As always, I am humbled by the wit and skill of DevNet.
$var
Forum Contributor
 
Posts: 309
Joined: Thu Aug 18, 2005 8:30 pm
Location: Toronto

Re: Scraping HTML, yuk

Post by John Cartwright » Mon Feb 08, 2010 9:59 pm

I would not, because it is against their Terms of Use.

A quick excerpt of the relevant Terms of Use:

You may not transmit or send messages, inquiries, scripts, "spiders," automated query programs, web crawlers, robotic programs, robots, or other similar devices to the Website or its associated server, or otherwise use or access, electronically or manually, this Website or its associated server, along or with others, in any manner which: (i) "scrapes," copies, collects, stores, transmits or reproduces any Materials or data displayed on the Website;
if ($toBe || $notToBe) echo 'That is the question';

ATTENTION: Please read the Forum Rules, and take the Forum Tour before posting!
John Cartwright
Extreme Guru Moderator
 
Posts: 10583
Joined: Tue Dec 23, 2003 3:10 am
Location: Toronto

Re: Scraping HTML, yuk

Post by $var » Tue Feb 09, 2010 10:18 am

Hmm... well, I know we're partners and scrape other feeds. But I wouldn't want anyone to get in trouble for assisting me with this one.
Thanks anyway.

Re: Scraping HTML, yuk

Post by John Cartwright » Tue Feb 09, 2010 2:22 pm

$var wrote:Hmm... well, I know we're partners and scrape other feeds. But I wouldn't want anyone to get in trouble for assisting me with this one.
Thanks anyway.


If that is true, why don't they give you an RSS feed or something designed for this kind of thing?
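
For what it's worth, consuming a feed like that takes only a few lines. A rough sketch, assuming an RSS 2.0 feed and the SimpleXML extension; the XML below is made up for illustration and is not anything the provider actually publishes:

```php
<?php
// Hypothetical feed content, inlined so the sketch runs on its own.
$rss = <<<XML
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First item</title><link>http://example.com/1</link></item>
    <item><title>Second item</title><link>http://example.com/2</link></item>
  </channel>
</rss>
XML;

// simplexml_load_string() returns false when the XML cannot be parsed.
$feed = simplexml_load_string($rss);
if ($feed === false) {
    die('Could not parse the feed');
}

// Collect each item's title and print it with its link.
$titles = array();
foreach ($feed->channel->item as $item) {
    $titles[] = (string) $item->title;
    echo htmlspecialchars((string) $item->title), ' - ',
         htmlspecialchars((string) $item->link), "\n";
}
```

Swap the inline string for the real feed contents if the partner ever provides one.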

Re: Scraping HTML, yuk

Post by $var » Tue Feb 09, 2010 3:37 pm

My only guess is that the provider is very far behind in their technology and doesn't have simple feeds available for syndication of this type.
Problem solved anyhow, no scraping involved.

