Hi,
I wonder whether there is a script that would pull data from a page that is updated every few hours. The data is always in the same table, so I was thinking I would tell the program where the table or text begins and ends, and then it would save it to a txt file that I could use.
Thanks, G
Is there a script to do that ?
Re: Is there a script to do that ?
Scraping makes the baby bunnies cry.
Re: Is there a script to do that ?
Yes, this can easily be done with curl and preg_match.
Here are some good examples to check out on how to use the two libs:
http://www.php.net/manual/en/book.curl.php
http://www.php.net/manual/en/book.pcre.php
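As a rough sketch of the curl plus preg_match idea, assuming the table can be located by a fixed start and end marker: the helper names fetch_page and extract_between, the marker strings, and the sample HTML are all made up for illustration and are not part of either library.

```php
<?php
// Hypothetical helper: download a page with the curl extension.
function fetch_page(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Download failed: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical helper: return the text between two literal markers, or null if not found.
function extract_between(string $html, string $start, string $end): ?string
{
    // preg_quote escapes the markers so they are matched literally;
    // the 's' modifier lets '.' match newlines inside the table.
    $pattern = '#' . preg_quote($start, '#') . '(.*?)' . preg_quote($end, '#') . '#s';
    return preg_match($pattern, $html, $m) === 1 ? trim($m[1]) : null;
}

// Demo on an inline, made-up sample so no network access is needed.
$sample = '<p>intro</p><table id="rates"><tr><td>EUR</td><td>1.00</td></tr></table><p>footer</p>';
$table = extract_between($sample, '<table id="rates">', '</table>');
echo $table, PHP_EOL;
```

For a real page the last lines would become something like $table = extract_between(fetch_page($url), $start, $end); followed by file_put_contents('data.txt', $table); to save the result as a txt file. Running the script every few hours would be left to a scheduler such as cron.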
- Kieran Huggins
- DevNet Master
- Posts: 3635
- Joined: Wed Dec 06, 2006 4:14 pm
- Location: Toronto, Canada
Re: Is there a script to do that ?
simpleXML might be a better tool than regex in this case.
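A minimal sketch of that approach, with one caveat: SimpleXML on its own expects well-formed XML, which most real pages are not, so this version swaps in DOMDocument::loadHTML to parse the page and then hands the tree to SimpleXML via simplexml_import_dom. The function name table_to_lines, the table id "rates", and the sample HTML are assumptions for illustration only.

```php
<?php
// Hypothetical helper: turn the rows of one HTML table into tab-separated lines.
function table_to_lines(string $html, string $tableId): array
{
    libxml_use_internal_errors(true);   // real-world HTML is rarely valid; collect warnings quietly
    $dom = new DOMDocument();
    $dom->loadHTML($html);              // tolerant HTML parser
    libxml_clear_errors();

    $xml = simplexml_import_dom($dom);  // view the same tree through SimpleXML
    $lines = [];
    // $tableId is inserted into the XPath as-is, so it should be a trusted value.
    foreach ($xml->xpath("//table[@id='$tableId']//tr") as $row) {
        $cells = [];
        foreach ($row->xpath('td|th') as $cell) {
            $cells[] = trim((string) $cell);
        }
        $lines[] = implode("\t", $cells);
    }
    return $lines;
}

// Demo on an inline, made-up sample so no network access is needed.
$sample = '<html><body><table id="rates">'
        . '<tr><th>Currency</th><th>Rate</th></tr>'
        . '<tr><td>EUR</td><td>1.00</td></tr>'
        . '</table></body></html>';
$lines = table_to_lines($sample, 'rates');
echo implode(PHP_EOL, $lines), PHP_EOL;
```

Compared with a regex, this keeps working if attributes or whitespace inside the table change, as long as the table can still be found by its id.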
Re: Is there a script to do that ?
Thanks, but is there a program that does this, or do I have to learn to code or hire somebody to do it for me?
Re: Is there a script to do that ?
I have found this simple tutorial:
http://www.oooff.com/php-scripts/basic- ... arsing.php
but I don't know why it doesn't work. It looks like this:
<?php
$data = file_get_contents('http://search.msn.com/results.aspx?q=site%3Afroogle.com');
$regex = '/1-10 of (.+?) results/';
preg_match($regex,$data,$match);
var_dump($match);
echo $match[1];
?>
Then I tried to pull the page title and put this together:
<?php
$data = file_get_contents('http://www.najdi.si');
$regex = '/<title> (.+?) </title>/';
preg_match($regex,$data,$match);
var_dump($match);
echo $match[1];
?>
And it doesn't work either.
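One likely cause for the title attempt: the pattern uses / as its delimiter, so PHP treats the / inside </title> as the end of the pattern and rejects the rest as unknown modifiers. The literal spaces around (.+?) also demand spaces that most pages do not have. Here is a minimal sketch with # as the delimiter instead; the function name extract_title and the sample HTML are made up for illustration.

```php
<?php
// Hypothetical helper: return the contents of the first <title> element, or null.
function extract_title(string $html): ?string
{
    // '#' as delimiter so the '/' in </title> needs no escaping;
    // \s* tolerates optional whitespace, 'i' ignores case, 's' lets '.' span newlines.
    if (preg_match('#<title[^>]*>\s*(.*?)\s*</title>#is', $html, $match) === 1) {
        return $match[1];
    }
    return null; // no match: the caller decides what to do instead of reading $match[1] blindly
}

// Demo on an inline, made-up sample so no network access is needed.
$sample = "<html><head><TITLE>\n  Example page  \n</TITLE></head><body></body></html>";
$title = extract_title($sample);
echo $title ?? 'no title found', PHP_EOL;
```

With a live page the input would come from file_get_contents($url), which returns false on failure, so it is worth checking that result before passing it to preg_match.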