fopen works on the first few web pages and then fails!
Posted: Sat Jan 16, 2010 9:30 pm
Hello,
I am writing a script to harvest the content of the Apple user forum for research I am doing.
I am using fopen to open a series of web pages. This works for the first couple of web pages and then stops displaying the content for the remaining pages. Below is the detailed description.
The URL for the forum page that lists the threads is: http://discussions.apple.com/forum.jspa ... 34&start=0. The last number (here 0) is incremented by 15 to move to the next page of threads (http://discussions.apple.com/forum.jspa ... 4&start=15), and so on.
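The paging scheme described above can be sketched as a small helper. This is only an illustration: `page_urls` and the example.com base URL are hypothetical stand-ins, since the real forum URL is abbreviated in this post.

```php
<?php
// Build the list of page URLs by appending a "start" offset that
// grows by $perPage (15 threads per page, as described above).
// page_urls() is a hypothetical helper, not part of my script.
function page_urls(string $base, int $pages, int $perPage = 15): array
{
    $urls = [];
    for ($page = 0; $page < $pages; $page++) {
        $urls[] = $base . ($page * $perPage);
    }
    return $urls;
}

// Placeholder base URL (assumption); substitute the real forum URL.
foreach (page_urls('http://example.com/forum.jspa?start=', 3) as $url) {
    echo $url, "\n";
}
```

Generating the offsets up front keeps the paging arithmetic separate from the fetching code.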
I am trying to open the pages one at a time, read the content and then locate the information I need.
My partial code looks like this:
$counter = 0;
while ($counter < 1000) // 1000 is just an arbitrary number
{
    echo $subforum_url = "http://discussions.apple.com/forum.jspa ... 334&start=" . $counter;
    $posthandle = fopen($subforum_url, "r");
    $i = 0;
    $postcontents = '';
    if ($posthandle) {
        while (!feof($posthandle)) {
            $postcontents .= fgets($posthandle, 8192);
            echo $postcontents;
            echo "<br />";
            $i++;
        }
    }
    $counter += 15;
}
This is followed by code that matches the thread titles, the dates they were posted, and so on.
My problem is that everything works very well for the first couple of pages. I can read the content and then all the preg_match code works.
After that, all my code does is echo the URL; it doesn't seem to retrieve any content.
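One way to narrow this down would be to make each failure visible instead of silently skipping the page. The sketch below is an assumption-laden illustration, not a confirmed fix: `fetch_page` is a hypothetical helper, the 30-second timeout is an arbitrary choice, and it uses `stream_get_contents` in place of the `fgets` loop. It checks the return value of `fopen`, logs the warning PHP recorded, and closes each handle once the page has been read.

```php
<?php
// Fetch one page and report why it failed, if it did.
// fetch_page() is a hypothetical helper, not part of my script.
function fetch_page(string $url, int $timeoutSeconds = 30): ?string
{
    // The timeout only applies to http:// URLs; the value is arbitrary.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // Suppress the warning here and log it explicitly below.
    $handle = @fopen($url, 'r', false, $context);
    if ($handle === false) {
        $error = error_get_last();
        error_log("fopen failed for $url: " . ($error['message'] ?? 'unknown error'));
        return null;
    }

    // Read the whole response in one call, then release the handle.
    $contents = stream_get_contents($handle);
    fclose($handle);

    return $contents === false ? null : $contents;
}
```

Calling `fetch_page` for each page URL, with a short `sleep(1)` between requests, would show whether the later pages fail at `fopen` or come back empty. Whether the server is throttling repeated requests is only a guess at this point.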
Do you know what's going on?
Is there any workaround?
Thank you so much.