mimicking "save as" of web browser
Posted: Fri Sep 11, 2009 6:19 am
by vlinet2002
I wish to save web pages programmatically as HTML files on my local machine. For example, I want to save
http://www.example.com/1
http://www.example.com/2
http://www.example.com/3
as 1.html, 2.html, 3.html.
==
I tried the following code, but it is not working.
Code:
<?php
$file = fopen ("http://www.example.com/1", "r");
if (!$file) {
echo "<p>Unable to open remote file.\n";
exit;
}
while (!feof ($file)) {
// write to local file
}
fclose($file);
?>
Appreciate the forum's help. Thank you.
Re: mimicking "save as" of web browser
Posted: Fri Sep 11, 2009 6:29 am
by Mirge
Not to be rude, but saying "it is not working" is not helpful AT ALL.... please describe in detail the specific problem(s) you're running into.
Re: mimicking "save as" of web browser
Posted: Fri Sep 11, 2009 6:35 am
by vlinet2002
It gave the error message "Unable to open remote file."
I tried it with Firefox 3.0.14.
I have an Apache 2.2 server on my local machine.
I saved the file as "saveas.php" and ran it at
http://localhost/saveas.php
Re: mimicking "save as" of web browser
Posted: Fri Sep 11, 2009 7:14 am
by Mark Baker
And where's your code for "// write to local file"?
Re: mimicking "save as" of web browser
Posted: Fri Sep 11, 2009 8:31 am
by Eric!
On the problem of opening the remote file: are you sure the URL is correct? Open that URL in your browser and check that it is still the address shown once the page has loaded. If the problem persists with the real URL, check your php.ini and make sure allow_url_fopen is set to 1. You can check it from your code with echo ini_get('allow_url_fopen');
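A minimal sketch of that check, in case it helps. The function name urlFopenEnabled() is made up for this example. ini_get() hands back a string, so the sketch normalises it to a boolean before testing it.

```php
<?php
// urlFopenEnabled() is a hypothetical helper name, not a PHP built-in.
// It reports whether fopen() and file_get_contents() may open http:// URLs.
function urlFopenEnabled(): bool
{
    // ini_get() returns a string such as "1", "0", "" or "On";
    // FILTER_VALIDATE_BOOLEAN maps those to true or false.
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

echo urlFopenEnabled()
    ? "allow_url_fopen is on\n"
    : "allow_url_fopen is off\n";
```

If it reports "off", the setting has to be changed in php.ini (or the server configuration) and Apache restarted; it cannot be switched on from the script with ini_set().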
FYI - an easier way to get the URL contents is to use file_get_contents
Code:
$content = file_get_contents('http://www.google.com/');
if ($content !== false) {
// do something with the content
} else {
// an error happened
}
Your next problem is that you have not written any code to actually save the file. You'll need either a JavaScript popup for Save As, or to echo an HTML form where you can enter the name you want to save it under locally.
Re: mimicking "save as" of web browser
Posted: Sat Sep 12, 2009 3:43 am
by vlinet2002
Thank you for the tips. The following code worked.
Code:
<?php
$content = file_get_contents('http://www.google.com/');
if ($content !== false) {
print (" contents found");
echo $content;
$outfile = fopen("googlelocal.html", 'w') or die("can't open file");
fwrite($outfile, $content);
fclose($outfile);
echo "Contents written to file";
} else {
print ("contents not found");
}
?>
By the way, it works for some sites, but not for all. For example, it worked for the following sites:
http://edsitement.neh.gov/view_lesson_plan.asp?id=310
http://drupal.org/node/110224
It did not work for the following:
http://www.pallikalvi.in/SchoolReport.a ... 06,2,0,0,0
What could be the reasons?
Thank you.
Re: mimicking "save as" of web browser
Posted: Sat Sep 12, 2009 4:11 am
by markusn00b
file_put_contents() makes that a little less to write.
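A rough sketch of what that could look like for the pages in the first post. savePage() is a hypothetical helper name, and the example.com URLs are the placeholders from the question, not real pages.

```php
<?php
// savePage() is a hypothetical helper: fetch $source and write it to $target.
// Returns the number of bytes written, or false if either step fails.
function savePage(string $source, string $target)
{
    // @ hides the warning PHP raises on failure; the return value is checked instead.
    $content = @file_get_contents($source);
    if ($content === false) {
        return false;
    }
    // file_put_contents() opens, writes and closes the file in one call.
    return file_put_contents($target, $content);
}

// Usage with the placeholder URLs from the first post:
// foreach ([1, 2, 3] as $n) {
//     if (savePage("http://www.example.com/$n", "$n.html") === false) {
//         echo "Could not save page $n\n";
//     }
// }
```

This replaces the fopen/fwrite/fclose trio with a single call. The files are written relative to the script's working directory, so that directory must be writable by the web server.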
Re: mimicking "save as" of web browser
Posted: Sat Sep 12, 2009 1:37 pm
by Eric!
Some web sites prevent external scripts from parsing their contents by denying requests without a user-agent string.
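If a missing user-agent is the cause, one lighter option than raw sockets is to pass a stream context to file_get_contents. This is only a sketch: buildUserAgentContext() is a made-up helper name, the user-agent string is an arbitrary example, and it is a guess that the failing site rejects requests for this reason.

```php
<?php
// buildUserAgentContext() is a hypothetical helper name.
// It returns a stream context that sends a User-Agent header with HTTP requests.
function buildUserAgentContext(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent, // sent as the User-Agent header
            'timeout'    => 10,         // seconds before the read gives up
        ],
    ]);
}

// Usage (the URL is the placeholder from the first post):
// $context = buildUserAgentContext('Mozilla/5.0 (compatible; MyFetcher/1.0)');
// $content = file_get_contents('http://www.example.com/1', false, $context);
```

The third argument of file_get_contents accepts the context, so the rest of the saving code can stay as it is.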
You can open a socket connection and send the headers yourself using a more involved method such as fsockopen or cURL. There is a browser emulation class with some features that work like fopen with a URL, but I've not tried it:
http://www.bitfolge.de/index.php?s=befopen
These are more involved techniques and require some knowledge about HTTP headers and debugging skills.
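For the fsockopen route, a bare-bones sketch might look like the following. buildGetRequest() and fetchViaSocket() are hypothetical names, and the code assumes plain HTTP on port 80; it does not follow redirects or handle HTTPS.

```php
<?php
// buildGetRequest() and fetchViaSocket() are hypothetical helper names.

// Build the raw request text, including the User-Agent header.
function buildGetRequest(string $host, string $path, string $userAgent): string
{
    // HTTP/1.0 is used so the server does not reply with chunked encoding.
    return "GET $path HTTP/1.0\r\n"
         . "Host: $host\r\n"
         . "User-Agent: $userAgent\r\n"
         . "Connection: close\r\n"
         . "\r\n";
}

// Send the request over a socket and return the response body, or false on failure.
function fetchViaSocket(string $host, string $path, string $userAgent)
{
    $socket = @fsockopen($host, 80, $errno, $errstr, 10);
    if (!$socket) {
        return false;
    }
    fwrite($socket, buildGetRequest($host, $path, $userAgent));
    $response = stream_get_contents($socket);
    fclose($socket);
    if ($response === false) {
        return false;
    }
    // Headers and body are separated by the first blank line.
    $parts = explode("\r\n\r\n", $response, 2);
    return isset($parts[1]) ? $parts[1] : false;
}

// Usage (placeholder host and path from the first post):
// $body = fetchViaSocket('www.example.com', '/1', 'Mozilla/5.0 (compatible; MyFetcher/1.0)');
```

Dumping the full response instead of only the body is a quick way to see the status line and headers the server sends back, which helps when debugging a site that refuses the request.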