How to prevent my website being retrieved by someone else ?
- christian_phpbeginner
- Forum Contributor
- Posts: 136
- Joined: Sat Jun 03, 2006 2:43 pm
- Location: Java
Hi, could you please tell me how I can prevent someone else from using file_get_contents() to retrieve my page?
Thanks a lot,
Chris
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
- christian_phpbeginner
- Forum Contributor
- Posts: 136
- Joined: Sat Jun 03, 2006 2:43 pm
- Location: Java
Hi everybody...
I was only asking because I found a website that can't be retrieved with file_get_contents(). What I actually meant to ask is: how do you file_get_contents() a website like that?
For example, the page below can't be retrieved:
http://www.goalzz.com/main.aspx?region= ... pdate=true
I wonder why ?
Thanks !
- feyd
- Neighborhood Spidermoddy
- Posts: 31559
- Joined: Mon Mar 29, 2004 3:24 pm
- Location: Bothell, Washington, USA
They've chosen to basically be jerks and filter which user agents they will allow. Using cURL, I can easily get the page. The following works as well.
Code:
[feyd@home]>php -r "ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6'); var_dump(get_headers('http://www.goalzz.com/main.aspx?region=-1&area=6&update=true'));"
array(11) {
[0]=>
string(15) "HTTP/1.1 200 OK"
[1]=>
string(17) "Connection: close"
[2]=>
string(35) "Date: Wed, 09 Aug 2006 18:16:32 GMT"
[3]=>
string(25) "Server: Microsoft-IIS/6.0"
[4]=>
string(21) "X-Powered-By: ASP.NET"
[5]=>
string(26) "X-AspNet-Version: 1.1.4322"
[6]=>
string(62) "Set-Cookie: ASP.NET_SessionId=cpbavi551sri2w453kvvrx45; path=/"
[7]=>
string(22) "Cache-Control: private"
[8]=>
string(38) "Expires: Tue, 09 Aug 2005 18:16:32 GMT"
[9]=>
string(45) "Content-Type: text/html; charset=Windows-1252"
[10]=>
string(21) "Content-Length: 78043"
}
- christian_phpbeginner
- Forum Contributor
- Posts: 136
- Joined: Sat Jun 03, 2006 2:43 pm
- Location: Java
feyd wrote: They've chosen to basically be jerks and filter which user agents they will allow. Using cURL, I can easily get the page.
Hi feyd....
Thanks for the great info. I'm downloading libcurl now, and I'll read the docs later to install it.
Chris
Last edited by christian_phpbeginner on Wed Aug 09, 2006 2:22 pm, edited 1 time in total.
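For the original question, the kind of user-agent filtering feyd describes could be sketched on the server side roughly as follows. This is only an illustration under stated assumptions: the helper name looks_like_script() and the list of substrings are hypothetical, it assumes PHP 7 or later, and, as feyd's example shows, anyone can get past such a check by sending a browser-like User-Agent.

```php
<?php
// Hypothetical helper: decides whether a User-Agent string looks like a script.
// The substrings below are illustrative assumptions, not a vetted blocklist.
function looks_like_script(string $userAgent): bool
{
    // PHP's HTTP stream wrapper sends no User-Agent header unless the
    // user_agent ini setting (or a stream context option) provides one.
    if (trim($userAgent) === '') {
        return true;
    }
    foreach (['curl', 'wget', 'libwww', 'php'] as $needle) {
        if (stripos($userAgent, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// Usage at the top of a page; skipped when the script runs from the command line.
if (PHP_SAPI !== 'cli' && looks_like_script($_SERVER['HTTP_USER_AGENT'] ?? '')) {
    header('HTTP/1.1 403 Forbidden');
    exit('Forbidden');
}
```

A check like this only discourages casual scraping. It does not protect content that is publicly served.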
- christian_phpbeginner
- Forum Contributor
- Posts: 136
- Joined: Sat Jun 03, 2006 2:43 pm
- Location: Java
The Ninja Space Goat wrote: is that where you live... in ur avatar? (Sorry... off-topic, but that's cool if it is)
Hi Ninja,
Thank you, but I don't live there. That's my cabin. It's not a fancy or expensive one, because we built it with our own hands. I miss it now that I'm away from it. Can't wait for next summer...
Last edited by christian_phpbeginner on Wed Aug 09, 2006 2:32 pm, edited 1 time in total.
Re: How to prevent my website being retrieved by someone else ?
Opening up an old thread.
I tried cURL and ini_set() with generic user agents, but I cannot retrieve the contents of the following URL. Any clues appreciated!
http://www.petitscailloux.com/Follow.as ... detail.htm
Thanks a lot !
Re: How to prevent my website being retrieved by someone else ?
idy wrote: Opening up an old thread.
I tried cURL and ini_set() with generic user agents, but I cannot retrieve the contents of the following URL. Any clues appreciated!
http://www.petitscailloux.com/Follow.as ... detail.htm
Thanks a lot !
No problem for me. All is fine.
Re: How to prevent my website being retrieved by someone else ?
Barzouk - would you mind posting the code you used ?
Thanks a lot !
Re: How to prevent my website being retrieved by someone else ?
idy wrote:Barzouk - would you mind posting the code you used ?
Thanks a lot !
Try it again and see what message you get
Re: How to prevent my website being retrieved by someone else ?
OK, I tried:
Code:
$url = "http://www.petitscailloux.com/Follow.aspx?sUrl=http://www.seloger.com/199986/16271207/detail.htm";
ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6');
echo file_get_contents($url);
and the result was:
file_get_contents(http://www.petitscailloux.com/Follow.as ... detail.htm) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 500 Internal Server Error
I then tried:
Code:
$ch = curl_init();
$timeout = 5; // set to zero for no timeout
curl_setopt($ch, CURLOPT_URL, 'http://www.petitscailloux.com/Follow.aspx?sUrl=http://www.seloger.com/199986/16271207/detail.htm');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
echo $file_contents = curl_exec($ch);
curl_close($ch);
which got me the following:
Runtime Error
Description: An application error occurred on the server. The current custom error settings for this application prevent the details of the application error from being viewed remotely (for security reasons). It could, however, be viewed by browsers running on the local server machine.
Details: To enable the details of this specific error message to be viewable on remote machines, please create a <customErrors> tag within a "web.config" configuration file located in the root directory of the current web application. This <customErrors> tag should then have its "mode" attribute set to "Off".
<!-- Web.Config Configuration File -->
<configuration>
<system.web>
<customErrors mode="Off"/>
</system.web>
</configuration>
Notes: The current error page you are seeing can be replaced by a custom error page by modifying the "defaultRedirect" attribute of the application's <customErrors> configuration tag to point to a custom error page URL.
<!-- Web.Config Configuration File -->
<configuration>
<system.web>
<customErrors mode="RemoteOnly" defaultRedirect="mycustompage.htm"/>
</system.web>
</configuration>
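One way to dig further into a 500 like the one above is to pass the user agent through a stream context and ask PHP to return the response body even on error statuses. The sketch below is not a confirmed fix for this particular site, and the helper names (browser_like_options, status_code, fetch_with_diagnostics) are hypothetical. It assumes PHP 7.1 or later and relies on the HTTP wrapper's 'ignore_errors' option and the $http_response_header variable. Note also that the cURL attempt above does not set CURLOPT_USERAGENT, so the server may still be seeing cURL's default.

```php
<?php
// Hypothetical helper: HTTP context options carrying a browser-like User-Agent.
function browser_like_options(string $userAgent, float $timeout = 5.0): array
{
    return [
        'http' => [
            'method'        => 'GET',
            'user_agent'    => $userAgent,
            'timeout'       => $timeout,
            // Return the body even on 4xx/5xx instead of failing to open the stream.
            'ignore_errors' => true,
        ],
    ];
}

// Hypothetical helper: extract the numeric code from a status line such as
// "HTTP/1.1 500 Internal Server Error"; returns null if the line does not match.
function status_code(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Needs network access, so it is defined here but not called.
function fetch_with_diagnostics(string $url, string $userAgent): array
{
    $context = stream_context_create(browser_like_options($userAgent));
    $body    = @file_get_contents($url, false, $context);
    // PHP's HTTP wrapper fills $http_response_header in the calling scope.
    // If redirects were followed, the first entry is the first response's status line.
    $headers = $http_response_header ?? [];
    return [
        'status' => isset($headers[0]) ? status_code($headers[0]) : null,
        'body'   => $body,
    ];
}
```

Calling fetch_with_diagnostics($url, $userAgent) and printing the 'status' and 'body' entries shows what the server actually returned, which can help tell a user-agent block apart from a genuine application error on the remote side.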