Call up a .PHP file from .html

boujin
Forum Newbie
Posts: 8
Joined: Sat Feb 11, 2006 4:52 am

Post by boujin »

Sami | Please use code tags where appropriate when posting code. Read: Posting Code in the Forums (http://forums.devnetwork.net/viewtopic.php?t=21171)


I'm trying to use a user-agent blocker written in PHP on Yahoo web hosting. Unfortunately, Yahoo won't run the following script inside an .html file to call my PHP file robots.php (unless I rename my whole site from .html to .php, which I won't do), and they don't let me have an .htaccess file either:

<?PHP include "/robots.php"; ?>

So I tried using JavaScript instead. With it I do receive the email warning me that a blocked user agent has accessed my site, but e403.html does not show up afterwards.

<script language="JavaScript" type="text/javascript" src="/robots.php">
</script>

So what can I do so that e403.html shows up after I get the email?

This is the content of robots.php (without the two dashed lines):

------------------------------------------------------

Code: Select all

<?php

     $browser = array ("^crescent",

"wbdbot",
"Web Downloader",
"webauto",
"webbandit",
"WebCapture",
"webcollector",
"WebCopier",
"webdevil",
"WebEMailExtrac.*",
"WebFetch",
"webfetcher",
"WebFountain",
"webhook",
"webminer",
"WebMirror",
"webmole",
"WebReaper",
"WebSauger",
"WebSense",
"website",
"websnake",
"Webster",
"WebStripper",
"websucker",
"webweasel",
"WebWhacker",
"WebZIP",
"Wget",);

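     // Read request details from $_SERVER; bare globals such as
     // $HTTP_USER_AGENT only exist when register_globals is enabled.
     $agent    = $_SERVER['HTTP_USER_AGENT'] ?? '';
     $referrer = $_SERVER['HTTP_REFERER'] ?? '';

     $punish = 0;
     foreach ($browser as $val) {
          // Entries such as "^crescent" and "WebEMailExtrac.*" are regex
          // patterns, so match with preg_match() rather than strstr().
          if (preg_match('/' . $val . '/', $agent)) {
               $punish = 1;
               break;
          }
     }

     if ($punish) {

          $msg  = "robots.php detected the following banned browser agent errors:\n";
          $msg .= "Host: " . $_SERVER['REMOTE_ADDR'] . "\n";
          $msg .= "Agent: $agent\n";
          $msg .= "Referrer: $referrer\n";
          $msg .= "Document: " . $_SERVER['SERVER_NAME'] . $_SERVER['REQUEST_URI'] . "\n";

          $headers  = "X-Priority: 1\n";
          $headers .= "From: Robots.php <pfs@pfs.net>\n";
          $headers .= "X-Sender: <pfs@pfs.net>\n";

          // The subject has to stay on a single line.
          mail ("pfs@pfs.net", "robots.php BANNED BROWSER AGENT ERROR from pfs@pfs.net", $msg, $headers);

          // include takes a filesystem path, so resolve e403.html
          // against the document root instead of "/".
          include $_SERVER['DOCUMENT_ROOT'] . "/e403.html";

          exit;
     }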

?>
------------------------------------------------------------------


onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Post by onion2k »

The problem is with linking to an external PHP file. When you reference a file from an HTML page, e.g.:

<script language="JavaScript" type="text/javascript" src="/robots.php">
</script>

... the user's browser requests that file after it has downloaded the HTML. It then receives whatever robots.php outputs, but it will never display it, because the browser treats the response as JavaScript rather than as a page. Elements like this are for pulling in things like JavaScript and CSS, not HTML. If you don't have access to .htaccess and you can't put code in your .html files, then the only option left is to convert all your files to .php.

Alternatively, you could move to a better hosting company.
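
Since the browser runs whatever robots.php returns as JavaScript, one workaround is to have robots.php print a redirect script instead of including e403.html. This is only a rough sketch: `blockedAgentScript()` is a made-up helper name, and it will only affect clients that actually execute JavaScript, which most downloaders and spam bots do not.

```php
<?php
// Hypothetical helper (not part of the original robots.php): builds the
// JavaScript that robots.php would print when the agent is banned.
function blockedAgentScript(bool $blocked, string $errorPage = '/e403.html'): string
{
    if (!$blocked) {
        // Nothing to run for allowed visitors.
        return '';
    }
    // json_encode() yields a safely quoted JavaScript string literal.
    return 'window.location.replace('
        . json_encode($errorPage, JSON_UNESCAPED_SLASHES)
        . ');';
}

// In robots.php, after the agent check, something like this would replace
// the include of e403.html ($punish is the flag from the original script):
// header('Content-Type: application/javascript');
// echo blockedAgentScript($punish === 1);
```

The email would still be sent as before; only the last step changes from including HTML (which the browser discards) to emitting script (which it runs).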
nickvd
DevNet Resident
Posts: 1027
Joined: Thu Mar 10, 2005 5:27 pm
Location: Southern Ontario

Post by nickvd »

Um... it seems that you intend to block a bunch of robots from accessing your site. Try this: http://www.searchengineworld.com/robots ... torial.htm

You don't need PHP to do something that is built into the server.
boujin
Forum Newbie
Posts: 8
Joined: Sat Feb 11, 2006 4:52 am

Post by boujin »

They are web downloaders, spammers and the like. I don't really mind the robots, since Yahoo gives me 400 GB each month! I read your link, but the robots.txt protocol is only followed by "good" robots, not "bad" ones, so it doesn't apply to downloaders and spammers.

Isn't there any other way of banning these user agents with some other language? Maybe .NET?

How about other top web hosting companies? Does Microsoft offer web hosting?
Last edited by boujin on Sun Feb 12, 2006 3:50 am, edited 1 time in total.
nickvd
DevNet Resident
Posts: 1027
Joined: Thu Mar 10, 2005 5:27 pm
Location: Southern Ontario

Post by nickvd »

If your host doesn't support PHP, it's doubtful that they would support any server-side scripting. The best suggestion would be to switch hosts.
boujin
Forum Newbie
Posts: 8
Joined: Sat Feb 11, 2006 4:52 am

Post by boujin »

It does actually support PHP when it is called from .html with <form METHOD="POST"> to send emails, but when I use the PHP script from my first post it won't bring up e403.html.

It also supports SSI, which works fine.

The problem seems to be placing PHP code inside .html files.
AKA Panama Jack
Forum Regular
Posts: 878
Joined: Mon Nov 14, 2005 4:21 pm

Post by AKA Panama Jack »

boujin wrote: They are web downloaders, spammers and the like. I don't really mind the robots, since Yahoo gives me 400 GB each month! I read your link, but the robots.txt protocol is only followed by "good" robots, not "bad" ones, so it doesn't apply to downloaders and spammers.

Isn't there any other way of banning these user agents with some other language? Maybe .NET?

How about other top web hosting companies? Does Microsoft offer web hosting?
Actually it should still block those robots...

Place this inside your main directory and name it robots.txt

Code: Select all

User-agent: ^crescent
Disallow: /

User-agent: wbdbot
Disallow: /

User-agent: Web Downloader
Disallow: /
 
User-agent: webauto
Disallow: /

User-agent: webbandit
Disallow: /

User-agent: WebCapture
Disallow: /

User-agent: webcollector
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: webdevil
Disallow: /

User-agent: WebEMailExtrac.*
Disallow: /

User-agent: WebFetch
Disallow: /

User-agent: webfetcher
Disallow: /

User-agent: WebFountain
Disallow: /

User-agent: webhook
Disallow: /

User-agent: webminer
Disallow: /

User-agent: WebMirror
Disallow: /

User-agent: webmole
Disallow: /

User-agent: WebReaper
Disallow: /

User-agent: WebSauger
Disallow: /

User-agent: WebSense
Disallow: /

User-agent: website
Disallow: /

User-agent: websnake
Disallow: /

User-agent: Webster
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: websucker
Disallow: /

User-agent: webweasel
Disallow: /

User-agent: WebWhacker
Disallow: /

User-agent: WebZIP
Disallow: /

User-agent: Wget
Disallow: /
This should keep all of them out of your site.
sheila
Forum Commoner
Posts: 98
Joined: Mon Sep 05, 2005 9:52 pm
Location: Texas

Post by sheila »

boujin is right. Reading and obeying robots.txt is voluntary, and "bad robots" simply ignore it.
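
Because compliance is voluntary, any real blocking has to happen on the server for each request. Here is a minimal sketch of such a check, assuming the page is actually executed as PHP. `isBannedAgent()` is a made-up name, and `$banned` is just a short excerpt of the list posted earlier in the thread.

```php
<?php
// Hypothetical helper: true when the user-agent string matches any banned
// pattern. Each pattern is treated as a regular expression fragment.
function isBannedAgent(string $agent, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        // "~" is the regex delimiter, so escape it if a pattern contains one.
        $regex = '~' . str_replace('~', '\~', $pattern) . '~';
        if (preg_match($regex, $agent) === 1) {
            return true;
        }
    }
    return false;
}

// Short excerpt of the banned list from the first post.
$banned = ['^crescent', 'WebZIP', 'Wget'];

// The header is client-supplied and may be absent, so default to ''.
$agent = $_SERVER['HTTP_USER_AGENT'] ?? '';

if (isBannedAgent($agent, $banned)) {
    // Refuse the request outright instead of hoping the client behaves.
    http_response_code(403);
    exit('Forbidden');
}
```

Keep in mind that the User-Agent header is whatever the client chooses to send, so this only stops tools that identify themselves honestly.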
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Post by onion2k »

sheila wrote: boujin is right. Reading and obeying robots.txt is voluntary, and "bad robots" simply ignore it.
Since when has wget been bad?
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

onion2k wrote:
sheila wrote: boujin is right. Reading and obeying robots.txt is voluntary, and "bad robots" simply ignore it.
Since when has wget been bad?
Hmm yeah. If wget was blocked by a lot of sites I'd be pretty annoyed :lol: I use it a lot.

I can see how it could be annoying to certain web hosts though since it will spider recursively if you tell it to.
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Post by Roja »

onion2k wrote:
sheila wrote: boujin is right. Reading and obeying robots.txt is voluntary, and "bad robots" simply ignore it.
Since when has wget been bad?
wget respects the robots.txt file by default. You can tell it to do otherwise, but that's the *person* being bad, not the program. :)
bimo
Forum Contributor
Posts: 100
Joined: Fri Apr 16, 2004 11:18 pm
Location: MD

Post by bimo »

If you put an .htaccess file with the line

Code: Select all

AddType application/x-httpd-php .php .html
in your home directory, that should work, unless Yahoo doesn't allow .htaccess files.
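
If the host does honour that line, each .html page could then pull in the blocker at the very top. A rough sketch, assuming robots.php sits in the site root; `sitePath()` is a made-up helper, needed because include takes a filesystem path rather than a URL path like "/robots.php".

```php
<?php
// Hypothetical helper: turns a site-relative path such as "/robots.php"
// into a filesystem path under the document root.
function sitePath(string $documentRoot, string $relative): string
{
    return rtrim($documentRoot, '/') . '/' . ltrim($relative, '/');
}

// At the very top of an .html page that the server now parses as PHP:
// include sitePath($_SERVER['DOCUMENT_ROOT'], '/robots.php');
```

The include has to come before any HTML output so that the script can still exit early for a banned agent.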