Detect and react to referer spammers?

Posted: Tue May 15, 2007 11:02 pm
by JAB Creations
When a referer spammer makes a request (a spammer who sends false referers in the hope that you have a public stats script they can easily get their URL into), is there a way to hold the request, visit the referer URL, check whether a reciprocal anchor exists, and then decide what to do based on the result?

Posted: Tue May 15, 2007 11:07 pm
by maliskoleather
you could, but the weight and time delay that would put on the server would make it completely pointless.

Posted: Tue May 15, 2007 11:13 pm
by JAB Creations
Cool, how could I achieve this pointless goal then? :twisted:

Posted: Tue May 15, 2007 11:27 pm
by volka
What if the user uses a bookmark? -> no referer
What if the link is on a page that requires a login?
What if the link isn't present as the exact string you're looking for?
... many other "What if"s ...
As soon as a determined spammer realizes what you're doing he/she will send "valid" referers, no sweat.

Posted: Tue May 15, 2007 11:34 pm
by JAB Creations
I have to get used to the fact that everyone assumes they know what I'd actually do with the result.

As always: please help me figure out how to do this, or please don't waste your time reminding me of things I am already aware of but cannot figure out how to politely put in my signature, because everyone has to PM me about it.

Posted: Tue May 15, 2007 11:40 pm
by maliskoleather
essentially, you need to load that page's content (fopen or file_get_contents works on some sites, but you probably want to look into cURL), then scan the contents with preg_match... if it matches, continue on...

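Code:

// Fetch the referring page (needs allow_url_fopen; cURL is more robust).
$handle = @fopen('http://www.foo.com', 'r');
$data = $handle ? stream_get_contents($handle) : false;
if ($handle) {
    fclose($handle);
}

// Escape the host name so the dots match literally.
$res = ($data !== false) && preg_match('/' . preg_quote('www.mysite.com', '/') . '/i', $data);

if (!$res) {
    die('we do not allow referrer spamming');
}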
that's just a basic example though... like I said, fopen only works on some URLs... cURL would be a much better method.

keep in mind this will *drastically* slow down whatever script you implement it on.. the request has to wait while the remote page is fetched and then scanned in full... that could easily add 20+ seconds in load/processing time... even more if your site has high traffic.

and then there are so many other issues... if the page requires logins.. if the page you're scanning blocks the PHP user agent... all kinds of things that make it so I really can't see the point in even TRYING this
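
For anyone who wants to try the cURL route anyway, here is a rough sketch of how it might look. It is not a tested drop-in: fetch_referer_page() and has_reciprocal_link() are made-up helper names, the 5 second timeout and the user agent string are arbitrary choices, and it assumes the curl and dom extensions are enabled. It looks for an actual anchor pointing at your host rather than the bare string, and the timeout puts a cap on the delay mentioned above.

```php
<?php
// Sketch only: both function names are hypothetical helpers,
// not part of PHP or cURL.

/**
 * Download a page with cURL, giving up after $timeout seconds.
 * Returns the body as a string, or null on any failure.
 */
function fetch_referer_page(string $url, int $timeout = 5): ?string
{
    // The Referer header is supplied by the client, so only accept web URLs.
    if (!preg_match('#^https?://#i', $url)) {
        return null;
    }

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,      // follow redirects...
        CURLOPT_MAXREDIRS      => 3,         // ...but not forever
        CURLOPT_CONNECTTIMEOUT => $timeout,  // cap the time spent connecting
        CURLOPT_TIMEOUT        => $timeout,  // cap the whole transfer
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; referer-check)',
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return ($body !== false && $status === 200) ? $body : null;
}

/**
 * Check whether $html contains an <a href> whose host is $host.
 */
function has_reciprocal_link(string $html, string $host): bool
{
    if (trim($html) === '') {
        return false;
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $linkHost = parse_url($anchor->getAttribute('href'), PHP_URL_HOST);
        if (is_string($linkHost) && strcasecmp($linkHost, $host) === 0) {
            return true;
        }
    }
    return false;
}
```

Usage would be something like calling fetch_referer_page($_SERVER['HTTP_REFERER']) when that header is set, passing a non-null result to has_reciprocal_link($page, 'www.mysite.com'), and deciding what to do from there. A missing referer (a bookmark, for example) has to be handled separately, and none of this helps against a spammer who sends a referer that really does link to you.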