
Posted: Wed Jul 13, 2005 8:59 am
by Roja
bokehman wrote: Here it is in a function. In places it may seem like there is unnecessary code, but this is because different mailservers give different responses and this needs to be taken into consideration.
I think, honestly, the second half of the script is unnecessary. In theory, it is meant to test whether that specific user address exists, but in reality, it will be fairly unreliable.

- Many virtual hosting companies use a catchall setting ("yes" to any @domain.com), and sort it after the fact
- Many mail servers use milters or addons that accept all @domain.com SMTP transactions, and reject non-matches *after* the fact with a bounce (because spammers do exactly what your script does)

To be fair, the top half of the script is a fairly nice checkdns/mxrr function, but the bottom half is going to have enough false positives as to be unreliable. It's basically an fsockopen/manual implementation of VRFY, to get around the fact that everyone blocks VRFY (heh!), which is something many spammers already do.

So, nice for the top half, but I wouldn't bother with the bottom half. :)

Posted: Wed Jul 13, 2005 9:12 am
by bokehman
The bottom half doesn't prove an email exists but it does prove that the remote server will accept mail for it.

On the other hand if it returns false you know for sure the email address will be rejected.

On my business site I use it to throw a warning if validation fails but to do nothing on non-failure.

Since the check is done in the background nothing is lost in time by doing the additional check.
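To make the "warn on failure, do nothing otherwise" idea concrete, here is a minimal sketch in Python (not the PHP script under discussion). It assumes you already have the numeric reply code the remote server gave to RCPT TO; the function names are made up for illustration.

```python
def classify_rcpt_reply(code: int) -> str:
    """Map an SMTP RCPT TO reply code to 'accepted', 'rejected' or 'unknown'."""
    if code in (250, 251):
        # The server says it will take the mail. With a catchall or
        # accept-then-bounce setup the mailbox may still not exist.
        return "accepted"
    if 500 <= code <= 599:
        # Permanent failure, e.g. 550 for an unknown mailbox.
        return "rejected"
    # 4xx temporary failures (greylisting, busy server) and anything else
    # tell us nothing either way.
    return "unknown"


def should_warn(code: int) -> bool:
    """Warn the user only on an outright rejection; stay silent otherwise."""
    return classify_rcpt_reply(code) == "rejected"
```

Treating everything other than a 5xx reply as "no warning" keeps the check from punishing users whose mail server is merely slow or greylisting.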

Also, to be quite honest, anyone who has a catchall is crazy. Why would anyone want to waste their bandwidth receiving spam which will only go straight in the bin?

Posted: Wed Jul 13, 2005 9:22 am
by Roja
bokehman wrote:The bottom half doesn't prove an email exists but it does prove that the remote server will accept mail for it.
Correct. Which is very different from "This email is valid". Since you called it a validation function, I thought that should be mentioned.
bokehman wrote: Also, to be quite honest, anyone who has a catchall is crazy. Why would anyone want to waste their bandwidth receiving spam which will only go straight in the bin?
Not at all, and your script is the exact example of why it is done. If no one did so, it WOULD be a definitive way to determine if an email address exists. Because spammers started doing just that, the mail server developers had to come up with a solution. The response? Accept all, and AFTER accepting, then reject mails that aren't appropriate.

Also, it shifts the time impact back onto the spammer. By rejecting immediately, the spammer can script everything without having to check their mail. By delaying the rejection until after the accept, the spammer is forced to either check their mail, or accept a much lower "success" rate.

It's not crazy at all. It makes a ton of sense, and it's widely used for a reason.

Beyond that good reason, many people like to use throwaway accounts - spamfromslashdot@example.com becomes VERY handy for figuring out where spam will come from, and for people (like me) that own a domain, it's a very nice tactic for simple filtering.

Posted: Wed Jul 13, 2005 9:47 am
by bokehman
That is very strange logic and certainly not the logic of companies like Hotmail and AOL that reject mail for mailboxes that don't exist.

Also, there is no way a spammer could find email addresses by brute force against a mailserver. That is utter twaddle. Here is an example: Let's say an email address has 8 characters before the '@'. The legal characters are a-z (26 characters), 0-9 (10 characters) and '.' and '-' (2 characters), a total of 38 characters. Now raise that to the power of 8 for a length of 8 characters and the result is about 4.3 thousand billion.

And that is just to crack the 8-character email addresses at 1 measly domain.
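For anyone who wants to check the arithmetic, a quick sketch in Python (the function name is just for illustration; the 38-character alphabet and length of 8 are the figures from this post):

```python
def local_part_keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of exactly `length` characters."""
    return alphabet_size ** length


# a-z (26) + 0-9 (10) + '.' and '-' (2) = 38 characters
ALPHABET_SIZE = 26 + 10 + 2

keyspace = local_part_keyspace(ALPHABET_SIZE, 8)
print(f"{keyspace:,}")  # 4,347,792,138,496
```

This only counts exhaustive random guessing; it says nothing about guesses drawn from a list of likely names.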

Posted: Wed Jul 13, 2005 11:16 am
by Roja
bokehman wrote:That is very strange logic and certainly not the logic of companies like Hotmail and AOL that reject mail for mailboxes that don't exist.
I've worked at a national ISP and at an international network provider, and I've run my own webhosting company.

I assure you, it is common.
bokehman wrote:Also, there is no way a spammer could find email addresses by brute force against a mailserver. That is utter twaddle.
Perhaps you don't understand the scope of spamming today. It's over 60% of mail traffic on the internet by almost every study out there. They absolutely can and do brute force; however, they aren't nearly as simplistic as you make out.

They use known common username lists, use confirmed email addresses to determine likely patterns (firstname.lastname@corporate.example.com), trade lists with other spammers, and more. It's not about finding email addresses, it's about confirming them - similar to your script.

My experience in multiple real-world situations is that spam is out there, and it dwarfs non-spam.

Be careful about writing off spammers as limited and uncommon.

Spam is the rule. Real mail is the exception. Once you understand and grasp that, then the logic behind the setup of mail servers makes much more sense.

[updated/edited: I realized that my wording was rather inflammatory, and it didn't need to be. I've edited it, and I apologize.]

Posted: Wed Jul 13, 2005 1:15 pm
by bokehman
Roja wrote: They use known common username lists, use confirmed email addresses to determine likely patterns (firstname.lastname@corporate.example.com)
That's not a real brute force attack against a mailserver, as there is also educated guesswork involved, but obviously that's open to debate. What I really mean is it's not random interrogation of a mailserver to find out email addresses.

As one of the checks on my mail server (which I host), when a remote machine connects its IP address is logged. If that machine tries to deliver to a nonexistent mailbox, that is logged too. If the same IP does this more than 10 times in a 24 hour period it is banned for a month. While it is banned, if it does try to connect it receives a stealth response (i.e. absolutely no response at all). I do lots of filtering too, including SPF, PTR of the remote mailserver, address verification by connecting to the mailserver with authority for the sending email address, etc.
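The counting-and-banning rule can be sketched roughly like this in Python. It is an illustration only, not the actual mail server configuration: the class name is made up, timestamps are plain seconds, and "a month" is assumed to mean 30 days.

```python
from collections import defaultdict, deque

WINDOW = 24 * 60 * 60             # 24 hour counting window, in seconds
MAX_MISSES = 10                   # more than this many misses triggers a ban
BAN_SECONDS = 30 * 24 * 60 * 60   # "a month", assumed here to be 30 days


class BadRecipientTracker:
    """Bans IPs that keep trying nonexistent mailboxes (hypothetical helper)."""

    def __init__(self):
        self.misses = defaultdict(deque)  # ip -> timestamps of recent misses
        self.banned_until = {}            # ip -> time the ban expires

    def is_banned(self, ip, now):
        # A banned IP would get the "stealth" treatment: no response at all.
        return self.banned_until.get(ip, 0) > now

    def record_miss(self, ip, now):
        """Log one attempt on a nonexistent mailbox and ban if over the limit."""
        recent = self.misses[ip]
        recent.append(now)
        # Forget misses that have fallen out of the 24 hour window.
        while recent and recent[0] <= now - WINDOW:
            recent.popleft()
        if len(recent) > MAX_MISSES:
            self.banned_until[ip] = now + BAN_SECONDS
            recent.clear()
```

With these numbers the eleventh miss inside 24 hours triggers the ban, which matches "more than 10 times".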

Personally, I would rather reject the spam than receive and delete it. It's not just my bandwidth it would waste if I were to accept it; it's the bandwidth of the internet as a whole.

Posted: Wed Jul 13, 2005 1:40 pm
by Roja
bokehman wrote:That's not a real brute force attack against a mailserver
It is a brute force attack. They send thousands of tries a day to validate email addresses. Against a major ISP, that literally isn't even noticeable by someone watching the logs in real time. It's also against a mailserver.
bokehman wrote:What I really mean is it's not random
Correct - which in fact makes it worse. With a random attack, it would take them much longer, and be more noticeable.
bokehman wrote:interrogation of a mailserver to find out email addresses.
Right - not discovery, validation.
bokehman wrote: Personally, I would rather reject the spam than receive and delete it. It's not just my bandwidth it would waste if I were to accept it; it's the bandwidth of the internet as a whole.
Sure, and that goes directly to my point. If mail servers all responded consistently to the process your script uses, spam would increase. That's why they don't do so, that's why it's bad to rely on it in a script, and that's why I commented.

But once again, I repeat my first response: The top half of the script rocks!