Spam wars: greylisting
22 Jun 2005
in the early evening
Matt Winckler
Evidently there’s a new anti-spam player on the scene: greylisting. Here’s the concept: when a mail server receives a message from a sender it hasn’t seen before, it records the IP address of the sending machine, the sender’s email address, and the recipient’s email address. The receiving server then immediately rejects the message with a temporary error, 451, which asks the sender to try again later. Most properly-configured mailservers will requeue the message and retry within a few minutes, or 4 hours at most, so the delay is not terribly significant. As long as the message is resent within 12 hours, and the three pieces of information collected above match, the email is allowed on to its intended recipient. Further messages from that sender to that recipient are then allowed through without the delay.
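To make the bookkeeping concrete, here is a rough sketch of the decision a greylisting server makes for each incoming message. It is only an illustration of the idea described above, not how any particular mail server implements it: the `Greylist` class, its `check` method, and the in-memory storage are all made up for the example, and real implementations typically also enforce a minimum wait before accepting a retry, which is left out here.

```python
import time

RETRY_WINDOW = 12 * 60 * 60  # the 12-hour retry window, in seconds


class Greylist:
    """Hypothetical in-memory greylist keyed on (IP, sender, recipient)."""

    def __init__(self, retry_window=RETRY_WINDOW, clock=time.time):
        self.retry_window = retry_window
        self.clock = clock    # injectable so the sketch can be tested
        self.pending = {}     # triplet -> time it was first seen
        self.allowed = set()  # triplets that have already passed

    def check(self, ip, sender, recipient):
        """Return the SMTP reply code to use: 250 (accept) or 451 (retry later)."""
        triplet = (ip, sender, recipient)
        now = self.clock()

        # Already passed once: no further delay.
        if triplet in self.allowed:
            return 250

        # Seen before and retried in time: let it through from now on.
        first_seen = self.pending.get(triplet)
        if first_seen is not None and now - first_seen <= self.retry_window:
            del self.pending[triplet]
            self.allowed.add(triplet)
            return 250

        # First attempt (or the retry came too late): record it and defer.
        self.pending[triplet] = now
        return 451
```

The first call to `check` for a given triplet returns 451; a second call with the same three values inside the window returns 250, and so does every call after that. Change any one of the three values and the process starts over.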
The theory behind this is that it’s not economical for spammers to retry sending mail, because they send hojillions of messages to a plethora of mailboxes that may or may not exist, and therefore they get tons of bounces. Retrying every one of those failed messages would take a lot of resources, and might not increase the spammers’ rate of return.
The problem is that this system doesn’t seem to be perfect. I’ve been trying to send an associate a message for a couple of days now, and all I get is the bounce; the message never goes through. This may be because my web host’s mail servers are configured incorrectly, or perhaps because the recipient mailserver is not set up to greylist properly, but either way it shows that this system is not immune to “friendly fire”. The most important quality I look for in a spam filter is an absolutely minimal false positive rate. My priorities dictate that it’s all right to have some false negatives; that is, it’s ok if a filter ends up letting some junk mail through. But false positives, when good mail gets tagged as junk (or, in this case, never gets delivered at all), are unacceptable. Greylisting seems to me to have a significantly higher false positive rate than something like a well-tuned Bayesian filter, which I can speak to from experience using Thunderbird’s.
Another problem I have with greylisting is that it seems to me like it would be trivially simple for spammers to analyze the responses they get back, grep them for error 451, and only resend those (as opposed to trying to resend every bounce they get). I think spamfighting is always going to be a cat-and-mouse game, but this particular “new weapon” for the good guys seems pretty weak as far as longevity goes.
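The reason this is so easy is that SMTP reply codes already separate temporary failures (codes starting with 4, like 451) from permanent ones (codes starting with 5, like a nonexistent mailbox). Any ordinary mail server makes this distinction when deciding what to requeue. A minimal sketch of that check, with made-up function names and sample reply lines:

```python
def is_temporary_failure(reply_line):
    """True for 4xx SMTP replies such as 451; False for anything else."""
    code = reply_line.strip()[:3]
    return code.isdigit() and code[0] == "4"


def messages_to_retry(results):
    """Given (message_id, reply_line) pairs, keep only the temporary failures."""
    return [msg_id for msg_id, reply in results if is_temporary_failure(reply)]


# Hypothetical replies for three messages.
results = [
    ("m1", "451 Greylisted, please try again later"),
    ("m2", "550 No such user here"),
    ("m3", "250 OK"),
]
retry = messages_to_retry(results)  # only "m1" qualifies
```

Only the message that got the 451 ends up in the retry list, so the cost of retrying scales with the number of greylisted messages, not with the total number of failures.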
