Why Challenge/Response Systems are the future of email and the biggest threat that spam has ever faced
Cedric Beust, July 9th 2004
I have resisted installing a Challenge/Response System (CRS) for the longest time. The reason is that I didn’t want to put too much burden on people who send me emails, and so far, Bayesian filtering had been doing a pretty good job at protecting me from spam.
Things have changed. Bayesian filtering is still doing a very good job, but the nature of spam is changing in subtle ways that make filtering less and less adequate each day. For example, I have noticed that the spam I receive increasingly has the following characteristics:
- They are very short (a couple of lines).
- They are not always in English (usually not a problem for Bayesian filtering as long as the message uses Unicode, since the filtering is based purely on the frequencies and proximity of words).
- They contain undisplayable characters (for example, Chinese or Russian text when you don’t have that language pack installed on your machine).
- They carry a virus payload. Bayesian filters typically don’t add negative scores to such emails, so you need to complement them with another type of filtering tool, such as SpamAssassin or virus removal software.
After a while, I realized that I was spending too much time checking my junk email folder in search of false positives (sign of a very bad Bayesian filter) or receiving actual spam in my inbox (sign that either the filtering level is too low, or that spam is getting harder to assess).
After this realization, I started reconsidering my view on Challenge/Response Systems, analyzing their drawbacks and advantages, and I think I have reached a decent compromise that should provide me with close-to-optimal protection against spam.
This article describes my thoughts so far.
What makes spammers different
There is a fundamental difference between a spammer and you.
100% of the messages a spammer writes are sent to unknown individuals, whereas 99% of the emails you write are sent to people you know.
Think about this carefully because this simple observation is what gives us a deadly and final weapon against spam.
Unless you belong to a rare profession, I bet that most of the emails you send every day go to people who are either in your address book or in your inbox folder. Conversely, most of the emails that you receive come from a well-identified person.
How can we capitalize on this observation? By creating a CRS on your email account that respects these constraints.
But Challenge/Response Systems are a pain!
Yes and no. There are two different aspects we need to consider:
1) Who are they a pain to?
2) What makes them a pain?
1) Who suffers from Challenge/Response Systems?
Answer: people who email you for the first time.
Based on the observation above, we know that these people are very few in number.
If you follow the guideline for an effective CRS listed further down in this article, the CRS you install on your email account will be absolutely transparent for 99% of the people you correspond with regularly. I have installed such a system recently on my email account and I can guarantee you that nobody around me (coworkers, friends, family, temporary email pals, etc…) has even noticed.
So let’s go back to this 1% stranger who is trying to email you. He will receive a challenge in response to his email, and whether he will decide to send the email anyway or drop the idea of emailing you altogether depends on several factors:
- Is the email he is trying to send you very important?
- Is the Challenge too complicated, unclear or too time-consuming?
Obviously, we can’t do much about the first point since it’s entirely dependent on what this correspondent is trying to tell you, but we can address the second point by trying to create a CRS that people won’t mind responding to.
2) Why are current Challenge/Response Systems painful?
Because they make a fundamental mistake: they assume that spammers actually read the responses to the emails they send.
Most spammers use bogus or one-time-only email accounts. If you think about it, it makes sense: they are going to be deluged by mailer-daemon messages and angry people, so they are much better off ignoring these responses altogether.
Here is another often overlooked fact: the senders of spam are less and less the real originators of the spam.
There are hundreds of “spam powerhouses” that make a business of just sending bulk email. Whenever someone decides to resort to spam to sell their merchandise, they are typically going to hire the services of these bulk senders so they don’t have to worry about the technicalities (and the legal implications) of sending spam. This is one more reason why responses to the spam email account are never read.
Even if we assume that spammers do indeed read responses to their spam (maybe to add the email address of the unfortunate responder to a “validated email address list”, another urban legend in my opinion), it purely and simply doesn’t make any economic sense to process them. The spammer is much better off letting his Web site handle orders or irate customers and focusing on his next batch of ten million emails rather than adding your email address to a “golden list” of email addresses (which will most likely be protected by a spam filter anyway, so he can’t even be sure that a different spam will reach you).
The point I am getting at is this:
A Challenge doesn’t have to ask the sender to do something clever.
No need to add a keyword to the subject, to go to a web site to confirm your identity or, even worse, to identify a distorted GIF image so you can prove you are not a robot.
With that in mind, what would be the simplest action you could ask of a legitimate sender? Simply replying to the Challenge email. That’s right. A simple reply.
If you respond to the email, you are validated. Period.
Creating the ultimate Challenge/Response System
So here are my suggestions to create a very effective Challenge/Response System:
1) The CRS must implement a white list. Anybody in this white list can send you email directly without receiving a Challenge. I also recommend that the white list recognize patterns, so that you can add entire domains (such as your company’s, e.g. *@bea.com). Optionally, you might want to implement a black list containing email addresses that are automatically bounced (by the way, the best way to do this is to simulate a Mailer-Daemon response).
2) Before turning the system on, populate the white list with your entire address book and the content of your Inbox and various other folders of interest (and more importantly, all the mailing-lists you are subscribed to). This is very important so that your current email activities go undisturbed. For all the people you communicate with on a regular basis, your installing of the CRS will go absolutely unnoticed.
3) When an email that is not whitelisted arrives in your inbox, you bounce it back to the sender, adding a note saying that this account is protected against spam. And here comes the important part: all they need to do is reply to that email, and their original email will then be delivered.
4) The CRS must be able to deal with bounced emails gracefully. The simplest way to do this is to add a specific header to any Challenge email. Then, any time you receive an email that is either whitelisted or that contains the specific header, you forward it to your inbox. This header should also be used to avoid an infinite loop between the CRS and the non-whitelisted sender.
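The four rules above can be sketched in a few lines of Python. This is an illustration only, not the procmail implementation discussed in the next section. The header name X-CRS-Challenge, the in-memory pending table, the handle function and the use of the In-Reply-To header to recognize a reply are all assumptions made for the example.

```python
from email.message import EmailMessage
from email.utils import make_msgid, parseaddr
from fnmatch import fnmatch

# Hypothetical header name used to mark our own Challenge emails (rule 4).
CHALLENGE_HEADER = "X-CRS-Challenge"

# Rule 1: a white list that understands patterns such as whole domains.
WHITELIST = ["*@bea.com", "friend@example.org"]

# Senders waiting to be validated: address -> challenge id and held emails.
pending = {}


def is_whitelisted(address):
    address = address.lower()
    return any(fnmatch(address, pattern.lower()) for pattern in WHITELIST)


def build_challenge(original, sender, my_address, challenge_id):
    challenge = EmailMessage()
    challenge["From"] = my_address
    challenge["To"] = sender
    challenge["Subject"] = "Re: " + original.get("Subject", "")
    challenge["Message-ID"] = challenge_id
    challenge[CHALLENGE_HEADER] = "1"
    challenge.set_content(
        "This account is protected against spam.\n"
        "Simply reply to this email and your original message will be delivered.\n"
    )
    return challenge


def handle(message, my_address):
    """Decide what to do with one incoming email.

    Returns (action, emails): "deliver" with the emails to put in the inbox,
    "challenge" with the Challenge to send, or "hold" with nothing.
    """
    sender = parseaddr(message.get("From", ""))[1].lower()

    # Rule 4: whitelisted mail, or mail carrying our header, goes straight
    # to the inbox and is never challenged, which prevents loops.
    if is_whitelisted(sender) or CHALLENGE_HEADER in message:
        return "deliver", [message]

    # Rule 3, second half: a plain reply to the Challenge validates the
    # sender (assumption: the reply references the Challenge's Message-ID).
    entry = pending.get(sender)
    if entry and message.get("In-Reply-To", "").strip() == entry["challenge_id"]:
        del pending[sender]
        WHITELIST.append(sender)
        return "deliver", entry["held"]

    # Rule 3, first half: unknown sender, hold the email and challenge once.
    if entry is None:
        challenge_id = make_msgid()
        pending[sender] = {"challenge_id": challenge_id, "held": [message]}
        return "challenge", [build_challenge(message, sender, my_address, challenge_id)]
    entry["held"].append(message)
    return "hold", []
```

A real deployment would persist the white list and the held emails on disk and would seed the white list from the address book and existing folders before going live, as rule 2 requires.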
Implementing the CRS
I implemented such a CRS with simple procmail rules and I am running it as we speak, but it’s too early to disclose its implementation since it most likely has quite a few bugs. There are fewer than fifty lines of procmail rules and the implementation includes a couple of external tools written in very simple shell to handle the white list.
Keep in mind that you can’t judge the effectiveness of a CRS by the amount of spam you are receiving, since by definition, it will be zero.
The only way you can know your system works is when you receive email from an unknown sender. An even better way to assess the effectiveness of your CRS is to check the logs regularly and:
- Identify when you received an email from a legitimate sender.
- But this sender didn’t respond to the challenge.
In such a case, I recommend contacting the sender directly and asking them why they didn’t respond to the challenge in order to determine how you could improve your system.
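The log check described above could be automated along these lines. The log format is purely hypothetical (a chronological list of timestamp, event and sender, with events named "challenged" and "validated"), as are the function name and the three-day grace period. The script only lists the senders who never replied; deciding which of them were legitimate remains a manual step.

```python
from datetime import datetime, timedelta


def unanswered_challenges(events, now, grace=timedelta(days=3)):
    """Return senders who were challenged but never validated.

    events is assumed to be in chronological order, each entry being
    (timestamp, kind, sender) with kind "challenged" or "validated".
    """
    waiting = {}
    for when, kind, sender in events:
        if kind == "challenged":
            # Remember only the first Challenge sent to each sender.
            waiting.setdefault(sender, when)
        elif kind == "validated":
            waiting.pop(sender, None)
    # Ignore recent Challenges: the sender may still reply.
    return sorted(s for s, when in waiting.items() if now - when >= grace)


events = [
    (datetime(2004, 7, 1), "challenged", "a@example.org"),
    (datetime(2004, 7, 1), "challenged", "b@example.org"),
    (datetime(2004, 7, 2), "validated", "a@example.org"),
    (datetime(2004, 7, 8), "challenged", "c@example.org"),
]
print(unanswered_challenges(events, now=datetime(2004, 7, 9)))
```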
Of course, this kind of log-combing should be left to implementers of the CRS (me) and not to final customers, which is exactly what I intend to do in the coming weeks.
I will post a follow-up when I have more data on this Really Simple Challenge/Response System (RSCRS).