Technical Spam Solution Based on Whitelists+Sender ID
A reasonable, technical solution to end spam

Ok, we all get spam, we're all sick and tired of it, and we want it to end. Bayesian filters ameliorate the problem but have tended to fail lately because spammers pad their messages with generated entropy (random filler text), and SpamAssassin-style solutions do a pretty good job but generate false positives -- so there goes pattern recognition of any kind. Economic and legal solutions are not feasible, since we don't know who the spammers are in the first place. So we're back to square one.

There are two essential problems with spam: (1) we can't tell whether the person who sent a message is a spammer just by looking at the sender's e-mail address, and (2) even if we do recognize the address, it can be spoofed.

My proposed solution is essentially a combination of an automated white-list mechanism (for #1) and a sender authentication mechanism (for #2). The beauty is that it wouldn't disrupt e-mail as we know it in the process.

Fix for problem #1: use a known list of e-mail addresses and filter out everything else. "Everything else" gets an automated reply directing the sender to a URL that presents a CAPTCHA (search Google if you don't know what that is); if the sender passes it, their address is added to the "valid" addresses. This has been done, and it works pretty well -- except it doesn't detect spoofs.
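To make the whitelist step concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the class name WhitelistFilter, the method names, and the URL https://mail.example/verify are all made up for this example and do not come from any existing product.

```python
from dataclasses import dataclass, field


@dataclass
class WhitelistFilter:
    """Toy whitelist with a CAPTCHA challenge for unknown senders."""

    challenge_url: str  # hypothetical address of the CAPTCHA page
    allowed: set = field(default_factory=set)

    def receive(self, sender: str) -> str:
        """Return 'deliver' for known senders, else a challenge reply."""
        if sender.lower() in self.allowed:
            return "deliver"
        # Unknown sender: filter the message and point them at the CAPTCHA.
        return f"challenge: {self.challenge_url}"

    def captcha_passed(self, sender: str) -> None:
        """Called by the (hypothetical) CAPTCHA page once the test is solved."""
        self.allowed.add(sender.lower())


if __name__ == "__main__":
    wf = WhitelistFilter("https://mail.example/verify", {"friend@example.org"})
    print(wf.receive("friend@example.org"))    # known sender
    print(wf.receive("stranger@example.net"))  # unknown sender gets challenged
    wf.captcha_passed("stranger@example.net")
    print(wf.receive("stranger@example.net"))  # now on the whitelist
```

Note that nothing here checks whether the sender address is genuine, which is exactly the spoofing gap mentioned above.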

Fix for problem #2: whenever you send an e-mail, your (outgoing) server registers a unique ID for that message internally (generating the ID is common practice; I don't know whether the IDs are also stored somewhere on the server). Whenever the recipient's (incoming) server receives a message, it holds it in limbo for a configurable maximum duration (probably some 2-3 minutes) and sends a request-for-confirmation message back to your server. This message would contain a header tag identifying it as a request for confirmation (e.g. X-Message-Type: request for confirmation), along with the message ID -- this way, servers that "know" about the scheme won't forward these messages to you, and will know how to deal with them.

Your ("knowing") server then has to send the confirmation itself, which is another message, identified as a confirmation (e.g. X-Message-Type: confirmation), along with the original message ID. If a spammer sends a message to your spouse using your e-mail address, your server instead sends out an infirmation message (e.g. X-Message-Type: infirmation) along with the message ID.

Depending on whether the message was confirmed, infirmed, or the confirmation timed out, the recipient's mail server would add a new header tag (e.g. X-Message-Confirmation: confirmed, or "infirmed", or "timed out"). After this, the message would finally be delivered to the receiver's MUA (Mail User Agent, like Mozilla or Outlook). The MUA would decide what to do next, based on the receiver's configuration (keep it, discard it, store it in a folder for later review, whatever).
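The confirmation exchange can be sketched in a few lines of Python. This is a rough in-memory simulation, not real SMTP: the names SendingServer, answer and tag_incoming are invented for the example, messages are plain dicts of headers, and passing None for the server stands in for "no reply arrived within the hold window".

```python
from typing import Optional


class SendingServer:
    """Outgoing server that remembers the IDs of messages it really sent."""

    def __init__(self) -> None:
        self.sent_ids: set = set()

    def send(self, message_id: str) -> None:
        # Register the ID at send time so it can be confirmed later.
        self.sent_ids.add(message_id)

    def answer(self, request: dict) -> dict:
        """Reply to a request for confirmation with confirmation/infirmation."""
        message_id = request["Message-ID"]
        kind = "confirmation" if message_id in self.sent_ids else "infirmation"
        return {"X-Message-Type": kind, "Message-ID": message_id}


def tag_incoming(headers: dict, claimed_server: Optional[SendingServer]) -> dict:
    """Recipient side: ask the claimed sender's server, then tag the message.

    claimed_server=None models a server that never answers in time.
    """
    request = {
        "X-Message-Type": "request for confirmation",
        "Message-ID": headers["Message-ID"],
    }
    reply = claimed_server.answer(request) if claimed_server is not None else None
    if reply is None:
        status = "timed out"
    elif reply["X-Message-Type"] == "confirmation":
        status = "confirmed"
    else:
        status = "infirmed"
    # Return a copy of the headers with the verdict added for the MUA.
    return {**headers, "X-Message-Confirmation": status}


if __name__ == "__main__":
    server = SendingServer()
    server.send("<1@example.org>")
    print(tag_incoming({"Message-ID": "<1@example.org>"}, server))
    print(tag_incoming({"Message-ID": "<forged@example.org>"}, server))
    print(tag_incoming({"Message-ID": "<1@example.org>"}, None))
```

The MUA would then filter on X-Message-Confirmation however the receiver likes; the sketch deliberately leaves that policy out.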

If bandwidth is your concern, don't worry. The number of messages exchanged for each legitimate e-mail would triple, it's true, but the two confirmation/infirmation messages are really small. And if you factor in that this might be the end of spam as we know it, I think the balance is definitely on the plus side.
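As a back-of-the-envelope check of the bandwidth argument, the following Python snippet compares bytes rather than message counts. The sizes used (a 20,000-byte message and 500-byte control messages) are purely illustrative assumptions, not measurements, and the function name traffic_overhead is made up for the example.

```python
def traffic_overhead(message_bytes: int, control_bytes: int) -> float:
    """Ratio of total bytes (message + request + reply) to the message alone."""
    return (message_bytes + 2 * control_bytes) / message_bytes


if __name__ == "__main__":
    # Illustrative sizes only: one 20,000-byte e-mail plus two 500-byte
    # control messages (the request and the confirmation/infirmation).
    print(traffic_overhead(20_000, 500))
```

Under those assumed sizes the message count triples while the byte count grows by about 5%; the ratio gets worse for very short messages.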

If the programming effort is your concern, then it shouldn't be: this solution should be very simple for all major mail server software providers to implement.

The thing is effective on more than one level: your critical business partners will probably install a new version of the mail server software, so you will get the critical messages confirmed pretty soon; once 10% of the servers use this mechanism, the remaining 90% will quickly follow because the request-for-confirmation messages will get annoying really soon (a server which doesn't know how to deal with them will actually forward the request-for-confirmation message to the sender himself).

I welcome absolutely any messages that criticize this idea; that way, if it proves to have holes, we could hopefully reach a better solution.

This is the first time I actually hope my idea is baked.
-- gutza, Oct 24 2004

Wikipedia: Sender ID http://en.wikipedia.org/wiki/Sender_ID
[jutta, Oct 24 2004]

Meng Weng Wong: SPF http://spf.pobox.com/
[jutta, Oct 24 2004]

Microsoft: Caller ID http://www.microsof...pamTechVisionPR.asp
[jutta, Oct 24 2004]

"Just a quick reminder, even if you send mail from different computers, or using a laptop to send messages from different places, all legitimate messages generally go through the same outgoing mail server."

That turns out to not be true for the general case.

There are a number of approaches out there that try to authenticate the sending server, for example SPF (Meng Weng Wong), DomainKeys (Yahoo), and Sender ID (IETF MARID).

"The thing is effective on more than one level: it bombards spammers back with "request for confirmation" messages (bad for their Internet bill)"

Actually, the requests for confirmation would go to the fake sender address (the same place where the bounce messages go today).
-- jutta, Oct 24 2004


"That turns out to not be true for the general case." Why do you say that, one typically connects to the same POP/IMAP server to send mail, isn't that true?
-- gutza, Oct 24 2004


One does not connect to a POP/IMAP server to send mail. One connects to an SMTP server.

SMTP's store-and-forward nature has caused architectures to evolve (I'm phrasing this carefully here) that do all kinds of weird things. There is e-mail that gets changed in interesting ways, forwarded in places that don't quite know who the sender is, digested, resent, expanded, mailed to aliases - it's very easy to come up with spam stopper approaches that work for the private ISP customer with one mail server, one IMAP/POP server, and a private key, but very, very difficult to come up with something that works for everybody.

The existing sender domain authentication approaches all try to address this in various ways, and it may be interesting for you to read up on the details there.
-- jutta, Oct 24 2004


Correct, it's SMTP, but it's still the same server as far as I know -- otherwise you're connecting to open relays of sorts, no?
-- gutza, Oct 24 2004


No. For example, I move back and forth between two different mail servers that know me and forward my messages. They're not open relays; they just know me.

A travelling businessperson might use the mail server of their company when they're at work, a different server when they're logged in from home, and the server of their hotel when they're travelling.
-- jutta, Oct 24 2004


Thanks for the links and the explanations, they all make sense. I've been reading about various proposed solutions here and there, but I hadn't really looked into the matter very deeply so far, so I basically didn't know where to look for the current progress. So, thank you!
-- gutza, Oct 24 2004


[gutza], you might also like to look at FOAF identification for a social network authentication idea.
-- neilp, Oct 25 2004


Jutta,

If the problem with stopping spam rests with SMTP's architecture, would a webmail-only, spam-proof system be possible?
-- doomsayer, Aug 23 2008


