This is a (long overdue) reply to
Ilya's post:
SMPT -- Time to chuck it.
I'm going to quote it here, and reply to everything in it. Whenever I say
"you," I mean Ilya. So, with that said, let's get started.
E-mail, in particular SMTP (Simple Mail Transfer Protocol) has become an
integral part of our lives, people routinely rely on it to send files, and
messages. At the inception of SMTP the Internet was only accessible to a
relatively small, close nit community; and as a result the architects of
SMTP did not envision problems such as SPAM and sender-spoofing. Today, as
the Internet has become more accessible, scrupulous people are making use of
flaws in SMTP for their profit at the expense of the average Internet user.
Alright, this is pretty much the only thing I agree with.
There have been several attempts to bring this ancient protocol in-line with
the current society but the problem of spam keeps creeping in. At first
people had implemented simple filters to get rid of SPAM but as the sheer
volume of SPAM increased mere filtering became impractical, and so we saw
the advent of adaptive SPAM filters which automatically learned to identify
and differentiate legitimate email from SPAM. Soon enough the spammers
caught on and started embedding their ads into images where they could not
be easily parsed by spam filters.
A history lesson...still fine.
AOL (America On Line) flirted with other ideas to control spam, imposing
email tax on all email which would be delivered to its user. It seems like
such a system might work but it stands in the way of the open principles
which have been so important to the flourishing of the internet.
AOL (I believe Microsoft had a similar idea) really managed to think of
something truly repulsive. The postal system in the USA didn't always work
the way it does today. A long time ago, the recipient paid for the
delivery. AOL's idea seems a lot like that.
There are two apparent problems at the root of the SMTP protocol which allow
for easy manipulation: lack of authentication and sender validation, and
lack of user interaction. It would not be difficult to design a more
flexible protocol which would allow for us to enjoy the functionality that
we are familiar with all the while address some, if not all of the problems
within SMTP.
To allow for greater flexibility in the protocol, it would first be broken
from a server-server model into a client-server model.
This is the first point I 100% disagree with...
That is, traditionally when one would send mail, it would be sent to a
local SMTP server which would then relay the message onto the next server
until the email reached its destination. This approach allowed for email
caching and delayed-send (when a (receiving) mail server was off-line for
hours (or even days) on end, messages could still trickle through as the
sending server would try to periodically resend the messages.) Todays mail
servers have very high up times and many are redundant so caching email
for delayed delivery is not very important.
"Delayed delivery is not very important"?! What? What happened to the whole
"better late than never" idiom?
It is not just about uptime of the server. There are other variables one
must consider when thinking about the whole system of delivering email.
Here's a short list; I'm sure I'm forgetting something:
- server uptime
- server reliability
- network connection (all the routers between the server and the "source")
uptime
- network connection reliability
It does little to no good if the network connection is flakey. Ilya is
arguing that that's rarely the case, and while I must agree that it isn't as
bad as it used to be back in the 80's, I also know from experience that
networks are very fragile and it doesn't take much to break them.
A couple of times over the past few years, I noticed that my ISP's routing
tables got screwed up. Within two hours of such a screwup, things returned
to normal, but that's 2 hours of "downtime."
Another instance of a network going haywire: one day, at
Stony Brook University, the internet
connection stopped working. Apparently, a compromised machine on the
university campus caused a campus edge device to become overwhelmed. This
eventually led to a complete failure of the device. It took almost a day
until the compromised machine got disconnected, the failed device reset, and
the backlog of all the traffic on both sides of the router settled down.
Failures happen. Network failures happen frequently. More frequently than I
would like them to, more frequently than the network admins would like them
to. Failures happen near the user, far away from the user. One can hope
that dynamic routing tables keep the internet as a whole functioning, but
even those can fail. Want an example? Sure. Not that long ago, the
well-known video repository YouTube
disappeared off the face of the Earth...well, to some degree. As this
RIPE NCC RIS case study
shows, on February 24, 2008, Pakistan Telecom decided to announce BGP routes
for YouTube's IP range. The result was that if you tried to access any of
YouTube's servers on the 208.65.152.0/22 subnet, your packets were directed
to Pakistan. For about an hour and twenty minutes that was the case. Then
YouTube started announcing more granular subnets, diverting some of the
traffic back to itself. Eleven minutes later, YouTube announced even more
granular subnets, diverting the bulk of the traffic back to itself. A few
dozen minutes later, PCCW Global (Pakistan Telecom's provider responsible
for forwarding the "offending" BGP announcements to the rest of the world)
stopped forwarding the incorrect routing information.
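
Why does announcing "more granular subnets" pull traffic back? Routers
prefer the most specific prefix that matches a destination. Here is a minimal
sketch of that rule using Python's ipaddress module. It is a simplification
(real BGP best-path selection considers much more), and the /24 and /25
prefixes and the owner labels are my own illustrative assumptions; only the
/22 comes from the text above.

```python
import ipaddress

# Toy routing table: prefix -> who receives the traffic.
# Only the /22 is from the text; the /24 is an illustrative assumption.
routes = {
    ipaddress.ip_network("208.65.152.0/22"): "legitimate origin",
    ipaddress.ip_network("208.65.153.0/24"): "hijacker",
}

def best_route(address, table):
    """Return the owner of the longest (most specific) matching prefix."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]

# The more specific /24 wins over the covering /22.
print(best_route("208.65.153.10", routes))

# Announcing an even more specific prefix diverts that part back.
routes[ipaddress.ip_network("208.65.153.0/25")] = "legitimate origin"
print(best_route("208.65.153.10", routes))
```

Addresses covered only by the hijacker's prefix stay diverted until the bad
announcement is withdrawn, which is why the cleanup took more than one step.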
So, networks are fragile, which is why having an email transfer protocol
that allows for retransmission is a good idea.
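
To show what I mean by retransmission, here is a rough sketch of a
store-and-forward queue: a message stays queued and is retried until it is
delivered or gets too old. The class name, the retry interval, and the
maximum age are all assumptions I made up for illustration; this is not how
any particular mail server implements its queue.

```python
import heapq

class RetryQueue:
    """Keep each message until it is delivered or it expires."""

    def __init__(self, retry_interval=1800, max_age=5 * 24 * 3600):
        # Both defaults (in seconds) are illustrative assumptions.
        self.retry_interval = retry_interval
        self.max_age = max_age
        self._heap = []  # (next_attempt_time, sequence, queued_at, message)
        self._seq = 0

    def enqueue(self, message, now):
        heapq.heappush(self._heap, (now, self._seq, now, message))
        self._seq += 1

    def run_due(self, now, deliver):
        """Attempt every due message; return (delivered, bounced) lists."""
        delivered, bounced, retry_later = [], [], []
        while self._heap and self._heap[0][0] <= now:
            _, seq, queued_at, message = heapq.heappop(self._heap)
            if deliver(message):
                delivered.append(message)
            elif now - queued_at >= self.max_age:
                bounced.append(message)  # give up and report failure
            else:
                retry_later.append(
                    (now + self.retry_interval, seq, queued_at, message))
        for entry in retry_later:
            heapq.heappush(self._heap, entry)
        return delivered, bounced

# Simulate a two-hour outage: the first attempt fails, a later retry succeeds.
link = {"up": False}
queue = RetryQueue()
queue.enqueue("status report", now=0)
print(queue.run_due(0, lambda m: link["up"]))     # message stays queued
link["up"] = True
print(queue.run_due(7200, lambda m: link["up"]))  # delivered after the outage
```

A client that talks directly to the recipient's server would have to do all
of this itself, and stay powered on and connected while it does.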
Instead, having direct communication between the sender-client and the
receiver-server has many advantages: opens up the possibility for CAPTCHA
systems, makes the send-portion of the protocol easier to upgrade, and
allows for new functionality in the protocol.
Wow. So much to disagree with!
- CAPTCHA doesn't work
- What about mailing lists? How does the mailing list server answer the
CAPTCHAs?
- How does eliminating server-to-server communication make the protocol
easier to upgrade?
- New functionality is a nice thing in theory, but what do you want from
your mail transfer protocol? I, personally, want it to
transfer my email between where I send it from and where it is
supposed to be delivered to.
- If anything, eliminating the server-to-server communication would cause
the MUAs to be "in charge" of the protocols. This means that at
first there would be many competing protocols, until one takes over
- not necessarily the better one (Betamax vs. VHS comes to
mind).
- What happens in the case of overzealous firewall admins? What if I
really want to send email to bob@example.com, but the firewall (for
whatever reason) is blocking all traffic to example.com?
Spam is driven by profit, the spammers make use of the fact that it is cheap
to send email. Even the smallest returns on spam amount to good money. By
making it more expensive to send spam, it would be phased out as the returns
become negative. Charging money like AOL tried, would work; but it is not a
good approach, not only does it not allow for senders anonymity but also it
rewards mail-administrators for doing a bad job (the more spam we deliver
the more money we make).
Yes, it is unfortunately true, money complicates things. Money tends to be
the reason why superior design fails to take hold, and instead something
inferior wins - think Betamax vs. VHS. This is why I think something similar
would happen with competing mail transfer protocols - the one with most
corporate backing would win, not the one that's best for people.
Another approach is to make the sender interact with the recipient mail
server by some kind of challenge authentication which is hard to compute
for a machine but easy for a human, a Turing test. For example the
recipient can ask the senders client to verify what is written on an
obfuscated image (CAPTCHA) or what is being said on a audio clip, or both
so as to minimize the effect on people with handicaps.
Nice thought about the handicapped, but you are forgetting that only 800-900
million people speak English (see
Wikipedia).
That is something on the order of 13-15 percent. Sorry, but "listening
comprehension" tests are simply not viable.
Obfuscated image CAPTCHAs
are "less" of a problem, but then again, one should consider the blind. I am
not blind, and as a matter of fact my vision is still rather good (even
after years of staring at computer screens), but at times I'm not sure what
those "distorted text" CAPTCHAs are even displaying. I can't even begin to
imagine what it must be like for anyone with poor vision.
You seem to be making the assumption that most if not all legitimate email
comes from humans. While that may be true for your average home user, let's
not forget that email is used by more technical people as well. These people
will, and do, use email in creative ways. For example, take me...I receive
lots of emails that are generated by all sorts of scripts that I wrote over
time. These emails give me status of a number of systems I care about, and
reminders about upcoming events. All in all, you could say that I live
inside email. You can't do a CAPTCHA for the process sending the automated
email (there's no human sending it), and if you do the CAPTCHA for the
receiving, you're just adding a "click here to display the message" wart to
the mail client software user interface.
Just keep in mind that all those automated emails you get from "root" or
even yourself were sent without a human answering a CAPTCHA.
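
For the curious, this is roughly what such a script looks like. It is a
sketch, not one of my actual scripts: the function names, the addresses, and
the assumption that a mail relay is listening on localhost are all made up
for illustration. Note that no human appears anywhere in it.

```python
import smtplib
from email.message import EmailMessage

def build_status_mail(host, load, recipient="me@example.com"):
    """Compose a status report; the recipient address is a placeholder."""
    msg = EmailMessage()
    msg["From"] = f"root@{host}"
    msg["To"] = recipient
    msg["Subject"] = f"[status] {host}: load {load:.2f}"
    msg.set_content(f"Load average on {host} is {load:.2f}.\n")
    return msg

def send(msg, relay="localhost"):
    """Hand the message to an SMTP relay (assumed to be running locally)."""
    with smtplib.SMTP(relay) as smtp:
        smtp.send_message(msg)

# Building the message needs no network; send() would be called from cron.
report = build_status_mail("backup.example.com", 0.42)
print(report["Subject"])
```

There is no point in this flow where a CAPTCHA could be answered, short of
waking me up to solve it on the script's behalf.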
It would be essential to also white list senders so that they do not have
to preform a user-interactive challenge to send the email, such that mail
from legitimate automated mass senders would get through (and for that
current implementation of sieve scripts could be used). In this system,
if users were to make wide use of filters, we would soon see a problem. If
nearly everyone has a white list entry for Bank Of America what is to
prevent a spammer to try to impersonate that bank?
White listing is really annoying, and as you point out, it doesn't work.
And so this brings us to the next point, authentication, how do you know
that the email actually did, originate from the sender. This is one of the
largest problems with SMTP as it is so easy to fake ones outgoing email
address. The white list has to rely on a verifiable and consistent flag in
the email. A sample implementation of such a control could work similar
to the current hack to the email system, SPF, in which a special entry is
made in the DNS entry which says where the mail can originate from. While
this approach is quite effective in a sever-server architecture it would
not work in a client-server architecture. Part of the protocol could
require the sending client to send a cryptographic-hash of the email to
his own receiving mail server, so that the receiving party's mail server
could verify the authenticity of the source of the email. In essence this
creates a 3 way handshake between the senders client, the senders
(receiving) mail server and the receiver's mail server.
I tend to stay away from making custom authentication protocols.
In this scheme, what guarantees you that the client and his "home server"
aren't both trying to convince the receiving server that the email is really
from whom they say it is? In Kerberos, you have a key for each system, and a
password for each user. The Kerberos server knows it all, and this central
authority is why things work. With SSL certificates, you rely on the
strength of the crypto used, as well as blind faith in the certificate
authority.
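
To make the objection concrete, here is a minimal sketch of the hash check as
I understand the proposal. Every name in it (HomeServer, digest_of,
receiver_accepts) and the choice of SHA-256 are my assumptions; the proposal
does not specify any of this.

```python
import hashlib

class HomeServer:
    """The sender's own (receiving) mail server, which stores message hashes."""

    def __init__(self, domain):
        self.domain = domain
        self._hashes = set()

    def register(self, digest):
        self._hashes.add(digest)

    def confirms(self, digest):
        return digest in self._hashes

def digest_of(sender, body):
    """Hash the sender address and the body (SHA-256 is an assumption)."""
    return hashlib.sha256(f"{sender}\n{body}".encode("utf-8")).hexdigest()

def receiver_accepts(sender, body, home_server):
    """The receiving server asks the sender's home server about the hash."""
    return home_server.confirms(digest_of(sender, body))

# An honest sender passes the check...
home = HomeServer("example.org")
home.register(digest_of("alice@example.org", "Lunch?"))
print(receiver_accepts("alice@example.org", "Lunch?", home))

# ...and so does a spammer who runs both the client and its home server.
spam_home = HomeServer("spam.example")
spam_home.register(digest_of("bank@spam.example", "Click here"))
print(receiver_accepts("bank@spam.example", "Click here", spam_home))
```

Both checks succeed. The hash only proves that the client and its home
server agree with each other, not that either of them deserves any trust.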
At first it might seem that this process uses up more bandwidth and
increases the delay of sending mail but one has to remember that in usual
configuration of sending email using IMAP or POP for mail storage one
undergoes a similar process,
Umm...while possible, I believe that the vast majority of email is
sent via SMTP (and I'm not even counting all the spam).
first email is sent for storage (over IMAP or POP) to the senders mail
server and then it is sent over SMTP to the senders email for redirection
to the receivers mail server. It is even feasible to implement hooks in
the IMAP and POP stacks to talk to the mail sending daemon directly
eliminating an additional socket connection by the client.
Why would you want to stick with IMAP and POP? They do share certain ideas
with SMTP.
For legitimate mass mail this process would not encumber the sending
procedure as for this case the sending server would be located on the same
machine as the senders receiving mail server (which would store the hash for
authentication), and they could even be streamlined into one monolithic
process.
Not necessarily. There are entire businesses that specialize in mailing list
maintenance. You pay them, and they give you an account with software that
maintains your mailing list. Actually, it's amusing how similar it is to
what spammers do. The major difference is that in the legitimate case, the
customer supplies their own list of email addresses to mail. Anyway, my point
is, in these cases (and they are more common than you think) the mailing
sender is on a different computer than the "from" domain's MX record.
Some might argue that phasing out SMTP is a extremely radical idea, it has
been an essential part of the internet for 25 years.
Radical? Sure. But my problem is that there is no replacement. All the ideas
you have listed have multiple problems - all of which have been identified
by others. And so here we are, no closer to the solution.
But then, when is the right time to phase out this archaic and obsolete
protocol, or do we commit to use it for the foreseeable future. Then
longer we wait to try to phase it out the longer it will take to adopt
something new. This protocol should be designed with a way to coexist with
SMTP to get over the adoption curve, id est, make it possible for client
to check for recipients functionality, if it can accept email by this new
protocol then send it by it rather than SMTP.
Sounds great! What is this protocol that would replace SMTP? Oh, right,
there isn't one.
The implementation of such a protocol would take very little time, the
biggest problem would be with adoption.
Sad, but true.
The best approach for this problem is to entice several large mail
providers (such as Gmail or Yahoo) to switch over. Since these providers
handle a large fraction of all mail the smaller guys (like myself) would
have to follow suit.
You mentioned Gmail...well, last I heard, Gmail's servers were acting as
open proxies. Congratulations! One of your example "if they switch things
will be better" email providers is allowing the current spam problem to go
on. I guess that makes you right. If Gmail were to use a protocol that
didn't allow for spam to exist, then things would be better.
There is even an incentive for mail providers to re-implement mail
protocol, it would save them many CPU-cycles since Bayesian-spam-filters
would no longer be that important.
What about generating all those CAPTCHAs you suggested? What about hashing
all those emails? Neither activity is free.
By creating this new protocol we would dramatically improve an end users
experience online, as there would be fewer annoyances to deal with.
Hopefully alleviation of these annoyances will bring faster adoption of
the protocol.
I really can't help but read that as "If we use this magical protocol that
will make things better, things will get better!" Sorry, but unless I see
some protocol which would be a good candidate, I will remain sceptical.
As a side note, over the past ~90 days, I received about 164MB of spam that
SpamAssassin caught and
procmail promptly shoved into the
spam mail box. Do I care? Not enough to jump on the "let's reinvent the
email system" bandwagon. Sure, it eats up some of my server's clock cycles
and some bandwidth, but the only spam that actually reaches me is the few
pieces that manage to get through and show up in my inbox. Would I be happy
if I didn't have
to use a spam filter, and not have to delete the few random spams by hand?
Sure, but at least for the moment, I don't see a viable alternative.
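
For reference, the filing step is tiny. SpamAssassin marks a message it
considers spam with an "X-Spam-Status: Yes, ..." header, and the delivery
step only has to look at it. My real setup does this with a procmail recipe;
the Python below is just a sketch of the same decision, and the function
name and mailbox names are assumptions.

```python
from email import message_from_string

def pick_mailbox(raw_message):
    """Return "spam" if SpamAssassin tagged the message, else "inbox"."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    return "spam" if status.strip().lower().startswith("yes") else "inbox"

tagged = (
    "From: someone@example.com\n"
    "X-Spam-Status: Yes, score=12.3 required=5.0\n"
    "Subject: Buy now\n"
    "\n"
    "Limited offer!\n"
)
clean = (
    "From: friend@example.com\n"
    "X-Spam-Status: No, score=-1.0 required=5.0\n"
    "Subject: Lunch?\n"
    "\n"
    "Noon works for me.\n"
)
print(pick_mailbox(tagged), pick_mailbox(clean))
```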