Josef “Jeff” Sipek

Time-based One-time Passwords

Recently, I ended up playing with Time-based One-time Passwords (TOTP) as a second factor when authenticating with various services. When I saw an RFC referenced in the references section, I looked at it to get a better idea of how complicated the algorithm really is. It turns out that TOTP is very simple. So simple that I couldn’t help but put together a quick and dirty implementation in Python.

TOTP itself is documented in RFC 6238. It is a rather short RFC, but that’s because all it really says is “use HOTP and feed it these values”.

HOTP is documented in RFC 4226. This RFC is a bit longer since it has to describe how the counter value gets hashed and how the resulting digest gets mangled. Reading it, one will learn that HMAC-SHA1 is the basic building block of HOTP.

HMAC is documented in RFC 2104.

With these three documents (and a working implementation of SHA1), it is possible to implement your own TOTP.

The Key

If you follow those three RFCs, you’ll have a working TOTP implementation. However, that’s not enough to make use of the code. The whole algorithm is predicated on having a pre-shared secret—a key. Typically, the service you are enabling TOTP for will issue you a key, and you have to feed it into the algorithm to start generating passwords. Since showing the user the key in binary is not feasible, some sort of encoding is needed.

I couldn’t find any RFC that documents best practices for sharing the key with the user. After a while, I found a wiki page describing the key URI format used by Google Authenticator.

It turns out that this is a very common format. It uses a base32 encoding with the padding stripped. (Base32 is documented in RFC 4648.)

The “tricky” part is recreating this padding to make the decoder happy. Since base32 works on 40-bit groups (it converts between 5 raw bytes and 8 base-32 chars), we must pad to the nearest 40-bit group.

The Code

I tried to avoid implementing HMAC-SHA1, but I couldn’t find it in any of the modules Python ships with. Since it is a simple enough algorithm, I implemented it as well. Sadly, it nearly doubles the size of the code.

Warning: This is proof-of-concept quality code. Do not use it in production.

import struct
import hashlib
import base64
import time

# The pre-shared secret (base32 encoded):
key = "VGMT4NSHA2AWVOR6"

def HMAC(k, data, B=64):
    def H(m):
        return hashlib.sha1(m).digest()

    # keys too long get hashed
    if len(k) > B:
        k = H(k)

    # keys too short get padded
    if len(k) < B:
        k = k + ("\x00" * (B - len(k)))

    ikey = "".join([chr(ord(x) ^ 0x36) for x in k])
    okey = "".join([chr(ord(x) ^ 0x5c) for x in k])

    return H(okey + H(ikey + data))

def hotp(K, C, DIGITS=6):
    def Truncate(inp):
        # dynamic truncation (RFC 4226): the low nibble of the last
        # byte of the digest selects which 4 bytes to extract
        off = ord(inp[19]) & 0xf

        x = [ord(x) for x in inp[off:(off+4)]]

        return ((x[0] & 0x7f) << 24) | (x[1] << 16) | (x[2] << 8) | x[3]

    return Truncate(HMAC(K, struct.pack(">Q", C))) % (10 ** DIGITS)

def totp(K, T=None, X=30, T0=0, DIGITS=6):
    if T is None:
        T = time.time()

    # number of X-second time steps since T0
    return hotp(K, long(T - T0) / long(X), DIGITS=DIGITS)

# pad to the nearest 40-bit group
if len(key) % 8 != 0:
    key=key + ("=" * (8 - (len(key) % 8)))

key=base64.b32decode(key.upper())

print time.ctime(), time.time()
print "TOTP: %06d" % totp(key)

This code is far from optimal, but I think it nicely demonstrates the simplicity of TOTP.

References

  • RFC 6238: TOTP: Time-Based One-Time Password Algorithm
  • RFC 4226: HOTP: An HMAC-Based One-Time Password Algorithm
  • RFC 2104: HMAC: Keyed-Hashing for Message Authentication
  • RFC 4648: The Base16, Base32, and Base64 Data Encodings
  • Google Authenticator key URI format wiki page

GNU inline vs. C99 inline

Recently, I’ve been looking at inline functions in C. However, instead of just the usual static inlines, I’ve been looking at all the variants. This used to be a pretty straightforward GNU C extension, and then C99 introduced the inline keyword officially. Sadly, for whatever reason, the C99 committee decided that the semantics would be just different enough to confuse me and everyone else.

GCC documentation has the following to say:

GCC implements three different semantics of declaring a function inline. One is available with -std=gnu89 or -fgnu89-inline or when gnu_inline attribute is present on all inline declarations, another when -std=c99, -std=c11, -std=gnu99 or -std=gnu11 (without -fgnu89-inline), and the third is used when compiling C++.

Dang! Ok, I don’t really care about C++, so there are only two ways inline can behave.

Before diving into the two different behaviors, there are two cases to consider: the use of an inline function, and the inline function itself. The good news is that the use of an inline function behaves the same in both C90 and C99. Where the behavior changes is how the compiler deals with the inline function itself.

After reading the GCC documentation and skimming the C99 standard, I have put it all into the following table. It lists the different ways of using the inline keyword and for each use whether or not a symbol is produced in C90 (with inline extension) and in C99.

               Emit (C90)    Emit (C99)
inline         always        never
static inline  maybe         maybe
extern inline  never         always

(“always” means that a global symbol is always produced regardless of whether all the uses of it are inlined. “maybe” means that a local symbol will be produced if and only if some uses cannot be inlined. “never” means that no symbols are produced and any non-inlined uses will be dealt with via relocations to an external symbol.)

Note that C99 “switched” the meaning of inline and extern inline. The good news is, static inline is totally unaffected (and generally the most useful).

For whatever reason, I cannot ever remember this difference. My hope is that this post will help me in the future.

Trying it Out

We can verify this experimentally. We can compile the following C file with -std=gnu89 and -std=gnu99 and compare what symbols the compiler produces:

static inline void si(int x)
{
}

extern inline void ei(int x)
{
}

inline void i(int x)
{
}

And here’s what nm has to say about them:

test-gcc89:
00000000 T i

test-gcc99:
00000000 T ei

This is an extremely simple example where the “never” and “maybe” cases all skip generating a symbol. In a more involved program that has inline functions that use features of C that prevent inlining (e.g., VLAs), we would see either relocations to external symbols or local symbols.
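
For example, one easy way to force non-inlined uses in the test file above is to take the functions’ addresses (a sketch only; the array name fns is made up for illustration):

/* appended to the test file above: taking the addresses forces
 * each function to be either emitted or referenced externally */
void (*fns[3])(int) = { si, ei, i };

With that line added, compiling with -std=gnu89 should produce a local symbol for si and an undefined reference for ei, while -std=gnu99 should instead leave i as the undefined reference, matching the “maybe” and “never” columns of the table.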

Generating Random Data

Over the years, there have been occasions when I needed to generate random data to feed into whatever system. At times, simply using /dev/random or /dev/urandom was sufficient. At other times, I needed to generate random data at a rate that exceeded what I could get out of /dev/random. This morning, I read Chris’s blog entry about his need for generating lots of random data. I decided that I should write up my favorite approach so that others can benefit.

The approach is very simple. There are two phases. First, we set up our own random pool. Second, we use the random pool. I am going to use an example throughout the rest of this post. Suppose that we want to make repeated 128 kB writes to a block device and we want the data to be random so that the device can’t do anything clever (e.g., compress or dedup). Say that during this benchmark we want to write out 64 GB total. (In other words, we will issue 524288 writes.)

Setup Phase

During the setup phase, we create a pool of random data. The easiest way is to just read /dev/urandom. Here, we want to read enough data so that the pool is large enough. For our 128kB write example, we’d want at least 1 MB. (I’ll explain the sizing later. I would probably go with something like 8 MB because unless I’m on some sort of limited system, the extra 7 MB of RAM won’t be missed.)
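
In C, the setup might look something like this (a sketch only; setup_pool is a made-up name, error handling is minimal, and the 1 MB pool size matches the example):

#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define POOL_SIZE	(1024 * 1024)

static char pool[POOL_SIZE];

/* fill the pool once, at startup, from the kernel's RNG */
static void setup_pool(void)
{
	int fd = open("/dev/urandom", O_RDONLY);
	size_t done = 0;

	if (fd < 0)
		abort();

	/* /dev/urandom may return short reads; loop until the pool is full */
	while (done < POOL_SIZE) {
		ssize_t ret = read(fd, pool + done, POOL_SIZE - done);

		if (ret <= 0)
			abort();

		done += ret;
	}

	close(fd);
}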

“Using the Pool” Phase

Now that we have the pool, we can use it to generate random buffers. The basic idea is to pick a random offset into the pool and just grab the bytes starting at that location. In our example, we’d pick a random offset between zero and pool size minus 128 kB, and use the 128 kB at that offset.

In pseudo code:

#define BUF_SIZE	(128 * 1024)
#define POOL_SIZE	(1024 * 1024)

static char pool[POOL_SIZE];

char *generate()
{
	return &pool[rand() % (POOL_SIZE - BUF_SIZE)];
}

That’s it! You can of course make it more general and let the caller tell you how many bytes they want:

#define POOL_SIZE	(1024 * 1024)

static char pool[POOL_SIZE];

char *generate(size_t len)
{
	return &pool[rand() % (POOL_SIZE - len)];
}

It takes a pretty simple argument to show that even a modest-sized pool will be able to generate lots of different random buffers. Let’s say we’re dealing with the 128 kB buffer and 1 MB pool case. The pool can return 128 kB starting at offset 0, or offset 1, or offset 2, … or offset 917503 (1 MB - 128 kB - 1 B). This means that there are 917504 possible outputs. Recall that in our example we were planning on writing out 64 GB in total, which was 524288 writes.

524288 / 917504 ≈ 0.571

In other words, we are planning on using less than 58% of the possible outputs from our 1 MB pool to write out 64 GB of random data! (An 8 MB pool would yield 6.3% usage.)

If the length is variable, the math gets more complicated, but in a way we get even better results (i.e., lower usage) because to generate the same buffer we would need to have the same offset and length. If the caller supplies random (pseudo-random or based on some distribution) lengths, we’re very unlikely to get the same buffer out of the pool.

Observations

Some of you may have noticed that we traded generating 128 kB (or a user supplied length) of random data for generating a random integer. There are two options here: either you can use a fast pseudo-random number generator (like the Mersenne twister), or you can reuse the same pool! In other words:

#define POOL_SIZE	(1024 * 1024)

static char pool[POOL_SIZE];
static size_t *ridx = (size_t *) pool;

char *generate(size_t len)
{
	/* wrap around once we reach the end of the pool */
	if ((uintptr_t) ridx == (uintptr_t) &pool[POOL_SIZE])
		ridx = (size_t *) pool;

	/* use the next word of the pool itself as the random offset */
	return &pool[*ridx++ % (POOL_SIZE - len)];
}

I leave it as an exercise for the reader to either make it multi-thread safe, or to have the caller pass in the index, similarly to how rand_r takes an argument. A sketch of the latter follows.
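
For what it’s worth, a reentrant variant in the style of rand_r could look something like this (a sketch; generate_r and its idxp parameter are made-up names, not an existing API):

#include <stddef.h>

#define POOL_SIZE	(1024 * 1024)

static char pool[POOL_SIZE];

char *generate_r(size_t *idxp, size_t len)
{
	const size_t nwords = POOL_SIZE / sizeof(size_t);

	/* read the next word of the pool, using caller-supplied state */
	size_t off = ((size_t *) pool)[*idxp % nwords];

	/* advance the caller's position; no shared state is modified */
	(*idxp)++;

	return &pool[off % (POOL_SIZE - len)];
}

Since each caller keeps its own index, concurrent callers never race on anything except the read-only pool itself.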

rand_r Considered Harmful

Since we’re on the topic of random number generation, I thought I’d mention an already rather widely known fact — libc’s rand and rand_r implementations are terrible. At one point, I tried using them with this approach to generate random buffers, but it didn’t take very long before I got repeats! Caveat emptor.

Unix Shared Memory

While investigating whether some memory management code was still in use (I’ll blahg about this in the future), I ended up learning quite a bit about shared memory on Unix systems. Since I managed to run into a couple of non-obvious snags while trying to get a simple test program running, I thought I’d share my findings here for my future self.

All in all, there are three ways to share memory between processes on a modern Unix system.

System V shm

This is the oldest of the three. First you call shmget to set up a shared memory segment and then you call shmat to map it into your address space. Here’s a quick example that does not do any error checking or cleanup:

void sysv_shm()
{
        int ret;
        void *ptr;

        ret = shmget(0x1234, 4096, IPC_CREAT);
        printf("shmget returned %d (%d: %s)\n", ret, errno,
               strerror(errno));

        ptr = shmat(ret, NULL, SHM_PAGEABLE | SHM_RND);
        printf("shmat returned %p (%d: %s)\n", ptr, errno, strerror(errno));
}

What’s so tricky about this? Well, by default Illumos’s shmat will return EPERM unless you are root. This sort of makes sense given how this flavor of shared memory is implemented. (Hint: it’s all in the kernel)

POSIX shm

As is frequently the case, POSIX came up with a different interface and different semantics for shared memory. Here’s the POSIX shm version of the above function:

void posix_shm()
{
	int fd;
	void *ptr;

	fd = shm_open("/blah", O_RDWR | O_CREAT, 0666);
	printf("shm_open returned %d (%d: %s)\n", fd, errno,
	       strerror(errno));

	ftruncate(fd, 4096); /* IMPORTANT! */

	ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	printf("mmap returned %p (%d: %s)\n", ptr, errno, strerror(errno));
}

The very important part here is the ftruncate call. Without it, shm_open may create an empty file and mmaping an empty file won’t work very well. (Well, on Illumos mmap succeeds, but you effectively have a 0-length mapping so any loads or stores will result in a SIGBUS. I haven’t tried other OSes.)

Aside from the funny looking path (it must start with a slash, but cannot contain any other slashes), shm_open looks remarkably like the open system call. It turns out that at least on Illumos, shm_open is implemented entirely in libc. The implementation creates a file in /tmp based on the path provided and the file descriptor that it returns is actually a file descriptor for this file in /tmp. For example, “/blah” input translates into “/tmp/.SHMDblah”. (There is a second file “/tmp/.SHMLblah” that doesn’t live very long. I think it is a lock file.) The subsequent mmap call doesn’t have any idea that this file is special in any way.

Does this mean that you can reach around shm_open and manipulate the object directly? Not exactly. POSIX states: “It is unspecified whether the name appears in the file system and is visible to other functions that take pathnames as arguments.”

The big difference between POSIX and SysV shared memory is how you refer to the segment — SysV uses a numeric key, while POSIX uses a path.

mmap

The last way of sharing memory involves no specialized APIs. It’s just plain ol’ mmap on an open file. For completeness, here’s the function:

void mmap_shm()
{
	int fd;
	void *ptr;

	fd = open("/tmp/blah", O_RDWR | O_CREAT, 0666);
	printf("open returned %d (%d: %s)\n", fd, errno, strerror(errno));

	ftruncate(fd, 4096); /* IMPORTANT! */

	ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	printf("mmap returned %p (%d: %s)\n", ptr, errno, strerror(errno));
}

It is very similar to the POSIX shm code example. As before, we need the ftruncate to make the shared file non-empty.

pmap

In case you’ve wondered what SysV or POSIX shm segments look like on Illumos, here’s the pmap output for a process that basically runs the first two examples above.

6343:	./a.out
0000000000400000          8K r-x--  /storage/home/jeffpc/src/shm/a.out
0000000000411000          4K rw---  /storage/home/jeffpc/src/shm/a.out
0000000000412000         16K rw---    [ heap ]
FFFFFD7FFF160000          4K rwxs-    [ dism shmid=0x13 ]
FFFFFD7FFF170000          4K rw-s-  /tmp/.SHMDblah
FFFFFD7FFF180000         24K rwx--    [ anon ]
FFFFFD7FFF190000          4K rwx--    [ anon ]
FFFFFD7FFF1A0000       1596K r-x--  /lib/amd64/libc.so.1
FFFFFD7FFF33F000         52K rw---  /lib/amd64/libc.so.1
FFFFFD7FFF34C000          8K rw---  /lib/amd64/libc.so.1
FFFFFD7FFF350000          4K rwx--    [ anon ]
FFFFFD7FFF360000          4K rwx--    [ anon ]
FFFFFD7FFF370000          4K rw---    [ anon ]
FFFFFD7FFF380000          4K rw---    [ anon ]
FFFFFD7FFF390000          4K rwx--    [ anon ]
FFFFFD7FFF393000        348K r-x--  /lib/amd64/ld.so.1
FFFFFD7FFF3FA000         12K rwx--  /lib/amd64/ld.so.1
FFFFFD7FFF3FD000          8K rwx--  /lib/amd64/ld.so.1
FFFFFD7FFFDFD000         12K rw---    [ stack ]
         total         2120K

You can see that the POSIX shm file got mapped in the standard way (address FFFFFD7FFF170000). The SysV shm segment is special — it is not a plain old memory map (address FFFFFD7FFF160000).

That’s it for today. I’m going to talk about segment types in a different post in the near future.

Designated Initializers

Designated initializers are a neat feature in C99 that I’ve used for about 6 years. I can’t fathom why anyone would not use them if C99 is available. (Of course if you have to support pre-C99 compilers, you’re very sad.) In case you’ve never seen them, consider this example that’s perfectly valid C99:

int abc[7] = {
	[1] = 0xabc,
	[2] = 0x12345678,
	[3] = 0x12345678,
	[4] = 0x12345678,
	[5] = 0xdef,
};

As you may have guessed, indices 1–5 will have the specified values. Indices 0 and 6 will be zero. Cool, eh?
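
For completeness, the same C99 feature also lets you initialize struct members by name (an illustrative example, not from the original post):

struct point {
	int x;
	int y;
	int z;
};

/* y is implicitly zero */
struct point p = {
	.x = 1,
	.z = 3,
};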

GCC Extensions

Today I learned about a neat GNU extension in GCC to designated initializers. Consider this code snippet:

int abc[7] = {
	[1] = 0xabc,
	[2 ... 5] = 0x12345678,
	[5] = 0xdef,
};

Mind blowing, isn’t it?

Beware, however… GCC’s -std=c99 will not error out if you use ranges! You need to throw in -pedantic to get a warning.

$ gcc -c -Wall -std=c99 test.c
$ gcc -c -Wall -pedantic -std=c99 test.c
test.c:2:5: warning: ISO C forbids specifying range of elements to initialize [-pedantic]

nftw(3)

I just found out about nftw — a libc function to walk a file tree. I did not realize that libc had such a high-level function. I bet it’ll end up saving me time (and code) at some point in the future.

int nftw(const char *path, int (*fn) (const char *, const struct stat *,
				      int, struct FTW *), int depth,
	 int flags);

Given a path, it executes the callback for each file and directory in that tree. Very cool.

Ok, as I write this post, I am told that nftw is a great way to write dangerous code. Aside from easily writing dangerous things equivalent to rm -rf, I could see not specifying FTW_PHYS being dangerous, as symlinks will get followed without any notification.

I guess I’ll play around with it a little to see what the right way (if any?) to use it is.
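
For reference, a minimal use might look something like this (a sketch, not a claim about the “right way”; the callback name print_entry is made up, and FTW_PHYS keeps symlinks from being followed):

#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>

static int print_entry(const char *path, const struct stat *sb,
		       int type, struct FTW *ftwbuf)
{
	/* type says what this entry is (FTW_F, FTW_D, FTW_SL, ...) */
	printf("%d %s\n", type, path);

	return 0;	/* returning non-zero stops the walk */
}

int main(int argc, char **argv)
{
	/* 20 is the max number of file descriptors nftw may keep open */
	return nftw(argc > 1 ? argv[1] : ".", print_entry, 20, FTW_PHYS);
}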

SMTP

This is a (long overdue) reply to Ilya’s post: SMPT – Time to chuck it.

I’m going to quote it here, and reply to everything in it. Whenever I say "you," I mean Ilya. So, with that said, let’s get started.

E-mail, in particular SMTP (Simple Mail Transfer Protocol) has become an integral part of our lives, people routinely rely on it to send files, and messages. At the inception of SMTP the Internet was only accessible to a relatively small, close nit community; and as a result the architects of SMTP did not envision problems such as SPAM and sender-spoofing. Today, as the Internet has become more accessible, scrupulous people are making use of flaws in SMTP for their profit at the expense of the average Internet user.

Alright, this is pretty much the only thing I agree with.

There have been several attempts to bring this ancient protocol in-line with the current society but the problem of spam keeps creeping in. At first people had implemented simple filters to get rid of SPAM but as the sheer volume of SPAM increased mere filtering became impractical, and so we saw the advent of adaptive SPAM filters which automatically learned to identify and differentiate legitimate email from SPAM. Soon enough the spammers caught on and started embedding their ads into images where they could not be easily parsed by spam filters.

A history lesson…still fine.

AOL (America On Line) flirted with other ideas to control spam, imposing email tax on all email which would be delivered to its user. It seems like such a system might work but it stands in the way of the open principles which have been so important to the flourishing of the internet.

AOL (I believe Microsoft had a similar idea) really managed to think of something truly repulsive. The postal system in the USA didn’t always work the way it does today. A long time ago, the recipient paid for the delivery. AOL’s idea seems a lot like that.

There are two apparent problems at the root of the SMTP protocol which allow for easy manipulation: lack of authentication and sender validation, and lack of user interaction. It would not be difficult to design a more flexible protocol which would allow for us to enjoy the functionality that we are familiar with all the while address some, if not all of the problems within SMTP.

To allow for greater flexibility in the protocol, it would first be broken from a server-server model into a client-server model.

This is the first point I 100% disagree with…

That is, traditionally when one would send mail, it would be sent to a local SMTP server which would then relay the message onto the next server until the email reached its destination. This approach allowed for email caching and delayed-send (when a (receiving) mail server was off-line for hours (or even days) on end, messages could still trickle through as the sending server would try to periodically resend the messages.) Todays mail servers have very high up times and many are redundant so caching email for delayed delivery is not very important.

"Delayed delivery is not very important"?! What? What happened to the whole "better late than never" idiom?

It is not just about uptime of the server. There are other variables one must consider when thinking about the whole system of delivering email. Here’s a short list; I’m sure I’m forgetting something:

  • server uptime
  • server reliability
  • network connection (all the routers between the server and the "source") uptime
  • network connection reliability

It does little to no good if the network connection is flakey. Ilya is arguing that that’s rarely the case, and while I must agree that it isn’t as bad as it used to be back in the 80’s, I also know from experience that networks are very fragile and it doesn’t take much to break them.

A couple of times over the past few years, I noticed that my ISP’s routing tables got screwed up. Within two hours of such a screwup, things returned to normal, but that’s 2 hours of "downtime."

Another instance of a network going haywire: one day, at Stony Brook University, the internet connection stopped working. Apparently, a compromised machine on the university campus caused a campus edge device to become overwhelmed. This eventually led to a complete failure of the device. It took almost a day until the compromised machine got disconnected, the failed device reset, and the backlog of all the traffic on both sides of the router settled down.

Failures happen. Network failures happen frequently. More frequently than I would like them to, more frequently than the network admins would like them to. Failures happen near the user, far away from the user. One can hope that dynamic routing tables keep the internet as a whole functioning, but even those can fail. Want an example? Sure. Not that long ago, the well known video repository YouTube disappeared off the face of the Earth…well, to some degree. As this RIPE NCC RIS case study shows, on February 24, 2008, Pakistan Telecom decided to announce BGP routes for YouTube’s IP range. The result was that if you tried to access any of YouTube’s servers on the 208.65.152.0/22 subnet, your packets were directed to Pakistan. For about an hour and twenty minutes that was the case. Then YouTube started announcing more granular subnets, diverting some of the traffic back to itself. Eleven minutes later, YouTube announced even more granular subnets, diverting the large bulk of the traffic back to itself. A few dozen minutes later, PCCW Global (Pakistan Telecom’s provider responsible for forwarding the “offending” BGP announcements to the rest of the world) stopped forwarding the incorrect routing information.

So, networks are fragile, which is why having an email transfer protocol that allows for retransmission is a good idea.

Instead, having direct communication between the sender-client and the receiver-server has many advantages: opens up the possibility for CAPTCHA systems, makes the send-portion of the protocol easier to upgrade, and allows for new functionality in the protocol.

Wow. So much to disagree with!

  • CAPTCHA doesn’t work
  • What about mailing lists? How does the mailing list server answer the CAPTCHAs?
  • How does eliminating server-to-server communication make the protocol easier to upgrade?
  • New functionality is a nice thing in theory, but what do you want from your mail transfer protocol? I, personally, want it to transfer my email between where I send it from and where it is supposed to be delivered to.
  • If anything, eliminating the server-to-server communication would cause the MUAs to be "in charge" of the protocols. This means that at first there would be many competing protocols, until one takes over - not necessarily the better one (Betamax vs. VHS comes to mind).
  • What happens in the case of overzealous firewall admins? What if I really want to send email to bob@example.com, but the firewall (for whatever reason) is blocking all traffic to example.com?

Spam is driven by profit, the spammers make use of the fact that it is cheap to send email. Even the smallest returns on spam amount to good money. By making it more expensive to send spam, it would be phased out as the returns become negative. Charging money like AOL tried, would work; but it is not a good approach, not only does it not allow for senders anonymity but also it rewards mail-administrators for doing a bad job (the more spam we deliver the more money we make).

Yes, it is unfortunately true, money complicates things. Money tends to be the reason why superior design fails to take hold, and instead something inferior wins - think Betamax vs. VHS. This is why I think something similar would happen with competing mail transfer protocols - the one with most corporate backing would win, not the one that’s best for people.

Another approach is to make the sender interact with the recipient mail server by some kind of challenge authentication which is hard to compute for a machine but easy for a human, a Turing test. For example the recipient can ask the senders client to verify what is written on an obfuscated image (CAPTCHA) or what is being said on a audio clip, or both so as to minimize the effect on people with handicaps.

Nice thought about the handicapped, but you are forgetting that only 800-900 million people speak English (see Wikipedia). That is something on the order of 13-15 percent. Sorry, but "listening comprehension" tests are simply not viable.

Obfuscated image CAPTCHAs are "less" of a problem, but then again, one should consider the blind. I am not blind, and as a matter of fact my vision is still rather good (even after years of staring at computer screens), but at times I’m not sure what those "distorted text" CAPTCHAs are even displaying. I can’t even begin to imagine what it must be like for anyone with poor vision.

You seem to be making the assumption that most if not all legitimate email comes from humans. While that may be true for your average home user, let’s not forget that email is used by more technical people as well. These people will, and do, use email in creative ways. For example, take me…I receive lots of emails that are generated by all sorts of scripts that I wrote over time. These emails give me status of a number of systems I care about, and reminders about upcoming events. All in all, you could say that I live inside email. You can’t do a CAPTCHA for the process sending the automated email (there’s no human sending it), and if you do the CAPTCHA for the receiving, you’re just adding a "click here to display the message" wart to the mail client software user interface.

Just keep in mind that all those automated emails you get from "root" or even yourself were sent without a human answering a CAPTCHA.

It would be essential to also white list senders so that they do not have to preform a user-interactive challenge to send the email, such that mail from legitimate automated mass senders would get through (and for that current implementation of sieve scripts could be used). In this system, if users were to make wide use of filters, we would soon see a problem. If nearly everyone has a white list entry for Bank Of America what is to prevent a spammer to try to impersonate that bank?

White listing is really annoying, and as you point out, it doesn’t work.

And so this brings us to the next point, authentication, how do you know that the email actually did, originate from the sender. This is one of the largest problems with SMTP as it is so easy to fake ones outgoing email address. The white list has to rely on a verifiable and consistent flag in the email. A sample implementation of such a control could work similar to the current hack to the email system, SPF, in which a special entry is made in the DNS entry which says where the mail can originate from. While this approach is quite effective in a sever-server architecture it would not work in a client-server architecture. Part of the protocol could require the sending client to send a cryptographic-hash of the email to his own receiving mail server, so that the receiving party’s mail server could verify the authenticity of the source of the email. In essence this creates a 3 way handshake between the senders client, the senders (receiving) mail server and the receiver’s mail server.

I tend to stay away from making custom authentication protocols.

In this scheme, what guarantees you that the client and his "home server" aren’t both trying to convince the receiving server that the email is really from whom they say it is? In Kerberos, you have a key for each system, and a password for each user. The Kerberos server knows it all, and this central authority is why things work. With SSL certificates, you rely on the strength of the crypto used, as well as blind faith in the certificate authority.

At first it might seem that this process uses up more bandwidth and increases the delay of sending mail but one has to remember that in usual configuration of sending email using IMAP or POP for mail storage one undergoes a similar process,

Umm…while possible, I believe that the very large majority of email is sent via SMTP (and I’m not even counting all the spam).

first email is sent for storage (over IMAP or POP) to the senders mail server and then it is sent over SMTP to the senders email for redirection to the receivers mail server. It is even feasible to implement hooks in the IMAP and POP stacks to talk to the mail sending daemon directly eliminating an additional socket connection by the client.

Why would you want to stick with IMAP and POP? They do share certain ideas with SMTP.

For legitimate mass mail this process would not encumber the sending procedure as for this case the sending server would be located on the same machine as the senders receiving mail server (which would store the hash for authentication), and they could even be streamlined into one monolithic process.

Not necessarily. There are entire businesses that specialize in mailing list maintenance. You pay them, and they give you an account with software that maintains your mailing list. Actually, it’s amusing how similar it is to what spammers do. The major difference is that in the legitimate case, the customer supplies their own list of email address to mail. Anyway, my point is, in these cases (and they are more common than you think) the mailing sender is on a different computer than the "from" domain’s MX record.

Some might argue that phasing out SMTP is a extremely radical idea, it has been an essential part of the internet for 25 years.

Radical? Sure. But my problem is that there is no replacement. All the ideas you have listed have multiple problems - all of which have been identified by others. And so here we are, no closer to the solution.

But then, when is the right time to phase out this archaic and obsolete protocol, or do we commit to use it for the foreseeable future. Then longer we wait to try to phase it out the longer it will take to adopt something new. This protocol should be designed with a way to coexist with SMTP to get over the adoption curve, id est, make it possible for client to check for recipients functionality, if it can accept email by this new protocol then send it by it rather than SMTP.

Sounds great! What is this protocol that would replace SMTP? Oh, right there isn’t one.

The implementation of such a protocol would take very little time, the biggest problem would be with adoption.

Sad, but true.

The best approach for this problem is to entice several large mail providers (such as Gmail or Yahoo) to switch over. Since these providers handle a large fraction of all mail the smaller guys (like myself) would have to follow suit.

You mentioned Gmail…well, last I heard, Gmail’s servers were acting as open proxies. Congratulations! One of your example "if they switch things will be better" email providers is allowing the current spam problem to go on. I guess that makes you right. If Gmail were to use a protocol that didn’t allow for spam to exist, then things would be better.

There is even an incentive for mail providers to re-implement mail protocol, it would save them many CPU-cycles since Bayesian-spam-filters would no longer be that important.

What about generating all those CAPTCHAs you suggested? What about hashing all those emails? Neither activity is free.

By creating this new protocol we would dramatically improve an end users experience online, as there would be fewer annoyances to deal with. Hopefully alleviation of these annoyances will bring faster adoption of the protocol.

I really can’t help but read that as "If we use this magical protocol that will make things better, things will get better!" Sorry, but unless I see some protocol which would be a good candidate, I will remain sceptical.

As a side note, over the past 90 days, I received about 164MB of spam that SpamAssassin caught and procmail promptly shoved into the spam mail box. Do I care? Not enough to jump on the "let’s reinvent the email system" bandwagon. Sure, it eats up some of my server’s clock cycles and some bandwidth, but the spam that gets to me is the few pieces that manage to get through and show up in my inbox. Would I be happy if I didn’t have to use a spam filter, and not have to delete the few random spams by hand? Sure, but at least for the moment, I don’t see a viable alternative.

New SMTP RFC

I just came across this… a new SMTP RFC (5321). I wonder what changed — I wish they did the change bar that IBM uses in its documents. :/
