Josef “Jeff” Sipek

Unix Shared Memory

While investigating whether some memory management code was still in use (I’ll blahg about this in the future), I ended up learning quite a bit about shared memory on Unix systems. Since I managed to run into a couple of non-obvious snags while trying to get a simple test program running, I thought I’d share my findings here for my future self.

All in all, there are three ways to share memory between processes on a modern Unix system.

System V shm

This is the oldest of the three. First you call shmget to set up a shared memory segment and then you call shmat to map it into your address space. Here’s a quick example that does not do any error checking or cleanup:

void sysv_shm()
{
        int ret;
        void *ptr;

        ret = shmget(0x1234, 4096, IPC_CREAT);
        printf("shmget returned %d (%d: %s)\n", ret, errno,
               strerror(errno));

        ptr = shmat(ret, NULL, SHM_PAGEABLE | SHM_RND);
        printf("shmat returned %p (%d: %s)\n", ptr, errno, strerror(errno));
}

What’s so tricky about this? Well, by default Illumos’s shmat will return EPERM unless you are root. This sort of makes sense given how this flavor of shared memory is implemented. (Hint: it’s all in the kernel)

POSIX shm

As is frequently the case, POSIX came up with a different interface and different semantics for shared memory. Here’s the POSIX shm version of the above function:

void posix_shm()
{
	int fd;
	void *ptr;

	fd = shm_open("/blah", O_RDWR | O_CREAT, 0666);
	printf("shm_open returned %d (%d: %s)\n", fd, errno,
	       strerror(errno));

	ftruncate(fd, 4096); /* IMPORTANT! */

	ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	printf("mmap returned %p (%d: %s)\n", ptr, errno, strerror(errno));
}

The very important part here is the ftruncate call. Without it, shm_open may create an empty file and mmaping an empty file won’t work very well. (Well, on Illumos mmap succeeds, but you effectively have a 0-length mapping so any loads or stores will result in a SIGBUS. I haven’t tried other OSes.)

Aside from the funny looking path (it must start with a slash, but cannot contain any other slashes), shm_open looks remarkably like the open system call. It turns out that at least on Illumos, shm_open is implemented entirely in libc. The implementation creates a file in /tmp based on the path provided and the file descriptor that it returns is actually a file descriptor for this file in /tmp. For example, “/blah” input translates into “/tmp/.SHMDblah”. (There is a second file “/tmp/.SHMLblah” that doesn’t live very long. I think it is a lock file.) The subsequent mmap call doesn’t have any idea that this file is special in any way.

Does this mean that you can reach around shm_open and manipulate the object directly? Not exactly. POSIX states: “It is unspecified whether the name appears in the file system and is visible to other functions that take pathnames as arguments.”

The big difference between POSIX and SysV shared memory is how you refer to the segment — SysV uses a numeric key, while POSIX uses a path.

mmap

The last way of sharing memory involves no specialized APIs. It’s just plain ol’ mmap on an open file. For completeness, here’s the function:

void mmap_shm()
{
	int fd;
	void *ptr;

	fd = open("/tmp/blah", O_RDWR | O_CREAT, 0666);
	printf("open returned %d (%d: %s)\n", fd, errno, strerror(errno));

	ftruncate(fd, 4096); /* IMPORTANT! */

	ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	printf("mmap returned %p (%d: %s)\n", ptr, errno, strerror(errno));
}

It is very similar to the POSIX shm code example. As before, we need the ftruncate to make the shared file non-empty.

pmap

In case you’ve wondered what SysV or POSIX shm segments look like on Illumos, here’s the pmap output for a process that basically runs the first two examples above.

6343:	./a.out
0000000000400000          8K r-x--  /storage/home/jeffpc/src/shm/a.out
0000000000411000          4K rw---  /storage/home/jeffpc/src/shm/a.out
0000000000412000         16K rw---    [ heap ]
FFFFFD7FFF160000          4K rwxs-    [ dism shmid=0x13 ]
FFFFFD7FFF170000          4K rw-s-  /tmp/.SHMDblah
FFFFFD7FFF180000         24K rwx--    [ anon ]
FFFFFD7FFF190000          4K rwx--    [ anon ]
FFFFFD7FFF1A0000       1596K r-x--  /lib/amd64/libc.so.1
FFFFFD7FFF33F000         52K rw---  /lib/amd64/libc.so.1
FFFFFD7FFF34C000          8K rw---  /lib/amd64/libc.so.1
FFFFFD7FFF350000          4K rwx--    [ anon ]
FFFFFD7FFF360000          4K rwx--    [ anon ]
FFFFFD7FFF370000          4K rw---    [ anon ]
FFFFFD7FFF380000          4K rw---    [ anon ]
FFFFFD7FFF390000          4K rwx--    [ anon ]
FFFFFD7FFF393000        348K r-x--  /lib/amd64/ld.so.1
FFFFFD7FFF3FA000         12K rwx--  /lib/amd64/ld.so.1
FFFFFD7FFF3FD000          8K rwx--  /lib/amd64/ld.so.1
FFFFFD7FFFDFD000         12K rw---    [ stack ]
         total         2120K

You can see that the POSIX shm file got mapped in the standard way (address FFFFFD7FFF170000). The SysV shm segment is special — it is not a plain old memory map (address FFFFFD7FFF160000).

That’s it for today. I’m going to talk about segment types in the different post in the near future.

0 Comments »

Atom feed for comments on this post.

Leave a comment

Powered by blahgd