Unix Shared Memory
While investigating whether some memory management code was still in use (I’ll blahg about this in the future), I ended up learning quite a bit about shared memory on Unix systems. Since I managed to run into a couple of non-obvious snags while trying to get a simple test program running, I thought I’d share my findings here for my future self.
All in all, there are three ways to share memory between processes on a modern Unix system.
System V shm
This is the oldest of the three. First you call shmget to set up a shared memory segment and then you call shmat to map it into your address space. Here’s a quick example that does not do any error checking or cleanup:
    void sysv_shm()
    {
        int ret;
        void *ptr;

        ret = shmget(0x1234, 4096, IPC_CREAT);
        printf("shmget returned %d (%d: %s)\n", ret, errno, strerror(errno));

        ptr = shmat(ret, NULL, SHM_PAGEABLE | SHM_RND);
        printf("shmat returned %p (%d: %s)\n", ptr, errno, strerror(errno));
    }
What’s so tricky about this? Well, by default Illumos’s shmat will return EPERM unless you are root. This sort of makes sense given how this flavor of shared memory is implemented. (Hint: it’s all in the kernel)
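For comparison, here is a minimal sketch of the same idea with error checking and cleanup added. It is not the test program from above: it assumes IPC_PRIVATE instead of a fixed key, owner-only permissions (0600), and plain flags (0) for shmat rather than SHM_PAGEABLE | SHM_RND, so it should behave the same on most Unix systems. The function name sysv_shm_checked is made up for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Create a private 4096-byte SysV segment, attach it, write and read back
 * a string, then detach and remove the segment.
 * Returns 0 on success, -1 on error.
 */
int sysv_shm_checked(void)
{
	int id;
	int ok;
	char *ptr;

	/* IPC_PRIVATE avoids key collisions; 0600 grants the owner read/write. */
	id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
	if (id == -1) {
		perror("shmget");
		return -1;
	}

	/* shmat reports failure with (void *)-1, not NULL. */
	ptr = shmat(id, NULL, 0);
	if (ptr == (void *)-1) {
		perror("shmat");
		shmctl(id, IPC_RMID, NULL);
		return -1;
	}

	strcpy(ptr, "hello");
	ok = (strcmp(ptr, "hello") == 0);

	/* SysV segments outlive the process unless they are removed. */
	if (shmdt(ptr) == -1)
		perror("shmdt");
	if (shmctl(id, IPC_RMID, NULL) == -1) {
		perror("shmctl");
		return -1;
	}

	return ok ? 0 : -1;
}
```

Calling sysv_shm_checked() from main and checking for a 0 return is enough to exercise it. The explicit IPC_RMID matters because a forgotten segment stays around (visible in ipcs) until someone removes it or the system reboots.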
POSIX shm
As is frequently the case, POSIX came up with a different interface and different semantics for shared memory. Here’s the POSIX shm version of the above function:
    void posix_shm()
    {
        int fd;
        void *ptr;

        fd = shm_open("/blah", O_RDWR | O_CREAT, 0666);
        printf("shm_open returned %d (%d: %s)\n", fd, errno, strerror(errno));

        ftruncate(fd, 4096); /* IMPORTANT! */

        ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        printf("mmap returned %p (%d: %s)\n", ptr, errno, strerror(errno));
    }
The very important part here is the ftruncate call. Without it, a newly created object is empty (zero bytes long), and mmapping an empty file won’t work very well. (Well, on Illumos mmap succeeds, but you effectively have a 0-length mapping, so any loads or stores will result in a SIGBUS. I haven’t tried other OSes.)
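To make the zero-length point concrete, here is a hedged sketch that records the object's size right after creation, then sizes it, maps it, and cleans up. The function name posix_shm_initial_size and the object name passed in are made up for illustration. On older glibc versions you may also need to link with -lrt.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a fresh POSIX shm object, note its size before ftruncate, then
 * size it to 4096 bytes, map it, and write to it.
 * Returns the size seen before ftruncate, or -1 if any step failed.
 */
long posix_shm_initial_size(const char *name)
{
	struct stat st;
	long initial = -1;
	char *ptr;
	int fd;

	shm_unlink(name);	/* ignore failure; we just want a fresh object */

	fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd == -1) {
		perror("shm_open");
		return -1;
	}

	if (fstat(fd, &st) == -1) {
		perror("fstat");
		goto out;
	}

	if (ftruncate(fd, 4096) == -1) {
		perror("ftruncate");
		goto out;
	}

	ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (ptr == MAP_FAILED) {
		perror("mmap");
		goto out;
	}

	strcpy(ptr, "hello");	/* safe only because the object is now 4096 bytes */
	munmap(ptr, 4096);
	initial = (long)st.st_size;

out:
	close(fd);
	shm_unlink(name);
	return initial;
}
```

A call such as posix_shm_initial_size("/blah_demo") should return 0, since POSIX specifies that a newly created shared memory object has a size of zero.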
Aside from the funny-looking path (it must start with a slash, but cannot contain any other slashes), shm_open looks remarkably like the open system call. It turns out that, at least on Illumos, shm_open is implemented entirely in libc. The implementation creates a file in /tmp based on the path provided, and the file descriptor that it returns is actually a file descriptor for this file in /tmp. For example, the input “/blah” translates into “/tmp/.SHMDblah”. (There is a second file, “/tmp/.SHMLblah”, that doesn’t live very long. I think it is a lock file.) The subsequent mmap call doesn’t have any idea that this file is special in any way.
Does this mean that you can reach around shm_open and manipulate the object directly? Not exactly. POSIX states: “It is unspecified whether the name appears in the file system and is visible to other functions that take pathnames as arguments.”
The big difference between POSIX and SysV shared memory is how you refer to the segment — SysV uses a numeric key, while POSIX uses a path.
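If you want a SysV key that is tied to a path rather than a hard-coded number like 0x1234, the standard ftok function derives one from an existing file and a small project id. The wrapper below is only a sketch, and the name key_from_path is made up for illustration.

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/types.h>

/*
 * Derive a SysV IPC key from an existing path and a project id.
 * Returns (key_t)-1 if ftok fails, e.g. because the path does not exist.
 */
key_t key_from_path(const char *path, int proj_id)
{
	key_t key = ftok(path, proj_id);

	if (key == (key_t)-1)
		perror("ftok");

	return key;
}
```

The resulting key can be passed to shmget in place of the literal. Two processes that call key_from_path with the same existing path and the same project id get the same key, although ftok does not guarantee that different paths always produce different keys.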
mmap
The last way of sharing memory involves no specialized APIs. It’s just plain ol’ mmap on an open file. For completeness, here’s the function:
    void mmap_shm()
    {
        int fd;
        void *ptr;

        fd = open("/tmp/blah", O_RDWR | O_CREAT, 0666);
        printf("open returned %d (%d: %s)\n", fd, errno, strerror(errno));

        ftruncate(fd, 4096); /* IMPORTANT! */

        ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        printf("mmap returned %p (%d: %s)\n", ptr, errno, strerror(errno));
    }
It is very similar to the POSIX shm code example. As before, we need the ftruncate to make the shared file non-empty.
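To see the sharing actually happen, here is a minimal sketch in which a child process writes into a MAP_SHARED file mapping and the parent reads the data back. It takes a shortcut: instead of two unrelated processes opening the same path, the child inherits the mapping across fork. The temporary file comes from mkstemp, and the function name mmap_shared_with_child is made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map a temporary file MAP_SHARED, let a child process write into the
 * mapping, and check that the parent sees the write.
 * Returns 0 if the parent observes the child's data, -1 otherwise.
 */
int mmap_shared_with_child(void)
{
	char path[] = "/tmp/mmap_shm_XXXXXX";
	char *ptr;
	pid_t pid;
	int status;
	int ret = -1;
	int fd;

	fd = mkstemp(path);
	if (fd == -1) {
		perror("mkstemp");
		return -1;
	}
	unlink(path);	/* the open descriptor keeps the file alive */

	if (ftruncate(fd, 4096) == -1) {	/* make the file non-empty */
		perror("ftruncate");
		close(fd);
		return -1;
	}

	ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping remains valid after close */
	if (ptr == MAP_FAILED) {
		perror("mmap");
		return -1;
	}

	pid = fork();
	if (pid == -1) {
		perror("fork");
		munmap(ptr, 4096);
		return -1;
	}

	if (pid == 0) {
		/* Child: the store lands in pages shared with the parent. */
		strcpy(ptr, "from child");
		_exit(0);
	}

	/* Parent: wait for the child, then look at the shared page. */
	if (waitpid(pid, &status, 0) == pid && strcmp(ptr, "from child") == 0)
		ret = 0;

	munmap(ptr, 4096);
	return ret;
}
```

With MAP_PRIVATE instead of MAP_SHARED, the child's write would go to its own copy-on-write page and the parent would not see it.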
pmap
In case you’ve wondered what SysV or POSIX shm segments look like on Illumos, here’s the pmap output for a process that basically runs the first two examples above.
    6343:   ./a.out
    0000000000400000          8K r-x--  /storage/home/jeffpc/src/shm/a.out
    0000000000411000          4K rw---  /storage/home/jeffpc/src/shm/a.out
    0000000000412000         16K rw---    [ heap ]
    FFFFFD7FFF160000          4K rwxs-    [ dism shmid=0x13 ]
    FFFFFD7FFF170000          4K rw-s-  /tmp/.SHMDblah
    FFFFFD7FFF180000         24K rwx--    [ anon ]
    FFFFFD7FFF190000          4K rwx--    [ anon ]
    FFFFFD7FFF1A0000       1596K r-x--  /lib/amd64/libc.so.1
    FFFFFD7FFF33F000         52K rw---  /lib/amd64/libc.so.1
    FFFFFD7FFF34C000          8K rw---  /lib/amd64/libc.so.1
    FFFFFD7FFF350000          4K rwx--    [ anon ]
    FFFFFD7FFF360000          4K rwx--    [ anon ]
    FFFFFD7FFF370000          4K rw---    [ anon ]
    FFFFFD7FFF380000          4K rw---    [ anon ]
    FFFFFD7FFF390000          4K rwx--    [ anon ]
    FFFFFD7FFF393000        348K r-x--  /lib/amd64/ld.so.1
    FFFFFD7FFF3FA000         12K rwx--  /lib/amd64/ld.so.1
    FFFFFD7FFF3FD000          8K rwx--  /lib/amd64/ld.so.1
    FFFFFD7FFFDFD000         12K rw---    [ stack ]
             total         2120K
You can see that the POSIX shm file got mapped in the standard way (address FFFFFD7FFF170000). The SysV shm segment is special — it is not a plain old memory map (address FFFFFD7FFF160000).
That’s it for today. I’m going to talk about segment types in a different post in the near future.