Josef “Jeff” Sipek

bool bitfield:1

This is the first of hopefully many posts related to interesting pieces of code I’ve stumbled across in the dovecot repository.

Back in 1999, C99 added the bool type. This is old news. The thing I’ve never seen before is what amounts to:

struct foo {
	bool	a:1;
	bool	b:1;
};

Sure, I’ve seen bitfields before—just never with booleans. Since this is C, the obvious thing happens here. The compiler packs the two bool bits into a single byte. In other words, sizeof(struct foo) is 1 (instead of 2 had we not used bitfields).

The compiler emits pretty compact code as well. For example, suppose we have this simple function:

void set(struct foo *x)
{
	x->b = true;
}

We compile it and disassemble:

$ gcc -c -O2 -Wall -m64 test.c
$ dis -F set test.o
disassembly for test.o

set()
    set:     80 0f 02           orb    $0x2,(%rdi)
    set+0x3: c3                 ret

Had we used non-bitfield booleans, the resulting code would be:

set()
    set:     c6 47 01 01        movb   $0x1,0x1(%rdi)
    set+0x4: c3                 ret

There’s not much of a difference in these simple examples, but in more complicated structures with many boolean flags the structure size difference may be significant.

Of course, the usual caveats about bitfields apply (e.g., the machine’s endian matters).

GNU inline vs. C99 inline

Recently, I’ve been looking at inline functions in C. However instead of just the usual static inlines, I’ve been looking at all the variants. This used to be a pretty straightforward GNU C extension and then C99 introduced the inline keyword officially. Sadly, for whatever reason decided that the semantics would be just different enough to confuse me and everyone else.

GCC documentation has the following to say:

GCC implements three different semantics of declaring a function inline. One is available with -std=gnu89 or -fgnu89-inline or when gnu_inline attribute is present on all inline declarations, another when -std=c99, -std=c11, -std=gnu99 or -std=gnu11 (without -fgnu89-inline), and the third is used when compiling C++.

Dang! Ok, I don’t really care about C++, so there are only two ways inline can behave.

Before diving into the two different behaviors, there are two cases to consider: the use of an inline function, and the inline function itself. The good news is that the use of an inline function behaves the same in both C90 and C99. Where the behavior changes is how the compiler deals with the inline function itself.

After reading the GCC documentation and skimming the C99 standard, I have put it all into the following table. It lists the different ways of using the inline keyword and for each use whether or not a symbol is produced in C90 (with inline extension) and in C99.

Emit (C90) Emit (C99)
inline always never
static inline maybe maybe
extern inline never always

(“always” means that a global symbol is always produced regardless of if all the uses of it are inlined. “maybe” means that a local symbol will be produced if and only if some uses cannot be inlined. “never” means that no symbols are produced and any non-inlined uses will be dealt with via relocations to an external symbol.)

Note that C99 “switched” the meaning of inline and extern inline. The good news is, static inline is totally unaffected (and generally the most useful).

For whatever reason, I cannot ever remember this difference. My hope is that this post will help me in the future.

Trying it Out

We can verify this experimentally. We can compile the following C file with -std=gnu89 and -std=gnu99 and compare what symbols the compiler produces:

static inline void si(int x)
{
}

extern inline void ei(int x)
{
}

inline void i(int x)
{
}

And here’s what nm has to say about them:

test-gcc89:
00000000 T i

test-gcc99:
00000000 T ei

This is an extremely simple example where the “never” and “maybe” cases all skip generating a symbol. In a more involved program that has inline functions that use features of C that prevent inlining (e.g., VLAs) we would see either relocations to external symbols or local symbols.

Designated Initializers

Designated initializers are a neat feature in C99 that I’ve used for about 6 years. I can’t fathom why anyone would not use them if C99 is available. (Of course if you have to support pre-C99 compilers, you’re very sad.) In case you’ve never seen them, consider this example that’s perfectly valid C99:

int abc[7] = {
	[1] = 0xabc,
	[2] = 0x12345678,
	[3] = 0x12345678,
	[4] = 0x12345678,
	[5] = 0xdef,
};

As you may have guessed, indices 1–5 will have the specified value. Indices 0 and 6 will be zero. Cool, eh?

GCC Extensions

Today I learned about a neat GNU extension in GCC to designated initializers. Consider this code snippet:

int abc[7] = {
	[1] = 0xabc,
	[2 ... 5] = 0x12345678,
	[5] = 0xdef,
};

Mind blowing, isn’t it?

Beware, however… GCC’s -std=c99 will not error out if you use ranges! You need to throw in -pedantic to get a warning.

$ gcc -c -Wall -std=c99 test.c
$ gcc -c -Wall -pedantic -std=c99 test.c
test.c:2:5: warning: ISO C forbids specifying range of elements to initialize [-pedantic]

Powered by blahgd