The C preprocessor is awesome part I

Well… It probably isn’t. But in this series of posts, I’m going to walk you through some terrible ideas which somehow vaguely work. None of this is a good idea, but it is fun to push simple text preprocessing and the C programming language to its illogical limit.

This series of posts was inspired by the crazy definition of ARRAY_SIZE floating around somewhere in the Linux kernel source code, which I first became aware of in a post by a friend of mine. I was also inspired by this classic, which helped me come up with a few solutions to issues I ran into while developing these.

All these examples are compiled with gcc. Your success with other compilers may vary.

I will assume familiarity with the C programming language and some comfort with the C preprocessor for the duration of this series.

Introduction

This series will come in three parts, with this being the first. I believe I’m staying away from undefined behaviour, at least for the first two parts. The final part is probably very unsafe and I was only lucky that it worked for me, but it is fun nonetheless! In this post, I’ll show you how to write a simple unit test framework allowing you to write your unit tests in C as follows:

TEST("my first test") {
    // do some unit testy stuff
}

TEST("my second test") {
    // more unit testy stuff
}

In the second part, I’ll show you how to write pseudo C# / Java constructs like lock or using as follows:

LOCK(my_mutex) {
    // do something with threads
}

USING(my_resource) {
    // this'll be disposed after use
}

And in the final part, I’ll show you how to (probably with loads of undefined behaviour) write a try-catch implementation:

TRY {
    do_something_dangerous();
} CATCH(SOME_EXCEPTION) {
    // this code will only run if do_something_dangerous THROWS(SOME_EXCEPTION)
} FINALLY {
    // this code will always run
}

Along the way, I hope you’ll learn a few tricks that you should never use in production, but are cool anyway.

The problem I was trying to solve

Writing unit tests for your code is definitely a very good thing. Automated regression testing is essential for any project. But tests are often a pain to write. And if they are a pain to write, you are tempted to write fewer of them. This is clearly a bad thing. So what we’re going to put together here is a really simple unit testing framework for your project entirely in GNU C99. By the end of this, we’ll be able to write tests like this:

#include <test_helper.h>

TEST("my awesome test") {
    assert(1 == 1);
}

and, although I won’t spend too long working on the output system for this, you’ll be able to run all these tests easily!

An approach to automation

You could try writing crazy macros straight off, but it is probably easier to start with what we want the eventual code to look like. So let’s take the example above. We’ll assume the existence of some test_main.c file, which would contain code similar to this:

static void my_awesome_test(void) {
    // equivalent of assert(1 == 1) here
}

typedef void (*TestFunction)(void);

TestFunction testFunctions[] = {
    my_awesome_test
};

int main() {
    int numTests = sizeof(testFunctions) / sizeof(testFunctions[0]);
    for (int i = 0; i < numTests; i++) {
        testFunctions[i]();
    }
}

So I guess the test macro can be something like:

#define TEST(name) \
    static void name(void)

but this forces the test name to be a valid C identifier. And we might want to use a nicer string name for the test name (just because we can). So maybe we want to store the test along with its name in the test_main.c file?

Also, we want to store the name of the test along with the function anyway, since we would like to print the test names as we run them. That way, if a test fails, we know exactly which one it is.

#include <stdio.h>
#include <stdlib.h>

typedef void (*TestFunction)(void);

struct NamedTest {
    TestFunction test_function;
    const char *test_name;
    struct NamedTest *next;
};

struct NamedTest *tests;

void register_test(TestFunction test_function, const char *test_name) {
    struct NamedTest *test = malloc(sizeof(struct NamedTest));
    test->test_function = test_function;
    test->test_name = test_name;
    test->next = tests;
    tests = test;
}

int main() {
    register_test(my_awesome_test, "My awesome test");

    struct NamedTest *current_test = tests;
    while (current_test) {
        printf("Running test %s\n", current_test->test_name);
        current_test->test_function();

        struct NamedTest *next_test = current_test->next;
        free(current_test);
        current_test = next_test;
    }
}

This is pretty much the final form of the test main file. But using it does have one pretty major annoyance. If you want to add a new test, you have to manually register it. So you end up writing the test function name twice, and the string name is nowhere near the actual test definition. We will attempt to fix both of these problems.

GCC to the rescue

If you have a look through the GCC function attributes, you can find some interesting ones. The one we’re interested in here is constructor, which lets us state that a function should run before main(). If we can somehow register our test function as part of the TEST declaration, we no longer have to repeat ourselves.
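To see the constructor attribute on its own before it gets buried in a macro, here is a minimal sketch. The names constructor_calls, run_before_main and get_constructor_calls are made up for this illustration, and I’m assuming gcc (or another compiler that understands the attribute).

```c
/* Hypothetical counter, used only to observe when the constructor runs. */
static int constructor_calls = 0;

/* gcc runs functions marked with this attribute before main() is entered. */
__attribute__((constructor))
static void run_before_main(void) {
    constructor_calls++;
}

/* Returns how many times the constructor has run so far. */
int get_constructor_calls(void) {
    return constructor_calls;
}
```

If you call get_constructor_calls() as the very first statement of main(), it should already return 1, even though nothing in main() called run_before_main().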

So we end up with a TEST macro that looks something like this:

#define TEST(test_name, test_function_name)                    \
    static void test_function_name(void);                      \
    __attribute__((constructor))                               \
    static void CONCAT(register_, test_function_name)(void) {   \
        register_test(&test_function_name, test_name);         \
    }                                                          \
    static void test_function_name (void)

Ignore the CONCAT definition for now. It concatenates two tokens, but its definition is non-obvious, so I’ll cover that in the aside below.

There’s a lot going on in the macro above, so it is probably easiest if we expand it:

// This
TEST("My awesome test", my_awesome_test) {
    /* ... */
}

// Turns into this:
static void my_awesome_test(void);
__attribute__((constructor))
static void register_my_awesome_test(void) {
    register_test(&my_awesome_test, "My awesome test");
}
static void my_awesome_test(void) {
    /* ... */
}

The forward declaration is necessary because we don’t want to have to put anything after the final }. So all the work needs to be done in the TEST macro.

It’s slightly annoying that we still have to repeat ourselves here with both the test name and the function name. An easy way around that is using the __LINE__ ‘macro’. __LINE__ expands to the current line number. Since we expect users to write tests on different lines, if we append the line number to a fixed function name prefix, we can make all the function names unique within a file without forcing the user to repeat themselves. And because the generated functions are static, tests in different files cannot clash either.

So we could instead write the macro as follows:

#define TEST(test_name)                                              \
    static void CONCAT(test_function_, __LINE__)(void);              \
    __attribute__((constructor))                                     \
    static void CONCAT(register_test_, __LINE__)(void) {             \
        register_test(&CONCAT(test_function_, __LINE__), test_name); \
    }                                                                \
    static void CONCAT(test_function_, __LINE__) (void)

Aside: The CONCAT macro

The C preprocessor provides the ability to concatenate two tokens using the ## operator. It comes with some annoyances, however, which I’ll go into in detail here.

Firstly, we had to use the CONCAT macro in the above example because, if we had instead written test_function_ ## __LINE__, it would have ended up expanding to test_function___LINE__. That would obviously not be unique for each test in the same file, and it is also missing the line number we wanted.

This behaviour is well documented in this Wikipedia article, but I’ll summarise it here.

The C preprocessor expands macros in the following stages:

1. Stringification operations are replaced, without performing expansion (this is the # operator, which isn’t covered here)
2. Parameters are replaced with their replacement, without performing expansion
3. Concatenation operations are replaced with the concatenated result, without expanding the resulting token
4. Tokens originating from parameters are expanded
5. The resulting tokens are expanded

So if we look through the list above considering test_function_ ## __LINE__, we can see that the concatenation in step 3 is performed before step 5, which is the step that would have turned __LINE__ into the current line number.

One way we can get around that is to create a macro which we’ll call CONCAT, which simply concatenates the arguments. However, this doesn’t work:

#define CONCAT(a, b) a ## b

Suppose you wrote CONCAT(test_function_, __LINE__). Let’s follow the substitution pattern above.

Initially, by rule 5, it expands to test_function_ ## __LINE__. Then, by rule 3, the two tokens are concatenated into test_function___LINE__. This happens before rule 4 gets a chance to expand the tokens that originated from parameters, so __LINE__ is never replaced by the line number.

The way around this, and the way we define the CONCAT macro above, is as follows:

#define CONCAT2(a, b) a ## b
#define CONCAT(a, b) CONCAT2(a, b)

This seems a bit silly, but it’ll now mean that the expansion behaves as you want it to. Let’s try expanding CONCAT(test_function_, __LINE__) again. Firstly, by rule 5 this expands to CONCAT2(test_function_, __LINE__). Then, by rule 4, the tokens originating from parameters are expanded, so you get CONCAT2(test_function_, 3) (say this is on line 3). This then gets expanded by rule 5 again to test_function_ ## 3 which finally by rule 3 is concatenated to the desired test_function_3.
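If you want to check this for yourself, one way is to turn each result into a string and look at it. The sketch below is only an illustration: NAIVE_CONCAT, STR, STR2, naive_name and two_level_name are names I’ve made up for it, and STR needs the same two-level trick as CONCAT so that its argument is expanded before the # operator stringifies it.

```c
/* The single-level version that does not work with __LINE__. */
#define NAIVE_CONCAT(a, b) a ## b

/* The two-level version from this post. */
#define CONCAT2(a, b) a ## b
#define CONCAT(a, b) CONCAT2(a, b)

/* Hypothetical helpers: expand the argument first, then stringify it. */
#define STR2(x) #x
#define STR(x) STR2(x)

/* __LINE__ is pasted as-is, so the line number never appears. */
const char *naive_name(void) {
    return STR(NAIVE_CONCAT(test_function_, __LINE__));
}

/* __LINE__ is expanded to a number before the paste happens. */
const char *two_level_name(void) {
    return STR(CONCAT(test_function_, __LINE__));
}
```

With gcc, naive_name() should give you the string test_function___LINE__, while two_level_name() should give you test_function_ followed by the number of the line the macro was written on.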

Conclusion

You can create a really simple test framework in C using the macro:

#define CONCAT2(a, b) a ## b
#define CONCAT(a, b) CONCAT2(a, b)

#define TEST(test_name)                                              \
    static void CONCAT(test_function_, __LINE__)(void);              \
    __attribute__((constructor))                                     \
    static void CONCAT(register_test_, __LINE__)(void) {             \
        register_test(&CONCAT(test_function_, __LINE__), test_name); \
    }                                                                \
    static void CONCAT(test_function_, __LINE__) (void)

and some simple helper functions.
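To show how the pieces fit together, here is a rough single-file sketch that combines the macro with the registration code from earlier. It is not the only way to arrange things: run_all_tests is a hypothetical helper I’ve added so the runner can be called from anywhere, the malloc check is my own addition, and in a real project the top half would probably live in test_helper.h. It assumes gcc.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef void (*TestFunction)(void);

struct NamedTest {
    TestFunction test_function;
    const char *test_name;
    struct NamedTest *next;
};

/* Head of the linked list of registered tests. */
static struct NamedTest *tests = NULL;

/* Prepends a test to the list; called by the generated constructors. */
static void register_test(TestFunction test_function, const char *test_name) {
    struct NamedTest *test = malloc(sizeof *test);
    if (test == NULL) {
        abort(); /* assumption: running out of memory is fatal here */
    }
    test->test_function = test_function;
    test->test_name = test_name;
    test->next = tests;
    tests = test;
}

#define CONCAT2(a, b) a ## b
#define CONCAT(a, b) CONCAT2(a, b)

#define TEST(test_name)                                              \
    static void CONCAT(test_function_, __LINE__)(void);              \
    __attribute__((constructor))                                     \
    static void CONCAT(register_test_, __LINE__)(void) {             \
        register_test(&CONCAT(test_function_, __LINE__), test_name); \
    }                                                                \
    static void CONCAT(test_function_, __LINE__)(void)

TEST("my awesome test") {
    assert(1 == 1);
}

TEST("my second test") {
    assert(2 + 2 == 4);
}

/* Hypothetical helper: runs and frees every registered test,
 * then returns how many tests were run. */
int run_all_tests(void) {
    int count = 0;
    struct NamedTest *current_test = tests;
    while (current_test) {
        printf("Running test %s\n", current_test->test_name);
        current_test->test_function();
        struct NamedTest *next_test = current_test->next;
        free(current_test);
        current_test = next_test;
        count++;
    }
    tests = NULL;
    return count;
}
```

All main() has to do is call run_all_tests(). Neither test is mentioned anywhere outside its own TEST block, which was the whole point.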

It would be reasonably easy to extend the runner to, for example, only run tests whose names matched command line arguments. I hope this gives you some ideas as to how you can metaprogram in C in ways you never thought were possible. And maybe it persuades you to write a few unit tests for that project of yours that doesn’t currently have any.
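As a rough idea of what the command line filtering could look like, here is one possible helper. should_run_test is a name I’ve made up, and matching by substring is just one choice among many; the runner loop would call it before running each test and skip the ones that return 0.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the test should run, 0 otherwise.
 * With no extra arguments every test runs; otherwise a test runs if its
 * name contains any of the arguments as a substring. */
int should_run_test(const char *test_name, int argc, char **argv) {
    if (argc <= 1) {
        return 1;
    }
    for (int i = 1; i < argc; i++) {
        if (strstr(test_name, argv[i]) != NULL) {
            return 1;
        }
    }
    return 0;
}
```

The argc and argv parameters are the ones passed to main(), so argv[0] (the program name) is skipped.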

Stay tuned for the next two parts of this series!