Gwilym’s blog

Hopefully someone finds something useful here

The C preprocessor is awesome part II

This is the second part of a three-part series on why the C preprocessor is awesome, showing you some tricks you should never use. This part is going to be the shortest, showing off a really simple trick that makes your macros look like actual keywords. It also lays the groundwork for implementing TRY, CATCH and FINALLY in the final part.

In part 1 of this series, we implemented a very nice test harness that let us write unit tests much more simply. This will build on the ideas from part 1 so I recommend reading that first if you haven’t already.

Introduction

In this installment, I’m going to run through creating a LOCK macro that works like lock in C# or synchronized in Java. Have you ever looked at those languages and wished that C could do that too? Well look no further. And also wish you’d never asked. By the end of this post, you’ll have a library that lets you use pthread mutexes as follows:

pthread_mutex_t myMutex;
pthread_mutex_init(&myMutex, NULL);
// ...
LOCK(&myMutex) {
    // do something
}

and because we’re writing C and goto fail is a thing that will never happen, we would like to be able to use our macro as if it were an actual keyword:

LOCK(&myMutex) functionCall();

and have the mutex get locked and unlocked properly.

What do we want out of this?

As with part 1, it is useful to decide what we want our macro to expand to in the end. In this case, we would want the above to expand to something like:

pthread_mutex_t myMutex;
pthread_mutex_init(&myMutex, NULL);
// ...
pthread_mutex_lock(&myMutex);
{
    // do something
}
pthread_mutex_unlock(&myMutex);

which leads us to a small problem. In our LOCK usage, the macro is at the beginning and we want code to run both at the start and at the end of our block. Turns out, there is a really simple way we can get this to work.

One small trick

Let’s try rewriting the above in a way that { // do something } is at the end.

for (int i = 0; i < 3; i++) {
    if (i == 0) {
        pthread_mutex_lock(&myMutex);
    } else if (i == 2) {
        pthread_mutex_unlock(&myMutex);
    } else {
        // do something
    }
}

and with a careful bit of brace removal, we have our LOCK macro:

#define LOCK(mutex)                            \
    for (int i__ = 0; i__ < 3; i__++)          \
        if (i__ == 0) {                        \
            pthread_mutex_lock(mutex);        \
        } else if (i__ == 2) {                 \
            pthread_mutex_unlock(mutex);      \
        } else

It is almost too good to be true. You can use this with or without braces, so both of the above examples of how we’d want to use this macro work perfectly. And with optimisations turned on, both clang and gcc will optimise the loop away entirely! But there are some issues you need to consider (and they are why you should never use this in production).
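To see both forms side by side, here is a minimal sketch. The names counterMutex, counter, incrementCounter and bumpTwice are made up for this example, and the macro is simply the one above, repeated so the snippet compiles on its own.

```c
#include <pthread.h>

/* The for-loop LOCK macro from above, repeated so this sketch stands alone. */
#define LOCK(mutex)                            \
    for (int i__ = 0; i__ < 3; i__++)          \
        if (i__ == 0) {                        \
            pthread_mutex_lock(mutex);         \
        } else if (i__ == 2) {                 \
            pthread_mutex_unlock(mutex);       \
        } else

/* Hypothetical shared state, named only for this example. */
static pthread_mutex_t counterMutex = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

static void incrementCounter(void) { counter++; }

/* Uses LOCK once with braces and once without, then returns the counter. */
static int bumpTwice(void) {
    LOCK(&counterMutex) {
        counter++;
    }
    LOCK(&counterMutex) incrementCounter();
    return counter;
}
```

After the first call to bumpTwice() the counter has been incremented twice and the mutex has been released again, which you can confirm with pthread_mutex_trylock.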

Issues

Let’s start with an unfortunate problem here. You cannot return inside the block following LOCK. That would mean the i__ == 2 iteration never runs, so the mutex is never unlocked, which is definitely a bad thing.
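If you want to see the leak for yourself, a rough sketch like the following should do it. It assumes a default (non-recursive) mutex, for which pthread_mutex_trylock reports EBUSY when the mutex is already held. The names earlyMutex, returnInsideLock and mutexLeftLocked are invented for this example.

```c
#include <errno.h>
#include <pthread.h>

/* The for-loop LOCK macro, repeated so this sketch stands alone. */
#define LOCK(mutex)                            \
    for (int i__ = 0; i__ < 3; i__++)          \
        if (i__ == 0) {                        \
            pthread_mutex_lock(mutex);         \
        } else if (i__ == 2) {                 \
            pthread_mutex_unlock(mutex);       \
        } else

/* Hypothetical mutex, assumed to be of the default (non-recursive) type. */
static pthread_mutex_t earlyMutex = PTHREAD_MUTEX_INITIALIZER;

/* Returns from inside the LOCK block, so the unlocking iteration never runs. */
static int returnInsideLock(void) {
    LOCK(&earlyMutex) {
        return 42;
    }
    return 0;
}

/* Returns 1 if the mutex was still held after returnInsideLock(), else 0. */
static int mutexLeftLocked(void) {
    returnInsideLock();
    if (pthread_mutex_trylock(&earlyMutex) == EBUSY) {
        pthread_mutex_unlock(&earlyMutex); /* release the lock that leaked */
        return 1;
    }
    pthread_mutex_unlock(&earlyMutex);     /* release the lock trylock took */
    return 0;
}
```

Here mutexLeftLocked() reports that the mutex was still held, and cleans up after itself so the demonstration does not deadlock anything that runs later.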

Similarly, a break statement or continue statement inside the block following LOCK would affect the for loop defined in the LOCK macro rather than any enclosing loop or switch. For example, the following will execute both // do something and // do something else and leave myMutex locked!

switch (i) {
    case 1:
        LOCK(&myMutex) {
            // do something
            break;
        }
    case 2:
        // do something else
        break;
}
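A filled-in version of that switch makes the behaviour easy to check. This is only a sketch: switchMutex, somethingRan, somethingElseRan and runSwitch are hypothetical names, and the mutex is assumed to be of the default (non-recursive) type so that pthread_mutex_trylock returns EBUSY while it is held.

```c
#include <errno.h>
#include <pthread.h>

/* The for-loop LOCK macro, repeated so this sketch stands alone. */
#define LOCK(mutex)                            \
    for (int i__ = 0; i__ < 3; i__++)          \
        if (i__ == 0) {                        \
            pthread_mutex_lock(mutex);         \
        } else if (i__ == 2) {                 \
            pthread_mutex_unlock(mutex);       \
        } else

/* Hypothetical state used to record which branches ran. */
static pthread_mutex_t switchMutex = PTHREAD_MUTEX_INITIALIZER;
static int somethingRan = 0;
static int somethingElseRan = 0;

static void runSwitch(int i) {
    switch (i) {
        case 1:
            LOCK(&switchMutex) {
                somethingRan = 1;
                break; /* leaves LOCK's for loop, not the switch */
            }
            /* falls through, because the break above was swallowed */
        case 2:
            somethingElseRan = 1;
            break;
    }
}
```

Calling runSwitch(1) sets both flags, and a following pthread_mutex_trylock(&switchMutex) fails with EBUSY because the unlocking iteration was skipped.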

Given that I recommend never using this in production anyway, it doesn’t matter too much. But we should try to make these compiler errors!

Nested functions

Let’s dive off ISO C spec once again and delve into the wonderful world of GCC extensions. We’re going to need the CONCAT macro from last time, and we’ll use a feature called nested functions. This lets you define a function inside another function. And the nested function can access all the variables of the containing function that are visible at the point of its definition. So in some ways, it is almost indistinguishable from a block! Except that you can’t continue or break on the outer scope.

A useless example of how they can be used is as follows:

int main() {
    int returnTwo(void) { return 2; }

    return returnTwo();
}
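A slightly less useless sketch shows the part that matters for us: the nested function reading and writing a variable of the function that contains it. The names sumUpTo and add are made up for this example, and it needs GCC, since nested functions are an extension and clang rejects them.

```c
/* GCC extension: nested functions are not ISO C, so build this with gcc. */
static int sumUpTo(int n) {
    int total = 0;

    /* Nested function: updates total, which belongs to sumUpTo. */
    void add(int value) { total += value; }

    for (int i = 1; i <= n; i++) {
        add(i);
    }
    return total;
}
```

Calling sumUpTo(4) adds 1, 2, 3 and 4 into total through the nested function.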

One thing to note is that if you need to declare the nested function before its definition, you need to use auto, like so:

int main() {
    auto int returnTwo(void);
    
    int i = returnTwo();
    int returnTwo(void) { return 2; }

    return i;
}

That’s all you need to know about nested functions before we show the final macro in all its glory!

LOCK with nested functions

This unfortunately loses the ability to use the lock statement without braces. But given my preference to always include braces, this isn’t a massive downside to me.

I think it is easiest to just show the final macro and explain it afterwards rather than build up to it:

#define LOCK(mutex)                                         \
    auto void CONCAT(lockNestedFunction__, __LINE__)(void); \
    pthread_mutex_lock(mutex);                              \
    CONCAT(lockNestedFunction__, __LINE__)();               \
    pthread_mutex_unlock(mutex);                            \
    void CONCAT(lockNestedFunction__, __LINE__)(void)

This looks very different to before!

Similar to the previous part, we need a unique name for each function, so using __LINE__ will at least mean you can use a LOCK once per line. We lock the mutex, call the newly defined nested function and unlock the mutex on the other side. The final line starts the definition of the nested function, but leaves it up to the user of the macro to add the braces.
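Here is a small sketch of the macro in use. CONCAT comes from part 1 and is not shown in this post, so the two-level token-pasting definition below is my assumption of what it looks like. The names totalMutex and addLocked are made up too. It needs GCC because of the nested function.

```c
#include <pthread.h>

/* Assumed stand-in for the CONCAT macro from part 1: the usual two-level
   paste, so that __LINE__ is expanded before the tokens are joined. */
#define CONCAT_INNER(a, b) a##b
#define CONCAT(a, b) CONCAT_INNER(a, b)

/* The nested-function LOCK macro, repeated so this sketch stands alone. */
#define LOCK(mutex)                                         \
    auto void CONCAT(lockNestedFunction__, __LINE__)(void); \
    pthread_mutex_lock(mutex);                              \
    CONCAT(lockNestedFunction__, __LINE__)();               \
    pthread_mutex_unlock(mutex);                            \
    void CONCAT(lockNestedFunction__, __LINE__)(void)

/* Hypothetical mutex for this example. */
static pthread_mutex_t totalMutex = PTHREAD_MUTEX_INITIALIZER;

static int addLocked(int a, int b) {
    int total = 0;
    /* The braces below become the body of the nested function, which can
       see a, b and total because they are declared before this point. */
    LOCK(&totalMutex) {
        total = a + b;
    }
    return total;
}
```

By the time addLocked returns, the body has run with the mutex held and the mutex has been released again.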

This has a few advantages over the previous version of this macro with the for loop. Trying to compile the switch example above results in the compiler error of:

error: a label can only be part of a statement and a declaration is not a statement
     auto void CONCAT(lockNestedFunction__, __LINE__)(void); \

hmm… Not quite the error we expected.

This is a fun side effect of labels in C. A label must precede a statement in C. There is an easy fix for this. An empty statement counts as a statement, so we can either fix our macro by starting it with a ; or put a ; after the : in the case statement.
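The same rule bites any time a declaration directly follows a case label, with or without our macro. A minimal sketch of the empty-statement fix, using a made-up function called doubleIfOne:

```c
/* Returns i doubled when i is 1, and -1 for anything else. */
static int doubleIfOne(int i) {
    int result;
    switch (i) {
        case 1: ; /* empty statement, so the label precedes a statement */
            int doubled = i * 2; /* a declaration is now allowed here */
            result = doubled;
            break;
        default:
            result = -1;
            break;
    }
    return result;
}
```

Remove the ; after case 1: and older compilers will produce the same "a label can only be part of a statement" error as above.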

After doing either of those (at this point, we’re so far from doing something sensible that either is okay), we get the compiler error:

file.c: In function ‘lockNestedFunction__33’:
file.c:36:13: error: break statement not within loop or switch
             break;
             ^~~~~

which is what we wanted!

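Unfortunately this doesn’t fix the return issue from before: a return inside the block only returns from the nested function, not from the function you wrote it in. But at least it won’t leave the mutex permanently locked. And if you try to return a value, the compiler will complain, because the nested function returns void.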
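A quick sketch of what a return inside the block actually does. As before, the CONCAT definition is an assumed stand-in for the one from part 1, returnMutex and returnInsideNestedLock are invented names, and GCC is required.

```c
#include <pthread.h>

/* Assumed stand-in for the CONCAT macro from part 1. */
#define CONCAT_INNER(a, b) a##b
#define CONCAT(a, b) CONCAT_INNER(a, b)

/* The nested-function LOCK macro, repeated so this sketch stands alone. */
#define LOCK(mutex)                                         \
    auto void CONCAT(lockNestedFunction__, __LINE__)(void); \
    pthread_mutex_lock(mutex);                              \
    CONCAT(lockNestedFunction__, __LINE__)();               \
    pthread_mutex_unlock(mutex);                            \
    void CONCAT(lockNestedFunction__, __LINE__)(void)

/* Hypothetical mutex for this example. */
static pthread_mutex_t returnMutex = PTHREAD_MUTEX_INITIALIZER;

static int returnInsideNestedLock(void) {
    int reached = 0;
    LOCK(&returnMutex) {
        reached = 1;
        return; /* returns from the nested function only */
    }
    /* Still runs: the outer function carries on after the block. */
    return reached + 1;
}
```

The outer function keeps going after the block and returns reached + 1, and the mutex is unlocked afterwards because the unlock call sits in the outer function, right after the call to the nested one.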

Conclusion

NEVER DO THIS!

The C test framework from part I could be considered vaguely sensible, especially if it convinced people to write unit tests in their C programs that they wouldn’t have before. But this is so full of issues, for example which keywords can’t be used within the block, that you’ll end up with something hard to debug in the future. However, this was a fun look into some silly tricks you can use to get some code defined after the macro to run in the middle of the macro, which will be a very useful idea for part III of this series.

Stay tuned!