Gwilym’s blog

Hopefully someone finds something useful here

The C preprocessor is awesome part III

This is the third part of a three-part series on why the C preprocessor is awesome, showing you some tricks you should never use. In this part, we’re finally going to implement the TRY, CATCH and FINALLY macros I’ve been promising you since the beginning.

In part 1 of this series, we implemented a very nice test harness that let us write unit tests much more simply. In part 2, we implemented a LOCK macro for making it ‘easier’ to use mutexes. This will build on the ideas from both parts, so I recommend reading them first if you haven’t already.

Introduction

As you are probably aware, C doesn’t really do error handling. How you handle errors varies wildly depending on the function you’re calling. For example, some functions return a non-zero value on error. Some return a zero value on error. Some don’t return anything useful and instead set errno.
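To make the three conventions concrete, here is a small sketch using standard library calls. The path is a made-up one that is assumed not to exist, and strtol stands in for the errno style (it does return a value, but you can’t detect the overflow from that value alone).

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns how many of the three calls below reported a failure. */
int countReportedErrors(void)
{
    /* Hypothetical path, assumed not to exist on this machine. */
    const char *missing = "/no/such/dir/no-such-file";
    int failures = 0;

    /* Convention 1: a non-zero return value means failure. */
    if (remove(missing) != 0)
        failures++;

    /* Convention 2: a zero (NULL) return value means failure. */
    FILE *f = fopen(missing, "r");
    if (f == NULL)
        failures++;
    else
        fclose(f);

    /* Convention 3: clear errno, make the call, then inspect errno. */
    errno = 0;
    (void)strtol("99999999999999999999999999", NULL, 10);
    if (errno == ERANGE)
        failures++;

    return failures;
}
```

Three calls, three different checks, and all of them are easy to forget.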

In this post we’re going to add yet another way to signal errors! Exactly what everyone needs.

This would have been much harder without some prior art which I started with. The initial implementation came from here but this didn’t handle all the cases I wanted. I wanted the ability to catch all exceptions as well, so I implemented a solution which allowed either CATCH_ALL or FINALLY or neither. However, I was never happy with this solution so in the end I now have a mostly original implementation using hacks as never before seen (or at least I’ve never seen them before).

Another limitation of the approach above is it didn’t really handle calling other functions that throw exceptions. And what if you wanted to nest your exceptions a few layers deep?

Aim

The end goal here is to have an ergonomic exception system, complete with ‘unwinding’ and with no need to check the status of every function call in a slightly different way.

In the end, we’d like to be able to write code such as:

TRY {
    doSomethingDangerous();
} CATCH(SOME_EXCEPTION) {
    // this will only run if doSomethingDangerous throws SOME_EXCEPTION
} FINALLY {
    // this code will always run
}

Exceptions can be thrown using the THROW macro.

We’d also like to be able to return within the TRY and CATCH blocks, while still having the content of the FINALLY block run. We’ll try to add support for that too.

Unfortunately, I wrote this code in 2017 (currently almost 2021), so I’m not entirely sure how I came up with some of these hacks. I am therefore documenting how it was done, and commenting on the implementation, rather than building this up from scratch.

I recommend loading up exceptions.h from the repo you can find here and referring to it along with the article.

The general approach

At its heart, the exception system provides a stack of longjmp locations to jump to. When we get to a TRY block, we push onto the stack of exception handling places, and when THROW gets called we longjmp back to the most recent TRY block. However, a few hacks were needed to get the ergonomics that we’re used to in other languages.
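Stripped of all the macro ergonomics, the idea can be sketched like this. This is not the code from exceptions.h: the names handlers, throwCode and runProtected are made up for illustration, and the thrown code is passed through a global rather than through longjmp’s return value to keep the setjmp call in a simple comparison.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_HANDLERS 16

static jmp_buf handlers[MAX_HANDLERS]; /* stack of places to jump back to */
static int handlerCount = 0;           /* number of active handlers */
static int thrownCode = 0;             /* code of the exception in flight */

/* Jump to the most recently pushed handler, popping it on the way. */
static void throwCode(int code)
{
    if (handlerCount == 0) {
        fprintf(stderr, "uncaught exception %d\n", code);
        abort();
    }
    thrownCode = code;
    handlerCount--;
    longjmp(handlers[handlerCount], 1);
}

/* The throw happens a couple of calls below the handler. */
static void deepest(int code) { if (code != 0) throwCode(code); }
static void middle(int code) { deepest(code); }

/* Returns 0 if nothing was thrown, otherwise the code that was thrown. */
int runProtected(int code)
{
    if (handlerCount == MAX_HANDLERS)
        abort();
    int slot = handlerCount++;

    /* setjmp must be called in the function whose frame stays alive. */
    if (setjmp(handlers[slot]) == 0) {
        middle(code);
        handlerCount--; /* normal completion: pop our handler */
        return 0;
    }
    return thrownCode; /* we arrive here via longjmp */
}
```

Calling runProtected(0) returns 0, and runProtected(7) returns 7 even though the throw happened two calls further down. Everything else in this post is about hiding this bookkeeping behind nice syntax.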

The TRY macro

Let’s have a closer look at the TRY block (this is the most complicated since it contains the code for all the other cases, but worth understanding how it works).

#define TRY                                                           \
    for (TryData__ tryData__ =                                        \
             {setjmp(exceptionStack__[try__(__FILE__, __LINE__)]), 0, \
              (void *)0, (void *)0};                                  \
         tryData__.runFourTimes <= 3; tryData__.runFourTimes++)       \
        if (tryData__.runFourTimes == 0)                              \
        {                                                             \
            __label__ continueLabel;                                  \
            tryData__.continueLabel = &&continueLabel;                \
        continueLabel:;                                               \
        }                                                             \
        else if (tryData__.runFourTimes == 3)                         \
        {                                                             \
            if (!catchHandled__())                                    \
            {                                                         \
                RETHROW;                                              \
            }                                                         \
            endTry__();                                               \
            if (tryData__.returnTo)                                   \
            {                                                         \
                goto *tryData__.returnTo;                             \
            }                                                         \
        }                                                             \
        else if (tryData__.runFourTimes == 1 && tryData__.tryAttempt == 0)

Above this macro we have defined an exceptionStack__ of jmp_bufs. This has some fixed size, and we keep track of the top of the stack separately. The try__ function returns the current index of the top (and prints an error and terminates if the stack size is exceeded). It also records the current file and line number so that we can generate nice stack traces (thought they were just for Java, did you?).
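For a feel of what a function like try__ has to do, here is a rough sketch. It is a guess at the shape, not the code from the repo: sketchTry, sketchSites and the stack size of 64 are all assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_STACK_SIZE 64 /* assumed size, not taken from the repo */

typedef struct {
    const char *file;
    int line;
} SketchSite;

static SketchSite sketchSites[SKETCH_STACK_SIZE];
static int sketchTop = 0;

/* Claims the next slot, remembers where the TRY was, returns the slot index. */
int sketchTry(const char *file, int line)
{
    if (sketchTop >= SKETCH_STACK_SIZE) {
        fprintf(stderr, "TRY nested too deeply at %s:%d\n", file, line);
        abort();
    }
    sketchSites[sketchTop].file = file;
    sketchSites[sketchTop].line = line;
    return sketchTop++;
}

/* Prints the active TRY sites, innermost first. */
void sketchPrintTrace(void)
{
    for (int i = sketchTop - 1; i >= 0; i--)
        fprintf(stderr, "  in TRY at %s:%d\n",
                sketchSites[i].file, sketchSites[i].line);
}
```

Because the return value is used directly as the index into the jmp_buf array, the push and the setjmp can share a single expression in the macro.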

setjmp then saves the current location into that slot of the stack. TryData__ is a simple struct defined as follows:

typedef struct
{
    int tryAttempt;
    int runFourTimes;
    void *returnTo;
    void *continueLabel;
} TryData__;

This is used because the initialiser of a for loop can only contain a single declaration, and these variables have different types. If I could, I’d define them as individual local variables. For the rest of this article, I’ll be referring to them as local variables, because that’s what they really are. Importantly, when we get longjmpd back to where we setjmpd, the struct is re-initialised to the 0 state. The exception is tryAttempt, which is set to 0 the first time, and to something non-zero by the longjmp in the case of an exception.
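The struct-in-the-initialiser trick works in any for loop, not just in macros. A minimal sketch, with made-up names (ScanState, sumPlusLargest):

```c
#include <stddef.h>

typedef struct {
    size_t index;
    int sum;
    int largest;
} ScanState;

/* Returns the sum of the values plus the largest value (0 if count is 0). */
int sumPlusLargest(const int *values, size_t count)
{
    int result = 0;

    /* One declaration, three "locals" of two different types. */
    for (ScanState s = {0, 0, 0}; s.index < count; s.index++) {
        s.sum += values[s.index];
        if (s.index == 0 || values[s.index] > s.largest)
            s.largest = values[s.index];
        result = s.sum + s.largest;
    }
    return result;
}
```

For the values 3, 9 and 4 this returns 25 (a sum of 16 plus a largest value of 9). All three fields disappear at the end of the loop, which is exactly the scoping we want from TRY.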

runFourTimes is incredibly well named. This is our loop counter and, surprisingly, it counts from 0 to 3 to cover all four cases (see the LOCK macro implementation in part 2 of this series for a better explanation of why this is needed).

There is some massive abuse of GCC’s local labels extension (along with labels as values, which is the &&label syntax). We’re going to need to be able to jump out of the current block later, so just keep in mind the runFourTimes == 0 case. Let’s skip the runFourTimes == 3 case for now.
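If you haven’t met these two extensions before, here is a tiny sketch of them in isolation. It needs GCC or Clang, and SKIP_PENALTY and useTwice are names invented for the example.

```c
/* Requires GCC or Clang: uses local labels and labels as values. */
#define SKIP_PENALTY(counter)                                              \
    do {                                                                   \
        __label__ resume;        /* label is private to this block */      \
        void *target = &&resume; /* take the address of the label */       \
        goto *target;            /* computed goto to a stored address */   \
        (counter) += 100;        /* never executed */                      \
    resume:                                                                \
        (counter) += 1;                                                    \
    } while (0)

int useTwice(void)
{
    int counter = 0;
    SKIP_PENALTY(counter);
    SKIP_PENALTY(counter); /* a second label called resume, same function */
    return counter;
}
```

useTwice returns 2. Without the __label__ declaration, the second expansion would define resume twice in one function and fail to compile, which is why macros like TRY and RETURN need local labels.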

The final case is when runFourTimes is 1 and tryAttempt is 0. Because tryAttempt is 0, we haven’t been longjmpd to, and because runFourTimes is 1, the initialisation code has run.

In this case, we finally execute the code inside the TRY block.
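The shape of the macro (a for loop, a chain of else ifs, and a dangling condition for the user’s block to attach to) is easier to see without the exception machinery. A stripped-down sketch, where STAGED and record are hypothetical names:

```c
#include <string.h>

static char stageLog[16];

/* Appends one character to the log so we can see the order of execution. */
static void record(char c)
{
    size_t n = strlen(stageLog);
    if (n + 1 < sizeof stageLog) {
        stageLog[n] = c;
        stageLog[n + 1] = '\0';
    }
}

#define STAGED                                         \
    for (int stage__ = 0; stage__ <= 2; stage__++)     \
        if (stage__ == 0) {                            \
            record('S'); /* set-up */                  \
        } else if (stage__ == 2) {                     \
            record('C'); /* clean-up */                \
        } else if (stage__ == 1)

const char *runStaged(void)
{
    stageLog[0] = '\0';
    STAGED {
        record('B'); /* the user's block runs on pass 1 */
    }
    return stageLog;
}
```

runStaged returns "SBC": the clean-up branch appears before the body in the macro, but the loop counter decides the order at run time. The braces after STAGED simply become the body of the last else if.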

The CATCH macro

#define CATCH(value)                                                                                    \
    else if (tryData__.runFourTimes == 1 && tryData__.tryAttempt == (value) && (catchHandled__() || 1))

This is reasonably simple. We check that runFourTimes is 1 (so just after the initialisation has run), that tryAttempt is equal to the exception value we care about and then a slightly confusing call to catchHandled__() || 1.

catchHandled__() is a call that informs the exception system that we’ve actually handled this exception. Since all the previous conditions passed, we want to call this state mutating function. However, we can’t put the call inside the body of the if statement, because then we wouldn’t be able to follow CATCH with braces the way we want to. So by relying on the short circuiting behaviour of &&, we only call catchHandled__() if we have actually got a matching CATCH statement. Since catchHandled__() can return both 1 and 0, and we would like to ensure that this condition is true regardless, we || it with 1 to guarantee that it is truthy.

So the content of this block only runs if the exception being thrown has the value we intend to catch, and catchHandled__() is only called when that is the case.
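Here is the same condition trick on its own, as a sketch with stand-in names (markHandled plays the role of catchHandled__, dispatch plays the role of the else if chain):

```c
static int handledCalls = 0;

/* Stand-in for a state-mutating call whose return value we don't care about. */
static int markHandled(void)
{
    handledCalls++;
    return 0;
}

/* Returns the code that matched a branch, or -1 if nothing matched. */
int dispatch(int thrown)
{
    if (thrown == 0) {
        return 0;
    } else if (thrown == 1 && (markHandled() || 1)) {
        return 1;
    } else if (thrown == 2 && (markHandled() || 1)) {
        return 2;
    }
    return -1;
}

int handledCallCount(void) { return handledCalls; }
```

dispatch(2) calls markHandled exactly once: the thrown == 1 test fails, so && never evaluates its right-hand side there. dispatch(3) doesn’t call it at all, which is what lets the caller detect an unhandled exception later.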

The CATCH_ALL macro

The CATCH_ALL macro is used in place of catch (Throwable e) in Java. It will catch any exception not previously caught in the chain (another reason to use else if here, only one catch statement will ever execute).

The macro is defined as follows:

#define CATCH_ALL(e)                                                    \
    else if (tryData__.runFourTimes == 1 && tryData__.tryAttempt > 0 && \
             (catchHandled__() || 1))                                   \
        for (volatile Exception e = {.type = tryData__.tryAttempt,      \
                                     .message = catchMessage__()};      \
             (e).type != -1; (e).type = -1)

This similarly contains some hacks to improve ergonomics and also compiler warnings. We would like to be able to store the exception itself in some local variable, but this variable needs the correct scope. It works very similarly to the CATCH macro, except that we also construct the exception object. Since we need a new variable declared, using a for loop which only executes once does this perfectly. We have something silly in the condition and increment parts of the loop to ensure it runs exactly once.
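The run-once for loop is a handy trick by itself whenever a macro needs to introduce a scoped variable in front of a user-supplied block. A sketch under made-up names (SketchException, WITH_EXCEPTION):

```c
#include <stddef.h>

typedef struct {
    int type;
    const char *message;
} SketchException;

/* Declares `name` for the following block, which then runs exactly once. */
#define WITH_EXCEPTION(name, typeValue, text)                             \
    for (SketchException name = {.type = (typeValue), .message = (text)}; \
         (name).type != -1; (name).type = -1)

int countBodyRuns(void)
{
    int runs = 0;
    WITH_EXCEPTION(e, 3, "disk full") {
        if (e.type == 3 && e.message != NULL)
            runs++;
    }
    /* A second variable called e is fine: the first one is out of scope. */
    WITH_EXCEPTION(e, 4, "timeout") {
        runs++;
    }
    return runs;
}
```

countBodyRuns returns 2. The catch is that -1 is used as the sentinel, so a type of -1 would skip the body entirely. In CATCH_ALL that can’t happen, because the macro has already checked that tryAttempt is greater than 0.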

The FINALLY macro

The FINALLY macro is by far the simplest.

#define FINALLY else if (tryData__.runFourTimes == 2)

runFourTimes == 2 has been reserved for this case. It runs after the TRY block and all of the CATCH blocks, which happen on run 1.

The RETURN macro and the return of runFourTimes == 3

Let’s have a look at the RETURN macro:

#define RETURN(x)                                                              \
    do {                                                                       \
        __label__ returnPoint;                                                 \
        __auto_type retValue = (x);                                            \
        tryData__.returnTo = &&returnPoint;                                    \
        goto *tryData__.continueLabel;                                         \
    returnPoint:                                                               \
        return retValue;                                                       \
    } while (0)

Isn’t it nice to be back in the wonderful world of matching brackets, terminated statements and the like? And also a do {} while(0) loop again?

This once again uses local labels, except this time I’ll actually explain how they’re used. Notice the __auto_type. This is equivalent to C++’s auto and to var in Java and C#. You can find __auto_type mentioned here.
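As a quick illustration of __auto_type on its own, here is a type-agnostic swap macro. SWAP and swapDemo are names invented for this sketch, and it needs GCC or Clang compiling as C.

```c
/* Requires GCC or Clang: __auto_type infers the type from the initialiser. */
#define SWAP(a, b)                \
    do {                          \
        __auto_type tmp__ = (a);  \
        (a) = (b);                \
        (b) = tmp__;              \
    } while (0)

double swapDemo(void)
{
    int x = 1, y = 2;
    double p = 0.5, q = 1.5;

    SWAP(x, y); /* tmp__ is an int here */
    SWAP(p, q); /* and a double here */

    return x * 10 + y + p; /* x == 2, y == 1, p == 1.5 */
}
```

swapDemo returns 22.5. RETURN uses __auto_type for the same reason: the macro has no idea what type x is, but it still needs somewhere to park the value.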

But why is the RETURN macro so complicated?

We would like the code in the FINALLY block to run after the expression x has been evaluated, but before we actually return. This is mainly useful for code like:

Resource someResource = createResource();
TRY {
    int value = doSomeCalculationWith(someResource);
    RETURN(value);
} FINALLY {
    cleanupResource(someResource);
}

And in this case, not only will the FINALLY block run if doSomeCalculationWith throws an exception, but also just before we return value.

Unfortunately, because C macros are limited to simple text substitution, we have to put the return statement where RETURN is.

So, in this case, we set returnTo to the point in the code where we actually return, and then jump to the continueLabel. But where is continueLabel?

Remember all the way back in the TRY macro?

if (tryData__.runFourTimes == 0)                              \
{                                                             \
    __label__ continueLabel;                                  \
    tryData__.continueLabel = &&continueLabel;                \
continueLabel:;                                               \
}                                                             \
else if (tryData__.runFourTimes == 3)                         \
{                                                             \
    if (!catchHandled__())                                    \
    {                                                         \
        RETHROW;                                              \
    }                                                         \
    endTry__();                                               \
    if (tryData__.returnTo)                                   \
    {                                                         \
        goto *tryData__.returnTo;                             \
    }                                                         \
} 

continueLabel takes us inside this if statement, which we then immediately fall out of and move on to the next iteration of the loop. So now runFourTimes will be 2 (running any FINALLY block) and then we get to the case where runFourTimes is 3.

catchHandled__() will return 0 if it hasn’t already been called and an exception has been thrown. In this case, we should bubble the exception up the stack. endTry__() will pop the top value off the stack without jumping to it, effectively announcing that we’re done in this TRY block. And then the magic of the local labels will jump us right back to the line above the return statement and we can actually return. If we don’t want to return, returnTo will still be NULL and no jump happens.
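To see the whole deferred return dance in one place, here is a cut-down sketch with the exception handling removed, so only the stages and the two stored labels are left. GUARDED, AFTERWARDS and RETURN_LATER are made-up names, it needs GCC or Clang, and like the real macro it assumes that retValue survives the jump out of its block and back in again.

```c
/* Requires GCC or Clang: local labels, labels as values, __auto_type. */
typedef struct {
    int stage;
    void *returnTo;
    void *continueLabel;
} SketchStages;

#define GUARDED                                                          \
    for (SketchStages st__ = {0, (void *)0, (void *)0}; st__.stage <= 3; \
         st__.stage++)                                                   \
        if (st__.stage == 0) {                                           \
            __label__ continueLabel;                                     \
            st__.continueLabel = &&continueLabel;                        \
        continueLabel:;                                                  \
        } else if (st__.stage == 3) {                                    \
            if (st__.returnTo)                                           \
                goto *st__.returnTo; /* finish the deferred return */    \
        } else if (st__.stage == 1)

#define AFTERWARDS else if (st__.stage == 2)

#define RETURN_LATER(x)                                                  \
    do {                                                                 \
        __label__ returnPoint;                                           \
        __auto_type retValue = (x);                                      \
        st__.returnTo = &&returnPoint;                                   \
        goto *st__.continueLabel; /* carry on with the loop */           \
    returnPoint:                                                         \
        return retValue;                                                 \
    } while (0)

static int cleanups = 0;

int computeThenCleanUp(int input)
{
    GUARDED {
        RETURN_LATER(input * 2);
    } AFTERWARDS {
        cleanups++; /* runs before the function actually returns */
    }
    return -1; /* not reached: RETURN_LATER always returns first */
}

int cleanupCount(void) { return cleanups; }
```

computeThenCleanUp(21) returns 42, and cleanupCount() is 1 afterwards. The order is: evaluate the expression, jump to continueLabel, let the loop run pass 2 (the AFTERWARDS block), then on pass 3 jump back to returnPoint and return.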

Conclusion

This is a bit of a whirlwind tour of this code I wrote just over 3 years ago. It was too good not to share, but I apologise that this probably isn’t the most coherent account of how it works.

If you have any questions or comments, feel free to contact me, and have a dig around the exception code and the enhanced test helper in the full repo.