C++17 constexpr everything (or as much as the compiler can)
Posted on December 27, 2017 by Paul
During the holidays I caught up on CppCon 2017. One of the talks that had been on my to-watch list for a few months was constexpr ALL the Things! by Ben Deane and Jason Turner. Please note that I wrote most of this article before actually watching the presentation.
The title of the presentation made me curious whether I could optimize an old piece of code that used a huge 2D array of coefficients as the initial condition for a long calculation. To avoid recalculating the big array of coefficients, I used to keep them in a file and simply load the data into memory every time the code was executed. The promise of constexpr was that I could avoid keeping two executables (the code that generated the coefficients and the code that did the actual work) and a data file. Replacing everything with a single binary was interesting and could potentially be faster.
In order to test the above, I devised a simpler model - fill an array with data generated at compile time.
Let’s start with a simple example that calculates the factorial of a number recursively (my old code used a recursive relation to generate the coefficients, so this was pretty close to what I needed):
If you compile the above code with GCC 7.2 or Clang 5, the compiler will replace the call to factorial(5) with 120 at compile time, as expected. You can see the complete assembly code generated by both compilers. Here is a snippet of the assembly code generated by GCC:
A slightly more complicated example is to calculate the Fibonacci numbers using a constexpr function:
GCC 7.2 will replace the call to fibonacci(10) with 55, as expected. It turns out that, for this particular usage, Clang 5 does not calculate the recursive Fibonacci function at compile time; you can see the complete generated assembly. Here is a snippet of the assembly code generated by Clang:
This is not good! My original code has a more complicated recursive relation than the above Fibonacci implementation.
You can force Clang to behave by saving the value first:
You can see the generated assembly for the above code here.
Now, let’s try with an iterative implementation:
For the iterative implementation, both compilers generate the value of the function at compile time, as expected; you can see the complete generated assembly here. The call to fibonacci(10) is replaced with 55.
Here is a snippet of the assembly generated by GCC:
and for Clang:
The next problem is how to store a bunch of generated values in an array. If you need a small number of values, a simple approach is to directly initialize the array, e.g.:
You can see the complete generated assembly here. The code works as expected with both compilers and the values are calculated at compile time. Please note that the above example also works if you replace std::array with a C-style array:
What about the case when you need to generate a large number of elements? A simple approach is to generate these values with a separate constexpr function that returns the filled array, for example:
You can see the complete generated assembly here. The code works as expected with both compilers and the values are calculated at compile time. Please note that large Fibonacci numbers can easily overflow the integer type used to store them.
In conclusion, at the time of this writing, you need to inspect the code generated by your compiler if you want to be sure that something is really calculated at compile time. Alternatively, you can use static_assert, which only accepts a constant expression, to check that a particular value can be calculated at compile time. Example:
If you are interested in learning more about modern C++, I would recommend reading A Tour of C++ by Bjarne Stroustrup or Effective Modern C++ by Scott Meyers.