2
votes

Attempting to implement a pleasing (simple, straightforward, no TMP, no macros, no unreadable convoluted code, no weird syntax when using it) compile-time hash via user-defined literals, I found that apparently GCC's understanding of what's a constant expression is grossly different from my understanding.

Since code and compiler output say more than a thousand words, without further ado:

#include <cstdio>

constexpr unsigned int operator"" _djb(const char* const str, unsigned int len)
{
    static_assert(__builtin_constant_p(str), "huh?");
    return len ? str[0] + (33 * ::operator"" _djb(str+1, len-1)) : 5381;
}

int main()
{
    printf("%u\n", "blah"_djb);
    return 0;
}

The code is pretty straightforward, not much to explain, and not much to ask about -- except it does not evaluate at compile-time. I tried using a pointer dereference instead of using an array index as well as having the recursion break at !*str, all to the same result.

The static_assert was added later when fishing in troubled waters for why the hash just wouldn't evaluate at compile-time when I firmly believed it should. Well, surprise, that only puzzled me more, but didn't clear up anything! The original code, without the static_assert, is well-accepted and compiles without warnings (gcc 4.7.2).

Compiler output :

[...]\main.cpp: In function 'constexpr unsigned int operator"" _djb(const char*, unsigned int)':
[...]\main.cpp:5:2: error: static assertion failed: huh?

My understanding is that a string literal is, well... a literal. In other words, a compile-time constant. Specifically, it is a compiletime-known sequence of constant characters starting at a constant address assigned by the compiler (and thus, known) terminated by '\0'. This logically implies that the literal's compiler-calculated length as supplied to operator"" is a constexpr as well.

Also, my understanding is that calling a constexpr function with only compile-time parameters makes it elegible as initializer for an enumeration or as template parameter, in other words it should result in evaluation at compile time.
Of course it is in principle always allowable for the compiler to evaluate a constexpr function at runtime, but being able to move the evaluation to compile-time is the entire point of having constexpr, after all.

Where is my fallacy, and is there a way of implementing a user-defined literal that can take a string literal so it actually evaluates at compile-time?

Possibly relevant similar questions:
Can a string literal be subscripted in a constant expression?
User defined literal arguments are not constexpr?
The first one seems to suggest that at least for char const (&str)[N] this works, and GCC accepts it, though I admittedly can't follow the conclusion.
The second one uses integer literals, not string literals, and finally addresses the issue by using template metaprogramming (which I don't want). So apparently the issue is not limited to string literals?

1
@R.MartinhoFernandes: Thank you for the std::size_t pointer. So if it works fine with 4.7.3 and 4.8, it seems to be a gcc 4.7.2 limitation. Indeed, if I attempt to use this for an enum, I get "expected identifier before 'STRING_USERDEF' token", so apparently my version of GCC doesn't choose not to evaluate at compile-time, but it simply can't. Add your findings as answer if you please, I'd accept it as "must upgrade compiler" (meh, compile gcc...).Damon
Ok, I added an answer with my findings and a bit of information about __builtin_constant_p as well. I'm deleting the comments now since they are now redundant.R. Martinho Fernandes
@Damon you can find an interesting use case for __builtin_constant_p here and some of the background on the feature.Shafik Yaghmour

1 Answers

3
votes

I don't have GCC 4.7.2 at hand to try, but your code without the static assertion (more on that later) compiles fine and executes the function at compile-time with both GCC 4.7.3 and GCC 4.8. I guess you will have to update your compiler.

The compiler is not always allowed to move the evaluation to runtime: some contexts, like template arguments, and static_assert, require evaluation at compile-time or an error if not possible. If you use your UDL in a static_assert you will force the compiler to evaluate it at compile-time if possible. In both my tests it does so.

Now, on to __builtin_constant_p(str). To start with, as documented, __builtin_constant_p can produce false negatives (i.e. it can return 0 for constant expressions sometimes).

str is not provably a constant expression because it is a function argument. You can force the compiler to evaluate the function at compile-time in some contexts, but that doesn't mean it can never evaluate it at runtime: some contexts never force compile-time evaluation (and in fact, in some of those contexts compile-time evaluation is just impossible). str can be a non-constant expression.

The static assertions are tested when the compiler sees the function, not once for each call the compiler sees. That makes the fact that you always call it in compile-time contexts irrelevant: only the body matters. Because str can sometimes be a non-constant expression, __builtin_constant_p(str) in that context cannot be true: it can produce false negatives, but it does not produce false positives.

To make it more clear: static_assert(__builtin_constant_p("blah"), "") will pass (well, in theory it could be fail, but I doubt the compiler would produce a false negative here), because "blah" is always a constant expression, but str is not the same expression as "blah".

For completeness, if the argument in question was of a numeric type (more on that later), and you did the test outside of a static assertion, you could get the test to return true if you passed a constant, and false if you passed a non-constant. In a static assertion, it always fails.

But! The docs for __builtin_constant_p reveal one interesting detail:

However, if you use it in an inlined function and pass an argument of the function as the argument to the built-in, GCC will never return 1 when you call the inline function with a string constant or compound literal (see Compound Literals) and will not return 1 when you pass a constant numeric value to the inline function unless you specify the -O option.

As you can see the built-in has a limitation makes the test always return false if the expression given is a string constant.