20
votes

Consider the following simple (to the extent that template questions ever are) example:

#include <iostream>

template <typename T>
struct identity;

template <>
struct identity<int> {
    using type = int;
};

template<typename T> void bar(T, T ) { std::cout << "a\n"; }
template<typename T> void bar(T, typename identity<T>::type) { std::cout << "b\n"; }

int main ()
{
    bar(0, 0);
}

Both clang and gcc print "a" there. According to the rules in [temp.deduct.partial] and [temp.func.order], to determine partial ordering, we need to synthesize some unique types. So we have two attempts at deduction:

+---+-------------------------------+-------------------------------------------+
|   | Parameters                    | Arguments                                 |
+---+-------------------------------+-------------------------------------------+
| a | T, typename identity<T>::type | UniqueA, UniqueA                          |
| b | T, T                          | UniqueB, typename identity<UniqueB>::type |
+---+-------------------------------+-------------------------------------------+

For deduction on "b", according to Richard Corden's answer, the expression typename identity<UniqueB>::type is treated as a type and is not evaluated. That is, this will be synthesized as if it were:

+---+-------------------------------+--------------------+
|   | Parameters                    | Arguments          |
+---+-------------------------------+--------------------+
| a | T, typename identity<T>::type | UniqueA, UniqueA   |
| b | T, T                          | UniqueB, UniqueB_2 |
+---+-------------------------------+--------------------+

It's clear that deduction on "b" fails. Those are two different types so you cannot deduce T to both of them.

However, it seems to me that the deduction on A should fail. For the first argument, you'd match T == UniqueA. The second argument is a non-deduced context - so wouldn't that deduction succeed iff UniqueA were convertible to identity<UniqueA>::type? The latter is a substitution failure, so I don't see how this deduction could succeed either.

How and why do gcc and clang prefer the "a" overload in this scenario?

2
stackoverflow.com/a/1182688/4326278 contains relevant info, I think.bogdan
@bogdan That makes sense from the perspective of just treatying typename identity<UniqueB>::type as UniqueB_2. But why would the deduction on "a" succeed? I reworded the question somewhat.Barry
As far as I can tell, that applies not only to generating the unique arguments, but also to adjusting the parameters: T, typename identity<T>::type becomes T, U, so deduction succeeds. The algorithm, however, clearly does a few more things that aren't specified in the standard either. In this case, it 'remembers' that T is actually used within the construct that U replaced so that, at the end of the deduction process, if T hasn't been deduced from somewhere else, deduction still fails overall (there's a hint at this in [14.8.2.4p11]).bogdan
That's why deduction fails in the example from the other question, using only one parameter (bar(T) vs. bar(typename identity<T>::type)). As I said, the whole thing is poorly specified. Issue 1391 will likely change the treatment of non-deduced contexts and make things better specified, although still far from perfect.bogdan
On a side note, another case that's not specified is why template<typename T> void f(T, T) is more specialized than template<typename T, typename U> void f(T, U). Nowhere does the standard say that the deductions for different pairs of types need to be consistent during partial ordering; the algorithm as specified only deals with the pairs of types in isolation. My version: for the parameters for which the function call has arguments, deduction must succeed for a synthesized function call using the invented args from A against a synthesized template declaration using the params from P.bogdan

2 Answers

23
votes

As discussed in the comments, I believe there are several aspects of the function template partial ordering algorithm that are unclear or not specified at all in the standard, and this shows in your example.

To make things even more interesting, MSVC (I tested 12 and 14) rejects the call as ambiguous. I don't think there's anything in the standard to conclusively prove which compiler is right, but I think I might have a clue about where the difference comes from; there's a note about that below.

Your question (and this one) challenged me to do some more investigation into how things work. I decided to write this answer not because I consider it authoritative, but rather to organize the information I have found in one place (it wouldn't fit in comments). I hope it will be useful.


First, the proposed resolution for issue 1391. We discussed it extensively in comments and chat. I think that, while it does provide some clarification, it also introduces some issues. It changes [14.8.2.4p4] to (new text in bold):

Each type nominated above from the parameter template and the corresponding type from the argument template are used as the types of P and A. If a particular P contains no template-parameters that participate in template argument deduction, that P is not used to determine the ordering.

Not a good idea in my opinion, for several reasons:

  • If P is non-dependent, it doesn't contain any template parameters at all, so it doesn't contain any that participate in argument deduction either, which would make the bold statement apply to it. However, that would make template<class T> f(T, int) and template<class T, class U> f(T, U) unordered, which doesn't make sense. This is arguably a matter of interpretation of the wording, but it could cause confusion.
  • It messes with the notion of used to determine the ordering, which affects [14.8.2.4p11]. This makes template<class T> void f(T) and template<class T> void f(typename A<T>::a) unordered (deduction succeeds from first to second, because T is not used in a type used for partial ordering according to the new rule, so it can remain without a value). Currently, all compilers I've tested report the second as more specialized.
  • It would make #2 more specialized than #1 in the following example:

    #include <iostream>
    
    template<class T> struct A { using a = T; };
    
    struct D { };
    template<class T> struct B { B() = default; B(D) { } };
    template<class T> struct C { C() = default; C(D) { } };
    
    template<class T> void f(T, B<T>) { std::cout << "#1\n"; } // #1
    template<class T> void f(T, C<typename A<T>::a>) { std::cout << "#2\n"; } // #2
    
    int main()
    {
       f<int>(1, D());
    }
    

    (#2's second parameter is not used for partial ordering, so deduction succeeds from #1 to #2 but not the other way around). Currently, the call is ambiguous, and should arguably remain so.


After looking at Clang's implementation of the partial ordering algorithm, here's how I think the standard text could be changed to reflect what actually happens.

Leave [p4] as it is and add the following between [p8] and [p9]:

For a P / A pair:

  • If P is non-dependent, deduction is considered successful if and only if P and A are the same type.
  • Substitution of deduced template parameters into the non-deduced contexts appearing in P is not performed and does not affect the outcome of the deduction process.
  • If template argument values are successfully deduced for all template parameters of P except the ones that appear only in non-deduced contexts, then deduction is considered successful (even if some parameters used in P remain without a value at the end of the deduction process for that particular P / A pair).

Notes:

  • About the second bullet point: [14.8.2.5p1] talks about finding template argument values that will make P, after substitution of the deduced values (call it the deduced A), compatible with A. This could cause confusion about what actually happens during partial ordering; there's no substitution going on.
  • MSVC doesn't seem to implement the third bullet point in some cases. See the next section for details.
  • The second and third bullet points are intented to also cover cases where P has forms like A<T, typename U::b>, which aren't covered by the wording in issue 1391.

Change the current [p10] to:

Function template F is at least as specialized as function template G if and only if:

  • for each pair of types used to determine the ordering, the type from F is at least as specialized as the type from G, and,
  • when performing deduction using the transformed F as the argument template and G as the parameter template, after deduction is done for all pairs of types, all template parameters used in the types from G that are used to determine the ordering have values, and those values are consistent across all pairs of types.

F is more specialized than G if F is at least as specialized as G and G is not at least as specialized as F.

Make the entire current [p11] a note.

(The note added by the resolution of 1391 to [14.8.2.5p4] needs to be adjusted as well - it's fine for [14.8.2.1], but not for [14.8.2.4].)


For MSVC, in some cases, it looks like all template parameters in P need to receive values during deduction for that specific P / A pair in order for deduction to succeed from A to P. I think this could be what causes implementation divergence in your example and others, but I've seen at least one case where the above doesn't seem to apply, so I'm not sure what to believe.

Another example where the statement above does seem to apply: changing template<typename T> void bar(T, T) to template<typename T, typename U> void bar(T, U) in your example swaps results around: the call is ambiguous in Clang and GCC, but resolves to b in MSVC.

One example where it doesn't:

#include <iostream>

template<class T> struct A { using a = T; };
template<class, class> struct B { };

template<class T, class U> void f(B<U, T>) { std::cout << "#1\n"; }
template<class T, class U> void f(B<U, typename A<T>::a>) { std::cout << "#2\n"; }

int main()
{
   f<int>(B<int, int>());
}

This selects #2 in Clang and GCC, as expected, but MSVC rejects the call as ambiguous; no idea why.


The partial ordering algorithm as described in the standard speaks of synthesizing a unique type, value, or class template in order to generate the arguments. Clang manages that by... not synthesizing anything. It just uses the original forms of the dependent types (as declared) and matches them both ways. This makes sense, as substituting the synthesized types doesn't add any new information. It can't change the forms of the A types, since there's generally no way to tell what concrete types the substituted forms could resolve to. The synthesized types are unknown, which makes them pretty similar to template parameters.

When encountering a P that is a non-deduced context, Clang's template argument deduction algorithm simply skips it, by returning "success" for that particular step. This happens not only during partial ordering, but for all types of deductions, and not just at the top level in a function parameter list, but recursively whenever a non-deduced context is encountered in the form of a compound type. For some reason, I found that surprising the first time I saw it. Thinking about it, it does, of course, make sense, and is according to the standard ([...] does not participate in type deduction [...] in [14.8.2.5p4]).

This is consistent with Richard Corden's comments to his answer, but I had to actually see the compiler code to understand all the implications (not a fault of his answer, but rather of my own - programmer thinking in code and all that).

I've included some more information about Clang's implementation in this answer.

5
votes

I believe the key is with the following statement:

The second argument is a non-deduced context - so wouldn't that deduction succeed iff UniqueA were convertible to identity::type?

Type deduction does not perform checking of "conversions". Those checks take place using the real explicit and deduced arguments as part of overload resolution.

This is my summary of the steps that are taken to select the function template to call (all references taken from N3937, ~ C++ '14):

  1. Explicit arguments are replaced and the resulting function type checked that it is valid. (14.8.2/2)
  2. Type deduction is performed and the resulting deduced arguments are replaced. Again the resulting type must be valid. (14.8.2/5)
  3. The function templates that succeeded in Steps 1 and 2 are specialized and included in the overload set for overload resolution. (14.8.3/1)
  4. Conversion sequences are compared by overload resolution. (13.3.3)
  5. If the conversion sequences of two function specializations are not 'better' the partial ordering algorithm is used to find the more specialized function template. (13.3.3)
  6. The partial ordering algorithm checks only that type deduction succeeds. (14.5.6.2/2)

The compiler already knows by step 4 that both specializations can be called when the real arguments are used. Steps 5 and 6 are being used to determine which of the functions is more specialized.