The first wave of bugs relating to templates served as a great way for me to expand my understanding of Clang’s template implementation. They struck directly at two core concepts in C++ templates: template argument resolution and template instantiation.

C++ Template Type Argument Resolution

I am not a C++ expert. Reading the C++ standard’s Templates chapter makes my head hurt. The most important thing that I’ve gleaned about template argument resolution is that types are uniqued (and de-sugared), but there is no promotion applied to type arguments. To illustrate this, take the example function below:

template<typename T>
T add(T L, T R) {
  return L + R;
}

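Given that function, add<int16_t>, add<int32_t>, and add<int64_t> are all distinct functions. However, because type arguments are de-sugared, on a platform where int32_t is a typedef for int, add<int32_t> and add<int> are the same function. Note that add<int> and add<long> remain distinct even when both types are 32 bits wide, because int and long are different types.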
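To make the uniquing rule concrete, here is a minimal C++ sketch. The alias MyInt and the helpers sameSpecialization and uniquingDemo are hypothetical names introduced only for illustration; comparing function addresses is just one way to observe that two spellings name the same specialization.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

template <typename T>
T add(T L, T R) {
  return L + R;
}

// Hypothetical alias: sugar for int, not a new type.
using MyInt = int;

// The alias is de-sugared, so MyInt and int are one and the same type.
static_assert(std::is_same<MyInt, int>::value, "MyInt is just int");

// Differently sized fixed-width types are distinct types, so they
// produce distinct specializations of add.
static_assert(!std::is_same<std::int16_t, std::int32_t>::value, "distinct");
static_assert(!std::is_same<std::int32_t, std::int64_t>::value, "distinct");

// Returns true when add<MyInt> and add<int> are the same function.
bool sameSpecialization() {
  int (*ViaAlias)(int, int) = &add<MyInt>;
  int (*ViaInt)(int, int) = &add<int>;
  return ViaAlias == ViaInt;
}

void uniquingDemo() {
  assert(sameSpecialization());
  assert(add<std::int16_t>(1, 2) == 3);
}
```

Calling uniquingDemo() should pass silently: the alias and the underlying type resolve to a single instantiation.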

The absence of promotion applying to arguments also has a side effect that might not be expected by all users. For example, the following code fails to compile:

auto X = add(1.0f, 2.0);

This code fails because the argument types don’t match. Since no promotion is applied during deduction, T is deduced as float from the first argument and as double from the second. The template declaration requires both parameters to have the same type, so deduction fails and the template can’t be instantiated.
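Two common ways around the failure in plain C++ are sketched below. The helper names addExplicit, addCast, and deductionDemo are hypothetical; the point is only that the conflict disappears once both parameters agree on a single T.

```cpp
#include <cassert>

template <typename T>
T add(T L, T R) {
  return L + R;
}

// auto X = add(1.0f, 2.0);  // error: T would be both float and double

// Option 1: name T explicitly. No deduction happens, so the float
// argument is converted to double like any ordinary function argument.
double addExplicit() {
  return add<double>(1.0f, 2.0);
}

// Option 2: make the argument types agree before deduction runs.
float addCast() {
  return add(1.0f, static_cast<float>(2.0));
}

void deductionDemo() {
  assert(addExplicit() == 3.0);
  assert(addCast() == 3.0f);
}
```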

HLSL literal types

HLSL inherits these rules in HLSL 2021, with one significant caveat. In DXC, literal values that don’t have an explicit type-specifying suffix (e.g. 1.0 rather than 1.0f, or 2 rather than 2u) are assigned a special built-in type, literal float or literal int as appropriate.

In HLSL, integer and floating-point literals are defined to use high-precision values for constant calculations, but have a lower priority for promotion.

Take the following code sample:

float Fn1(float V) {
  return (1.23 * M_PI) * V;
}

Following C rules, the constant expression 1.23 * M_PI can be evaluated at compile time, but its result is of type double. This causes V to be promoted to double for the multiplication, and the result is then demoted to float for the return.
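The C-style promotion can be checked directly in C++. This is a small sketch, not DXC code: kPi is a hypothetical stand-in for the pi macro (which is not part of standard C++), and Fn1C and promotionDemo are illustrative names.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the pi constant used in the sample above.
constexpr double kPi = 3.14159265358979323846;

// Under C/C++ rules the float operand is promoted, so the whole
// multiplication is carried out in double.
static_assert(
    std::is_same<decltype((1.23 * kPi) * std::declval<float>()),
                 double>::value,
    "the product has type double");

float Fn1C(float V) {
  // The double result is demoted back to float only at the return.
  return static_cast<float>((1.23 * kPi) * V);
}

void promotionDemo() {
  assert(Fn1C(0.0f) == 0.0f);
  assert(Fn1C(1.0f) > 3.864f && Fn1C(1.0f) < 3.865f);
}
```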

In HLSL, 1.23 * M_PI evaluates as a literal float, which is double-precision, but rather than promoting V, the result is demoted to float to match V. This difference in promotion rules keeps HLSL literals from causing unintended promotions.
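A rough C++ analogue of the HLSL behavior is sketched below. It assumes the literal expression is folded in double precision and demoted to float once, before it meets V; kPi, kScale, Fn1HlslLike, and literalDemo are hypothetical names, and this models the observable result rather than how DXC implements it.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the pi constant used in the sample above.
constexpr double kPi = 3.14159265358979323846;

// Assumption: the constant is computed at high precision, then demoted
// to float a single time to match the type of V.
constexpr float kScale = static_cast<float>(1.23 * kPi);

// With a float constant, the multiplication never leaves float.
static_assert(
    std::is_same<decltype(kScale * std::declval<float>()), float>::value,
    "the product has type float");

float Fn1HlslLike(float V) {
  return kScale * V;
}

void literalDemo() {
  assert(Fn1HlslLike(0.0f) == 0.0f);
  assert(Fn1HlslLike(1.0f) > 3.864f && Fn1HlslLike(1.0f) < 3.865f);
}
```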

HLSL literal types are an implementation detail of DXC, to enforce HLSL’s promotion rules. They aren’t strictly required by the language. They aren’t user specifiable and they can’t be used as a storage type.

Templates with literal types

Because literal types aren’t valid storage types, template resolution for literal arguments gets mucky. For one thing, since a user can’t write a function that takes a literal type parameter, literal types aren’t handled by C++ argument-dependent lookup rules (DXC has special handling for resolving literal builtins).

This surfaces in templates because add<float>, add<double>, and add<literal float> are all expected to be different template specializations, since the three type arguments are distinct types.

When users initially attempted to use templates while deducing types from literal arguments, they reported issue 3973.

The solution that I implemented for this change… makes me sad. Knowing all the things I know now, I’d do it differently (and may get the chance).

What I did was promote literal types to their corresponding 32-bit types, but the promotion occurs after argument matching.

The idea is to treat literal types as different from the normal built-in types so that if there is an ambiguity in a match, compilation fails. But if a template resolves an argument to a literal type successfully, we substitute the appropriate 32-bit type.
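The match-then-substitute idea can be modeled in ordinary C++. This is only a sketch under stated assumptions, not DXC’s implementation: LiteralFloat and LiteralInt are hypothetical tag types standing in for the literal types, and PromoteLiteral, resolveAdd, CanResolve, and matchDemo are names invented for the illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical tag types standing in for literal float / literal int.
struct LiteralFloat {};
struct LiteralInt {};

// Second step: once matching has succeeded, map a literal type to its
// 32-bit counterpart. All other types pass through unchanged.
template <typename T> struct PromoteLiteral { using type = T; };
template <> struct PromoteLiteral<LiteralFloat> { using type = float; };
template <> struct PromoteLiteral<LiteralInt> { using type = int; };

// First step: matching treats literal types as ordinary distinct types,
// so both arguments must deduce the same T. Declaration only; it is
// used in unevaluated contexts below.
template <typename T>
typename PromoteLiteral<T>::type resolveAdd(T, T);

// Detects whether resolveAdd(A, B) deduces successfully.
template <typename A, typename B>
struct CanResolve {
  template <typename X, typename Y>
  static auto test(int)
      -> decltype((void)resolveAdd(std::declval<X>(), std::declval<Y>()),
                  std::true_type{});
  template <typename, typename>
  static std::false_type test(...);
  static constexpr bool value = decltype(test<A, B>(0))::value;
};

// Two literal arguments match, and the result uses the 32-bit type.
static_assert(CanResolve<LiteralFloat, LiteralFloat>::value, "matches");
static_assert(
    std::is_same<decltype(resolveAdd(LiteralFloat{}, LiteralFloat{})),
                 float>::value,
    "literal float is substituted with float");
static_assert(
    std::is_same<decltype(resolveAdd(LiteralInt{}, LiteralInt{})),
                 int>::value,
    "literal int is substituted with int");

// Mixing a literal with a real type is a mismatch, so matching fails.
static_assert(!CanResolve<LiteralFloat, float>::value, "mismatch");

void matchDemo() {
  assert((CanResolve<LiteralInt, LiteralInt>::value));
  assert((!CanResolve<LiteralInt, int>::value));
}
```

Because the substitution happens only after matching, a mixed call such as a literal paired with a float still fails, which is the behavior described above.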

The one big thing I should have done differently in this change is that literal floating-point types should promote to double, not float. This would more closely align with C++.
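For reference, the C++ behavior that such a change would mirror can be checked with a short sketch. The helper name cxxLiteralDemo is hypothetical; everything else is standard C++.

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
T add(T L, T R) {
  return L + R;
}

// In C++ an unsuffixed floating-point literal has type double, so
// deduction from two such literals picks add<double>.
static_assert(std::is_same<decltype(add(1.0, 2.0)), double>::value,
              "unsuffixed literals deduce double");

// Only the f suffix selects float.
static_assert(std::is_same<decltype(add(1.0f, 2.0f)), float>::value,
              "suffixed literals deduce float");

void cxxLiteralDemo() {
  assert(add(1.0, 2.0) == 3.0);
  assert(add(1.0f, 2.0f) == 3.0f);
}
```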

The downside to that is it would force us to promote to double in places where users might not expect it. Figuring out the best long-term solution to this problem is likely going to involve more changes to HLSL’s type system, but for now I’ll leave it as it is and revisit this when I have more time.