The first wave of bugs relating to templates served as a great way for me to expand my understanding of Clang’s template implementation. They struck directly at two core concepts in C++ templates: template argument resolution and template instantiation.
C++ Template Type Argument Resolution
I am not a C++ expert. Reading the C++ standard’s Templates chapter makes my head hurt. The most important thing that I’ve gleaned about template argument resolution is that types are uniqued (and de-sugared), but there is no promotion applied to type arguments. To illustrate this, take the example function below:
template<typename T>
T add(T L, T R) {
    return L + R;
}
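Given that function, add<int16_t>, add<int32_t>, and add<int64_t> are all distinct functions. However, because typedefs are de-sugared, if your platform defines int32_t as a typedef of int, then add<int> and add<int32_t> are the same function. Note that int and long are distinct types in C++ even when both are 32-bit integers, so add<int> and add<long> remain different functions.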
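The uniquing rule can be checked directly in standard C++. The sketch below is only an illustration: MyInt and sameSpecialization are hypothetical names introduced here, and the comparison relies on the fact that add<T> has the function type T(T, T).

```cpp
#include <cstdint>
#include <type_traits>

template <typename T>
T add(T L, T R) {
  return L + R;
}

// Hypothetical alias, introduced only for this illustration.
typedef int MyInt;

// add<T> has the function type T(T, T), so two specializations have the same
// type exactly when their de-sugared template arguments are the same type.
template <typename A, typename B>
constexpr bool sameSpecialization() {
  return std::is_same<decltype(&add<A>), decltype(&add<B>)>::value;
}

// An alias is resolved to its underlying type, so it names the same function.
static_assert(sameSpecialization<MyInt, int>(), "alias names the same type");
// Types of different widths always give different specializations.
static_assert(!sameSpecialization<std::int16_t, std::int32_t>(), "distinct types");
// int and long are distinct types even on platforms where both are 32-bit.
static_assert(!sameSpecialization<int, long>(), "distinct types");
```

The last assertion holds regardless of how wide int and long are on the target, because the type identity, not the width, is what selects a specialization.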
The absence of promotion applying to arguments also has a side effect that might not be expected by all users. For example, the following code fails to compile:
auto X = add(1.0f, 2.0);
This code fails because the argument types don’t match. Since no argument promotion is performed during deduction, this tries to resolve a function that takes a float for the first parameter and a double for the second. Since the template declaration requires the first and second parameters to have the same type, T can’t be deduced and the template can’t be instantiated.
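For reference, here are two common ways to make a call like this compile in standard C++. This is a minimal sketch: the constexpr on add and the variable names Explicit and Casted are additions for illustration only.

```cpp
#include <type_traits>

// constexpr is added here only so the results can be checked at compile time.
template <typename T>
constexpr T add(T L, T R) {
  return L + R;
}

// auto X = add(1.0f, 2.0);  // error: conflicting deductions for T (float vs. double)

// Option 1: name T explicitly. No deduction happens, so the float argument
// is converted to double like any ordinary function argument.
constexpr auto Explicit = add<double>(1.0f, 2.0);

// Option 2: make both arguments the same type before the call.
constexpr auto Casted = add(1.0f, static_cast<float>(2.0));

static_assert(std::is_same<decltype(Explicit), const double>::value, "T = double");
static_assert(std::is_same<decltype(Casted), const float>::value, "T = float");
```

In both cases the conversion is something the caller asks for explicitly; deduction itself never picks a common type.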
HLSL literal types
HLSL inherits these rules in HLSL 2021, with one significant caveat. In DXC, literal values that don’t have an explicit type-specifying suffix (e.g. 1.0 rather than 1.0f, or 2 rather than 2u) are assigned a special built-in type, literal float or literal int, as appropriate.
In HLSL, integer and floating-point literals are defined to use high-precision values for constant calculations, but have a lower priority for promotion.
Take the following code sample:
float Fn1(float V) {
    return (1.23 * M_PI) * V;
}
Following C rules, the constant expression 1.23 * M_PI can be evaluated at compile time, but its result is of type double. This causes V to be promoted to double for the multiplication, and the result is then demoted to float for the return.
In HLSL, 1.23 * M_PI evaluates as a literal float. The constant is still computed at double precision, but instead of promoting V, the result is demoted to float to match V. This difference in promotion rules keeps HLSL literals from causing unintended promotions.
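The difference between the two sets of rules can be made visible in plain C++ by inspecting expression types. This is only an emulation of the behavior described above, not DXC’s implementation; Pi, Scale, and Fn1HlslStyle are names assumed for this sketch, with Pi standing in for M_PI.

```cpp
#include <type_traits>

// Stand-in for M_PI, which is not part of standard C++.
constexpr double Pi = 3.14159265358979323846;

// C/C++ rules: the double constant promotes the float operand, so the
// multiplication itself is carried out in double.
static_assert(std::is_same<decltype((1.23 * Pi) * 1.0f), double>::value,
              "float operand is promoted to double");

// Emulating the HLSL behavior by hand: fold the constant at double precision,
// demote it to float once, then multiply in float.
constexpr float Scale = static_cast<float>(1.23 * Pi);
static_assert(std::is_same<decltype(Scale * 1.0f), float>::value,
              "multiplication stays in float");

float Fn1HlslStyle(float V) {
  return Scale * V;
}
```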
HLSL literal types are an implementation detail of DXC, used to enforce HLSL’s promotion rules. They aren’t strictly required by the language, they aren’t user-specifiable, and they can’t be used as a storage type.
Templates with literal types
Because literal types aren’t valid storage types, template resolution for literal arguments gets mucky. For one thing, since a user can’t write a function that takes a literal type parameter, literal types aren’t handled by C++ argument-dependent lookup rules (DXC has special handling for resolving literal builtins).
This surfaces in templates because add<float>, add<double>, and add<literal float> are all expected to be different template specializations, since their template arguments are different types.
When users initially attempted to use templates while deducing types from literal arguments, they reported issue 3973.
The solution that I implemented for this change… makes me sad. Knowing all the things I know now, I’d do it differently (and may get the chance).
What I did was promote literal types to their corresponding 32-bit types, but the promotion occurs after argument matching.
The idea is to treat literal types as different from the normal built-in types so that if there is an ambiguity in a match, compilation fails. But if a template resolves an argument to a literal type successfully, we substitute the appropriate 32-bit type.
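The match-then-substitute idea can be modeled in ordinary C++. To be clear, this is a hypothetical model and not DXC’s code: LiteralFloat, LiteralInt, PromoteLiteral, and resolveAdd are all invented here, with empty structs standing in for the compiler-internal literal types.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the compiler-internal literal types.
struct LiteralFloat {};
struct LiteralInt {};

// Applied after matching: literal types map to a 32-bit type, and every
// other type is left unchanged.
template <typename T> struct PromoteLiteral { typedef T type; };
template <> struct PromoteLiteral<LiteralFloat> { typedef float type; };
template <> struct PromoteLiteral<LiteralInt> { typedef int type; };

// Declaration only; it is used inside decltype to observe what T matches
// and which type is substituted afterwards.
template <typename T>
typename PromoteLiteral<T>::type resolveAdd(T L, T R);

// Both arguments match as the same literal type, then the 32-bit type is used.
static_assert(std::is_same<decltype(resolveAdd(LiteralFloat(), LiteralFloat())),
                           float>::value, "literal float becomes float");
static_assert(std::is_same<decltype(resolveAdd(LiteralInt(), LiteralInt())),
                           int>::value, "literal int becomes int");

// Non-literal types pass through untouched.
static_assert(std::is_same<decltype(resolveAdd(1.0, 2.0)), double>::value,
              "double stays double");

// resolveAdd(LiteralFloat(), 1.0f);  // error in this model: T cannot be both
```

In this model, mixing a literal with a float fails at the matching step precisely because the literal type is kept distinct until after matching.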
The one big thing I should have done differently in this change is that literal floating-point types should promote to double, not float. This would more closely align with C++.
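For comparison, this is what standard C++ deduces for unsuffixed literals; the add declaration below mirrors the earlier example and is never called at run time.

```cpp
#include <type_traits>

// Declaration only; it is used inside decltype and never called.
template <typename T>
T add(T L, T R);

// In C++ an unsuffixed floating-point literal is a double.
static_assert(std::is_same<decltype(add(1.0, 2.0)), double>::value,
              "unsuffixed floating-point literals deduce double");
// The f suffix is what selects float.
static_assert(std::is_same<decltype(add(1.0f, 2.0f)), float>::value,
              "suffixed literals deduce float");
// An unsuffixed integer literal that fits in int is an int.
static_assert(std::is_same<decltype(add(1, 2)), int>::value,
              "unsuffixed integer literals deduce int");
```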
The downside of promoting to double is that it would happen in places where users might not expect it. Figuring out the best long-term solution to this problem is likely going to involve more changes to HLSL’s type system, but for now I’ll leave it as it is and revisit this when I have more time.