Given the following shader:

struct Contents {
  int X;
};

RWStructuredBuffer<Contents> Values;

[numthreads(4, 1, 1)]
void main(uint3 TID : SV_DispatchThreadID) {
  uint Sum = 0;
  switch (Values[TID.x].X) {
    case 0:
      Sum += WaveActiveSum(1);
      // No break: intentionally falls through into default.
    default:
      Sum += WaveActiveSum(10);
      break;
  }
  Values[TID.x].X = Sum;
}

If the buffer that Values refers to is initialized to [ 0, 0, 1, 2 ], what does the buffer contain when the shader completes?

A. [ 42, 42, 40, 40 ]

B. [ 22, 22, 20, 20 ]

C. [ 22, 22, 10, 10 ]

D. Undefined!

Answer!

Trick question!

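On DirectX, this is intended to be well-defined and produce A, but the specification language is unclear. Assuming all four threads run in a single wave, the two lanes that read 0 execute WaveActiveSum(1) together and get 2; all four lanes then reconverge at the default label, where WaveActiveSum(10) returns 40. The documentation states: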

These intrinsics are dependent on active lanes and therefore flow control. In the model of this document, implementations must enforce that the number of active lanes exactly corresponds to the programmer’s view of flow control.

Driver bugs mean this does not always hold, because the behavior was not rigorously tested in the HLK tests.

In SPIR-V, the convergence behavior of the OpSwitch instruction for fall-through cases is undefined, so this code is undefined if it lowers to OpSwitch.

The HLSL team is tracking bugs on both DXC and Clang to avoid the use of OpSwitch:

The Slang compiler is tracking this issue as well.
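
Until those fixes land, one way to sidestep the fall-through at the source level is to express the same control flow with a structured if. The sketch below is a hand-written rewrite of the first shader, not the fix the compilers are pursuing. The comments trace the values for the [ 0, 0, 1, 2 ] input, assuming all four threads share one wave and that lanes reconverge after the if.

```hlsl
struct Contents {
  int X;
};

RWStructuredBuffer<Contents> Values;

[numthreads(4, 1, 1)]
void main(uint3 TID : SV_DispatchThreadID) {
  uint Sum = 0;
  // Only lanes that read 0 are active here: lanes 0 and 1 for the
  // example input, so WaveActiveSum(1) is expected to return 2.
  if (Values[TID.x].X == 0)
    Sum += WaveActiveSum(1);
  // Assumption: all four lanes are active again after the if, so
  // WaveActiveSum(10) is expected to return 40.
  Sum += WaveActiveSum(10);
  Values[TID.x].X = Sum;
}
```

Under those assumptions this yields [ 42, 42, 40, 40 ], matching answer A, without depending on how a switch fall-through converges.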

A second example, which is even more problematic, looks like this:

struct Contents {
  int X;
};

RWStructuredBuffer<Contents> Values;

groupshared int Reduction;

[numthreads(4, 1, 1)]
void main(uint3 TID : SV_DispatchThreadID) {
  if (WaveIsFirstLane())
    Reduction = 0;

  switch (Values[TID.x].X) {
    case 0:
      Reduction += WaveActiveSum(1);
      // No break: intentionally falls through into default.
    default:
      Reduction += WaveActiveSum(Reduction);
      GroupMemoryBarrierWithGroupSync();
      break;
  }
  Values[TID.x].X = Reduction;
}

In this case, under SPIR-V, control flow is not guaranteed to be uniform even though all threads enter the default label. This means that the group barrier’s behavior is undefined, and the shader may deadlock or terminate unexpectedly.