[PATCH] D18162: AMDGPU: Add SIWholeQuadMode pass

Nicolai Hähnle via llvm-commits llvm-commits at lists.llvm.org
Sat Mar 19 07:29:21 PDT 2016

nhaehnle added a comment.

As an addendum: In part, correctness depends on the guarantees for derivatives that are required by GLSL. For example, if you have:

  %tmp = call <2 x i32> @llvm.amdgcn.image.load.v2i32(...)
  %coord = bitcast and extract from %tmp
  br i1 %cc, label %IF, label %ELSE

IF:
  %texel = call <4 x float> @llvm.SI.image.sample.v2i32(<2 x i32> %coord, ...)

ELSE:
  ... %coord not used here or later ...

The derivative taken by llvm.SI.image.sample is undefined in GLSL if the control flow is dynamically non-uniform, so it is perfectly legal to sink the llvm.amdgcn.image.load into the IF block (and the same applies to any other computation that feeds into a derivative).
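
For illustration, the example might look roughly like this after such sinking. This is a sketch only: the elided operands are left as in the example, and the IF/ELSE block layout is assumed from the branch targets rather than taken from the patch.

  br i1 %cc, label %IF, label %ELSE

IF:
  %tmp = call <2 x i32> @llvm.amdgcn.image.load.v2i32(...)
  %coord = bitcast and extract from %tmp
  %texel = call <4 x float> @llvm.SI.image.sample.v2i32(<2 x i32> %coord, ...)

ELSE:
  ... %coord not used here or later ...

In this form the load only runs for lanes that take the IF branch, so the neighbouring lanes of a quad may have no valid %coord for the derivative. By the argument above that is acceptable, because the derivative is already undefined when %cc is non-uniform.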

