[PATCH] D18162: AMDGPU: Add SIWholeQuadMode pass
Nicolai Hähnle via llvm-commits
llvm-commits at lists.llvm.org
Sat Mar 19 07:29:21 PDT 2016
nhaehnle added a comment.
As an addendum: In part, correctness depends on the guarantees for derivatives that are required by GLSL. For example, if you have:
    %tmp = call <2 x i32> @llvm.amdgcn.image.load.v2i32(...)
    %coord = bitcast and extract from %tmp
    br i1 %cc, label %IF, label %ELSE

  IF:
    %texel = call <4 x float> @llvm.SI.image.sample.v2i32(<2 x i32> %coord, ...)

  ELSE:
    ... %coord not used here or later ...
The derivative implicitly taken by llvm.SI.image.sample is undefined in GLSL if control flow is dynamically non-uniform, so it is perfectly legal to sink the llvm.amdgcn.image.load into the IF block (and the same applies to any other computation that feeds a derivative).
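For illustration only, here is a rough sketch of what the sunk form of the example above might look like. It keeps the same elisions and pseudo-IR as the original snippet and is not actual compiler output:

    br i1 %cc, label %IF, label %ELSE

  IF:
    ; load and coordinate computation sunk next to their only user
    %tmp = call <2 x i32> @llvm.amdgcn.image.load.v2i32(...)
    %coord = bitcast and extract from %tmp
    %texel = call <4 x float> @llvm.SI.image.sample.v2i32(<2 x i32> %coord, ...)

  ELSE:
    ... %coord not used here or later ...

If every pixel of a quad takes the IF branch, every lane still computes %coord and the derivative is well defined. If only some of them do, the remaining lanes never compute %coord, but that is exactly the dynamically non-uniform case in which GLSL leaves the derivative undefined anyway.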