[PATCH] D26114: [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Stanislav Mekhanoshin via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 17 16:38:11 PST 2016
rampitec updated this revision to Diff 78434.
rampitec added a comment.
The previous version was reverted due to an error in the GL piglit test fs-discard-exit-2.
The v_cmp_* instructions do not preserve the result bits for inactive lanes; they set them to 0. The result is in fact equivalent to EXEC[n] & compare[n]. The corrected propagation starts not from the v_cndmask_b32 that saves the condition, but from the v_cmp instruction that restores it. If the pattern is matched, we can emit an s_and_b32 of the original scalar result with EXEC instead of the v_cmp. The first v_cndmask_b32 then has a chance to be dead-code eliminated.
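To illustrate the equivalence, here is a minimal lane-level sketch in plain C++. It is not code from the patch. The helper names (vCndmask, vCmpNeZero, roundTrip, scalarAnd) are hypothetical, the model assumes a 64-lane wave, and it assumes EXEC is the same at the save and at the restore. It only shows that saving a condition with a v_cndmask-style select and restoring it with a v_cmp-style compare gives the same mask as ANDing the original condition with EXEC.

```cpp
#include <array>
#include <cstdint>

// Simplified model: one 32-bit value per lane, 64 lanes per wave (assumption).
constexpr int kLanes = 64;
using VReg = std::array<uint32_t, kLanes>;

// Models "v_cndmask_b32 dst, 0, 1, cond": each active lane writes 1 or 0
// depending on its condition bit; inactive lanes keep their old value.
void vCndmask(VReg &dst, uint64_t cond, uint64_t exec) {
  for (int n = 0; n < kLanes; ++n)
    if ((exec >> n) & 1)
      dst[n] = static_cast<uint32_t>((cond >> n) & 1);
}

// Models "v_cmp_ne_u32 0, src": active lanes set their bit when src != 0;
// bits for inactive lanes are set to 0, not preserved.
uint64_t vCmpNeZero(const VReg &src, uint64_t exec) {
  uint64_t result = 0;
  for (int n = 0; n < kLanes; ++n)
    if (((exec >> n) & 1) && src[n] != 0)
      result |= uint64_t{1} << n;
  return result;
}

// Save the condition into a vector register, then restore it with a compare.
uint64_t roundTrip(uint64_t cond, uint64_t exec) {
  VReg tmp;
  tmp.fill(0xDEADBEEFu); // stale data in lanes that stay inactive
  vCndmask(tmp, cond, exec);
  return vCmpNeZero(tmp, exec);
}

// The scalar replacement: AND the original condition with EXEC.
uint64_t scalarAnd(uint64_t cond, uint64_t exec) { return cond & exec; }
```

In this model roundTrip(cond, exec) and scalarAnd(cond, exec) agree for any pair of masks, which is why the compare can be replaced by a scalar AND when the pattern matches.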
The next step (in a separate change) will be to combine newly created s_and_b32 with the following s_and_saveexec_b64 if any.
Repository:
rL LLVM
https://reviews.llvm.org/D26114
Files:
lib/Target/AMDGPU/AMDGPUISelLowering.cpp
lib/Target/AMDGPU/SILowerI1Copies.cpp
test/CodeGen/AMDGPU/branch-relaxation.ll
test/CodeGen/AMDGPU/hoist-cond.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D26114.78434.patch
Type: text/x-patch
Size: 5190 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161118/ece5ee8c/attachment.bin>