[PATCH] D26114: [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies

Stanislav Mekhanoshin via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 17 16:38:11 PST 2016


rampitec updated this revision to Diff 78434.
rampitec added a comment.

The previous version was reverted due to an error in the GL piglit test fs-discard-exit-2.

The v_cmp_* instructions do not preserve result bits for inactive lanes; they set them to 0. The result is therefore equivalent to EXEC[n] & compare[n]. The corrected propagation starts not with the v_cndmask_b32 that saves the condition, but with the v_cmp instruction that restores it. If the pattern is matched, we can emit an s_and_b32 of the original scalar result with EXEC instead of the v_cmp. The first v_cndmask_b32 then has a chance to be eliminated as dead code.
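
To illustrate the equivalence, here is a minimal sketch in Python that models a wavefront's lane masks as integers. It is a simplified model for illustration, not code from the patch: the helper names, the 64-lane wave size, the 0/1 select values and the "stale" VGPR contents are all assumptions. It shows that saving a condition with v_cndmask_b32 and restoring it with a v_cmp yields the original mask ANDed with EXEC, not the original mask itself.

```python
WAVE_SIZE = 64                      # assumed wavefront width for this model
MASK = (1 << WAVE_SIZE) - 1


def v_cndmask_b32(cond_mask, exec_mask, old_vgpr, false_val=0, true_val=1):
    """Model of the save step: each active lane selects true_val or
    false_val from its condition bit; inactive lanes keep the old value."""
    out = list(old_vgpr)
    for lane in range(WAVE_SIZE):
        if (exec_mask >> lane) & 1:
            out[lane] = true_val if (cond_mask >> lane) & 1 else false_val
    return out


def v_cmp_ne_u32(vgpr, value, exec_mask):
    """Model of the restore step: active lanes compare against value;
    result bits of inactive lanes are written as 0, not preserved."""
    result = 0
    for lane in range(WAVE_SIZE):
        if (exec_mask >> lane) & 1 and vgpr[lane] != value:
            result |= 1 << lane
    return result


def s_and(a, b):
    """Model of the scalar AND that replaces the v_cmp."""
    return a & b & MASK


# Hypothetical example values: lanes 1 and 2 are active.
cond = 0b1011
exec_mask = 0b0110
stale = [7] * WAVE_SIZE             # arbitrary old VGPR contents

saved = v_cndmask_b32(cond, exec_mask, stale)
restored = v_cmp_ne_u32(saved, 0, exec_mask)

assert restored == s_and(cond, exec_mask)   # same as cond & EXEC
assert restored != cond                     # not a plain copy of cond
```

In this model, a plain copy of the scalar condition would differ from the restored value whenever the condition has bits set in inactive lanes, which is why the replacement is an AND with EXEC.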

The next step (in a separate change) will be to combine the newly created s_and_b32 with the following s_and_saveexec_b64, if any.


Repository:
  rL LLVM

https://reviews.llvm.org/D26114

Files:
  lib/Target/AMDGPU/AMDGPUISelLowering.cpp
  lib/Target/AMDGPU/SILowerI1Copies.cpp
  test/CodeGen/AMDGPU/branch-relaxation.ll
  test/CodeGen/AMDGPU/hoist-cond.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D26114.78434.patch
Type: text/x-patch
Size: 5190 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161118/ece5ee8c/attachment.bin>