[PATCH] D26114: [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies

Stanislav Mekhanoshin via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 18 09:58:04 PST 2016


rampitec added a comment.

In https://reviews.llvm.org/D26114#599905, @nhaehnle wrote:

> This seems correct to me.
>
> It could be quite beneficial to have a general pass running quite late that optimizes away s[i:i+1] & EXEC instructions. This would allow lowering PHIs of i1 as straightforward and-with-exec in the predecessor blocks + bitwise-or in the block containing the PHI, and it would help with some of the WholeQuadMode changes that I still need to get around to.


I can think of two places to perform such an optimization: SIFoldOperands.cpp and SIOptimizeExecMasking.cpp.

The first is preferable because it runs before register allocation, so in a good case we may save a pair of SGPRs. The second may also be beneficial because PHIs are eliminated post-RA. At the end of the day, both may be needed.
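
For illustration only, here is a minimal standalone sketch of the lane-mask arithmetic behind the quoted suggestion. It is not code from this patch or from either pass. It assumes a 64-lane wavefront modeled as a plain 64-bit integer, and every name in it (LaneMask, maskWithExec, lowerPhi, andWithExecIsRedundant) is hypothetical. It shows an i1 PHI lowered as and-with-exec in each predecessor plus a bitwise-or in the join block, and the condition under which an and-with-exec changes nothing and could therefore be removed.

```cpp
#include <cstdint>

// Hypothetical model: one bit per lane of a 64-wide wavefront.
using LaneMask = std::uint64_t;

// Value contributed by one predecessor block: keep only the lanes that
// are active (set in EXEC) while that block executes.
LaneMask maskWithExec(LaneMask val, LaneMask exec) {
  return val & exec;
}

// i1 PHI in the join block: bitwise-or of the masked predecessor values.
// This is only meaningful when the predecessors' EXEC masks are disjoint.
LaneMask lowerPhi(LaneMask fromA, LaneMask execA,
                  LaneMask fromB, LaneMask execB) {
  return maskWithExec(fromA, execA) | maskWithExec(fromB, execB);
}

// The and-with-exec is a no-op when val has no bits set outside exec,
// because then (val & exec) == val. A late pass could drop the AND
// whenever it can prove this property for the defining instruction.
bool andWithExecIsRedundant(LaneMask val, LaneMask exec) {
  return (val & ~exec) == 0;
}
```

In this model, any value already known to be zero in inactive lanes satisfies andWithExecIsRedundant, which is the property such a pass would have to establish before deleting the instruction.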


Repository:
  rL LLVM

https://reviews.llvm.org/D26114




