[PATCH] D26114: [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Stanislav Mekhanoshin via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 18 09:58:04 PST 2016
rampitec added a comment.
In https://reviews.llvm.org/D26114#599905, @nhaehnle wrote:
> This seems correct to me.
>
> It could be quite beneficial to have a general pass running quite late that optimizes away s[i:i+1] & EXEC instructions. This would allow lowering PHIs of i1 as straightforward and-with-exec in the predecessor blocks + bitwise-or in the block containing the PHI, and it would help with some of the WholeQuadMode changes that I still need to get around to.
I'm thinking of two places to perform such optimization: SIFoldOperands.cpp and SIOptimizeExecMasking.cpp.
The first is preferable because that happens before register allocation so we may save a pair of SGPRs in a good case. The other may be also beneficial because PHIs are eliminated post RA. At the end of the day both may be needed.
Repository:
rL LLVM
https://reviews.llvm.org/D26114
More information about the llvm-commits
mailing list