[PATCH] D129073: [AMDGPU] Combine s_or_saveexec, s_xor instructions.

Thomas Symalla via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 7 07:27:27 PDT 2022


tsymalla added a comment.

In D129073#3635410 <https://reviews.llvm.org/D129073#3635410>, @nhaehnle wrote:

> This is a good start. However, I have some high-level questions:
>
> 1. Scanning over entire basic blocks is bad for compile times, and this pass is already doing some scans. For example, `optimizeExecSequence` already scans for copies to exec. Can this be improved? Notice how the scan in `optimizeExecSequence` goes backwards in the basic block and limits itself to only a small number of instructions. I could imagine a restructuring of the pass so that every basic block is scanned backwards for an EXEC-writing instruction. Depending on what the instruction is (copy-to-exec, s_and_saveexec, s_xor) one of the optimizations can be applied.

Good idea. I will try to reuse the existing backward scan and hook the new optimization into it, if possible.
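
To make the suggested restructuring concrete, here is a rough, self-contained sketch of a bounded backward scan that picks an optimization based on the EXEC-writing instruction it finds. It is a toy model only: `Kind`, `Instr`, `Opt`, `findLastExecWriter` and `selectOptimization` are hypothetical names, none of this uses the real `MachineInstr` API, and requiring `s_or_saveexec` to sit immediately before `s_xor` is a simplifying assumption, not the matching logic of the patch.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-ins for machine instructions (hypothetical, not LLVM types).
enum class Kind { Other, CopyToExec, AndSaveExec, OrSaveExec, XorExec };

struct Instr {
  Kind kind;
};

// Which optimization the scan would hand the block to (hypothetical names).
enum class Opt { None, ExecCopySequence, SaveExecXorCombine };

// In this toy model every kind except Other writes EXEC.
bool writesExec(Kind k) { return k != Kind::Other; }

// Walk the block from its last instruction towards the first, looking at
// no more than maxScan instructions. Returns the index of the first EXEC
// writer met on the way, or block.size() if none is found in the window.
std::size_t findLastExecWriter(const std::vector<Instr> &block,
                               std::size_t maxScan) {
  std::size_t seen = 0;
  for (std::size_t i = block.size(); i > 0 && seen < maxScan; --i, ++seen) {
    if (writesExec(block[i - 1].kind))
      return i - 1;
  }
  return block.size();
}

// Dispatch on the kind of the EXEC writer found by the bounded scan.
Opt selectOptimization(const std::vector<Instr> &block, std::size_t maxScan) {
  const std::size_t idx = findLastExecWriter(block, maxScan);
  if (idx == block.size())
    return Opt::None;

  switch (block[idx].kind) {
  case Kind::CopyToExec:
    return Opt::ExecCopySequence;
  case Kind::XorExec:
    // Simplification: only match s_or_saveexec directly followed by s_xor.
    if (idx > 0 && block[idx - 1].kind == Kind::OrSaveExec)
      return Opt::SaveExecXorCombine;
    return Opt::None;
  default:
    return Opt::None;
  }
}
```

With a block of `{Other, OrSaveExec, XorExec, Other}` and a window of 3 instructions, the scan stops at the `XorExec` entry and selects the combine; with a window of 1 it never reaches an EXEC writer and selects nothing. The point of the single scan is that each block is walked once and the per-instruction decision replaces separate full-block scans.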

> 2. Why is this change done in SIOptimizeExecMasking instead of SIOptimizeExecMaskingPreRA? Actually, I don't remember why we have the two passes in the first place. Perhaps @rampitec remembers?

I don't know exactly why there are two passes. However, SIOptimizeExecMasking is where `S_*BINOP*_{B32, B64}` instructions are replaced with their SAVEEXEC counterparts, so it made sense to me to introduce the change here. This is the last point at which a pass inserts such instructions, so the pattern I am trying to match will likely only appear once SIOptimizeExecMasking has run. But please correct me if I'm wrong.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129073/new/

https://reviews.llvm.org/D129073


