[PATCH] D129073: [AMDGPU] Combine s_or_saveexec, s_xor instructions.
Nicolai Hähnle via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 7 05:02:27 PDT 2022
nhaehnle added a subscriber: rampitec.
nhaehnle added a comment.
This is a good start. However, I have some high-level questions:
1. Scanning over entire basic blocks is bad for compile times, and this pass is already doing some scans. For example, `optimizeExecSequence` already scans for copies to exec. Can this be improved? Notice how the scan in `optimizeExecSequence` goes backwards in the basic block and limits itself to only a small number of instructions. I could imagine a restructuring of the pass so that every basic block is scanned backwards once for an EXEC-writing instruction. Depending on what that instruction is (copy-to-exec, s_and_saveexec, s_xor), one of the optimizations can then be applied.
2. Why is this change done in SIOptimizeExecMasking instead of SIOptimizeExecMaskingPreRA? Actually, I don't remember why we have the two passes in the first place. Perhaps @rampitec remembers?
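To make the restructuring suggested in point 1 concrete, here is a minimal, self-contained sketch of a bounded backward scan followed by a dispatch on the instruction kind. It is not LLVM code: `Instr`, `Block`, `Kind`, `Opt`, `findLastExecWriter`, `pickOptimization` and the `MaxInstrs` limit are all hypothetical stand-ins chosen for illustration, and the real pass would work on `MachineBasicBlock` and `MachineInstr` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy stand-ins for MachineInstr / MachineBasicBlock (assumptions, not LLVM types).
enum class Kind { Other, CopyToExec, AndSaveExec, Xor };

struct Instr {
  Kind K;
  bool WritesExec; // true if the instruction defines EXEC
};

using Block = std::vector<Instr>;

// Hypothetical labels for the optimizations the pass could pick from.
enum class Opt { None, ExecSequence, SaveExecCombine, XorCombine };

// Walk backwards from the end of the block, inspecting at most MaxInstrs
// instructions, and return the index of the first EXEC writer encountered.
std::optional<std::size_t> findLastExecWriter(const Block &BB,
                                              std::size_t MaxInstrs) {
  assert(MaxInstrs > 0 && "scan window must be non-empty");
  std::size_t Seen = 0;
  for (std::size_t I = BB.size(); I-- > 0 && Seen < MaxInstrs; ++Seen) {
    if (BB[I].WritesExec)
      return I;
  }
  return std::nullopt; // nothing found within the window
}

// One scan per block, then dispatch on what kind of EXEC writer was found.
Opt pickOptimization(const Block &BB, std::size_t MaxInstrs) {
  std::optional<std::size_t> Idx = findLastExecWriter(BB, MaxInstrs);
  if (!Idx)
    return Opt::None;
  switch (BB[*Idx].K) {
  case Kind::CopyToExec:
    return Opt::ExecSequence;
  case Kind::AndSaveExec:
    return Opt::SaveExecCombine;
  case Kind::Xor:
    return Opt::XorCombine;
  case Kind::Other:
    break;
  }
  return Opt::None;
}
```

The point of the sketch is only that the cost per block is capped by the scan window rather than by the block size, and that each block is walked once no matter how many optimizations are available.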
================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:752
+ std::pair<MachineInstr *, MachineInstr *> OrXorPair{nullptr, nullptr};
+ if (MI.getOpcode() == OrSaveexecOpcode && MI != MBB.end()) {
+ const MachineOperand &OrDst = MI.getOperand(0);
----------------
MI == MBB.end() is always false here, so the `MI != MBB.end()` check is redundant.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D129073/new/
https://reviews.llvm.org/D129073