[llvm] AMDGPU: Fix temporal divergence introduced by machine-sink and performance regression introduced by D155343 (PR #67456)
Nicolai Hähnle via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 4 12:45:34 PDT 2023
================
@@ -171,6 +171,48 @@ bool SIInstrInfo::isIgnorableUse(const MachineOperand &MO) const {
isVALU(*MO.getParent()) && !resultDependsOnExec(*MO.getParent());
}
+bool SIInstrInfo::isSafeToSink(MachineInstr &MI,
+ MachineBasicBlock *SuccToSinkTo,
+ MachineCycleInfo *CI) const {
+ // Allow sinking if MI edits lane mask (divergent i1 in sgpr).
+ if (MI.getOpcode() == AMDGPU::SI_IF_BREAK)
+ return true;
+
+ MachineRegisterInfo &MRI = MI.getMF()->getRegInfo();
+ // Check if sinking of MI would create temporal divergent use.
+ for (auto Op : MI.uses()) {
+ if (Op.isReg() && Op.getReg().isVirtual() &&
+ RI.isSGPRClass(MRI.getRegClass(Op.getReg()))) {
+ MachineInstr *SgprDef = MRI.getVRegDef(Op.getReg());
+
+ // SgprDef defined inside cycle
+ MachineCycle *FromCycle = CI->getCycle(SgprDef->getParent());
+ if (FromCycle == nullptr)
+ continue;
+
+ MachineCycle *ToCycle = CI->getCycle(SuccToSinkTo);
+ // Check if there is a FromCycle that contains SgprDef's basic block but
+ // does not contain SuccToSinkTo and also has divergent exit condition.
+ while (FromCycle != ToCycle) {
----------------
nhaehnle wrote:
This needs to be `FromCycle && !FromCycle->contains(ToCycle)`.
To see why, consider the case where two cycles directly follow each other:
```
|
v/-\
A |
|\-/
|
|
v/-\
B |
|\-/
|
```
SgprDef could be in A while SuccToSinkTo is B. Neither cycle contains the other, so no cycle enclosing A ever equals ToCycle.
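
The effect of the suggested condition can be sketched outside of LLVM. This is only an illustration under assumptions: `Cycle`, `countCyclesExited` and `siblingExample` are hypothetical stand-ins, not the `MachineCycle` API, and the loop body (stepping to the parent cycle) is assumed, since the quoted hunk stops at the loop header.

```cpp
#include <cassert>

// Toy stand-in for a cycle in the nesting forest (hypothetical type, not
// LLVM's MachineCycle).
struct Cycle {
  const Cycle *Parent = nullptr; // enclosing cycle, nullptr if top-level

  // True if Other is this cycle or is nested inside it. A null Other
  // means "not in any cycle", which no cycle contains.
  bool contains(const Cycle *Other) const {
    for (; Other; Other = Other->Parent)
      if (Other == this)
        return true;
    return false;
  }
};

// Counts the cycles that enclose From but not To, using the suggested
// loop condition. With `From != To` instead, sibling cycles would step
// From to nullptr and then dereference it.
int countCyclesExited(const Cycle *From, const Cycle *To) {
  int Exited = 0;
  while (From && !From->contains(To)) {
    ++Exited;
    From = From->Parent; // assumed loop step: walk to the enclosing cycle
  }
  return Exited;
}

// The situation from the diagram: A and B are sibling top-level cycles.
void siblingExample() {
  Cycle A, B;
  assert(countCyclesExited(&A, &B) == 1);
}
```

In this sketch the walk from A stops after leaving exactly one cycle, because the null check ends the loop once there is no enclosing cycle left.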
https://github.com/llvm/llvm-project/pull/67456