[llvm] [AMDGPU] Change control flow intrinsic lowering making the wave to re… (PR #86805)

via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 1 00:57:02 PDT 2024


================
@@ -15619,6 +15619,91 @@ void SITargetLowering::finalizeLowering(MachineFunction &MF) const {
     }
   }
 
+  // ISel inserts copy to regs for the successor PHIs
+  // at the BB end. We need to move the SI_END_CF right before the branch.
+  // Even we don't have to move SI_END_CF we need to take care of the
+  // S_CBRANCH_SCC0/1 as SI_END_CF overwrites SCC
+  for (auto &MBB : MF) {
+    for (auto &MI : MBB) {
+      if (MI.getOpcode() == AMDGPU::SI_END_CF) {
+        MachineBasicBlock::iterator I(MI);
+        MachineBasicBlock::iterator Next = std::next(I);
+        bool NeedToMove = false;
+        while (Next != MBB.end() && !Next->isBranch()) {
+          NeedToMove = true;
+          Next++;
+        }
+
+        // Lets take care of SCC users as S_END_CF defines SCC
+        bool NeedPreserveSCC =
+            Next != MBB.end() && Next->readsRegister(AMDGPU::SCC);
+        MachineBasicBlock::iterator SCCDefUse(Next);
+        // This loop will be never taken as we always have S_CBRANCH_SCC1/0 at
+        // the end of the block.
+        while (!NeedPreserveSCC && SCCDefUse != MBB.end()) {
+          if (SCCDefUse->definesRegister(AMDGPU::SCC))
+            // This should never happen - SCC def after the branch reading SCC
+            break;
+          if (SCCDefUse->readsRegister(AMDGPU::SCC)) {
+            NeedPreserveSCC = true;
+            break;
+          }
+          SCCDefUse++;
+        }
+        if (NeedPreserveSCC) {
----------------
ruiling wrote:

The preserving scc logic is too much expensive and making things complex. Let's try to avoid this and assert this never happen. I think this happens for uniform-regions inside if-endif? Like for the case:
```
bb0
| \
|  bb1
|   |  \
|   /   bb2
\ /   /
 bb3
```
or
```
bb1
| \
|  bb2] (self loop)
| /
bb3
```
My suggestion is let's insert a dedicated block to close the edges taken the IF path. For the first case, insert one common successor for bb1&bb2 to jump to. For the second case, insert one block along the edge bb2->bb3.



https://github.com/llvm/llvm-project/pull/86805


More information about the llvm-commits mailing list