[PATCH] D102830: [AMDGPU] Avoid null export insertion when unifying exit blocks

Carl Ritson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 20 20:52:35 PDT 2021


critson marked 9 inline comments as done.
critson added a comment.

In D102830#2770970 <https://reviews.llvm.org/D102830#2770970>, @foad wrote:

> Mostly minor comments inline but I'm concerned about the "multiple predecessors with different done exports" case.

I have gone into this a bit more in the comment.
However this expects input IR is well-formed w.r.t. exports, as it is produced by existing front-ends.
If we wanted to handle the generic case then I feel like we might as well write a pass that checks all exports and adjusts them so they are well formed and run it after this pass.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:143
+      if (Intrin->getIntrinsicID() == Intrinsic::amdgcn_exp) {
+        if (Intrin->getArgOperand(6) == BoolTrue)
+          return Intrin;
----------------
foad wrote:
> Why compare with BoolTrue, rather than examine the value? Is this just optimisation to make the test as fast as possible?
I believe this is standard practice rather than retrieving the constant int values representing true and false repeatedly.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:154
 
-BasicBlock *AMDGPUUnifyDivergentExitNodes::unifyReturnBlockSet(
-    Function &F, DomTreeUpdater &DTU, ArrayRef<BasicBlock *> ReturningBlocks,
-    bool InsertExport, StringRef Name) {
-  // Otherwise, we need to insert a new basic block into the function, add a PHI
-  // nodes (if the function returns values), and convert all of the return
-  // instructions into unconditional branches.
-  BasicBlock *NewRetBlock = BasicBlock::Create(F.getContext(), Name, &F);
-  IRBuilder<> B(NewRetBlock);
+// Search a block and its predecessors to find a "done" export
+static IntrinsicInst *findExportDone(BasicBlock *BB,
----------------
foad wrote:
> Isn't it possible that different predecessors have different "done" exports? What is this function supposed to do in that case?
This does assumes the IR is well-formed to a certain degree.
I have added a comment to try to document this.

Consider the following IR:
```if (condition) {
 export done 1
} else {
 export done 2
}
return```

This will eventually be compiled to:
```set exec mask for if-branch
export done 1
set exec mask for else-branch
export done 2
restore exec mask
return```

Multiple export done instructions will be executed which is invalid.
The expected cases for this are:

1. Divergent exits with their own exports, e.g.
```if (condition) {
  export done 1
  return
} else {
  export done 2
  return
}```

2. Or divergent exits without exports or a uniformly reached export, e.g.
```export done
if (condition) {
  return
} else {
  return
}```


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:266
+    for (unsigned Idx = 0; Idx < Channels; ++Idx) {
+      if (isChannelEnabled(Idx, EnBits, IsCompr))
+        Phis[Idx] = B.CreatePHI(ExportType, ReturningBlocks.size(),
----------------
foad wrote:
> Adding "else Phis[Idx] = Undef" here would simplify the creation of the new export intrinsic below.
Sure, I had to change the type of Phis and add some casts.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:315
+    bool UpdateExports, StringRef Name) {
+  // Otherwise, we need to insert a new basic block into the function, add a PHI
+  // nodes (if the function returns values), and convert all of the return
----------------
foad wrote:
> What does "otherwise" refer back to?
I don't know, let's remove it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102830/new/

https://reviews.llvm.org/D102830



More information about the llvm-commits mailing list