[PATCH] D102830: [AMDGPU] Avoid null export insertion when unifying exit blocks

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 21 02:04:56 PDT 2021


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:154
 
-BasicBlock *AMDGPUUnifyDivergentExitNodes::unifyReturnBlockSet(
-    Function &F, DomTreeUpdater &DTU, ArrayRef<BasicBlock *> ReturningBlocks,
-    bool InsertExport, StringRef Name) {
-  // Otherwise, we need to insert a new basic block into the function, add a PHI
-  // nodes (if the function returns values), and convert all of the return
-  // instructions into unconditional branches.
-  BasicBlock *NewRetBlock = BasicBlock::Create(F.getContext(), Name, &F);
-  IRBuilder<> B(NewRetBlock);
+// Search a block and its predecessors to find a "done" export
+static IntrinsicInst *findExportDone(BasicBlock *BB,
----------------
critson wrote:
> foad wrote:
> > Isn't it possible that different predecessors have different "done" exports? What is this function supposed to do in that case?
> This does assumes the IR is well-formed to a certain degree.
> I have added a comment to try to document this.
> 
> Consider the following IR:
> ```if (condition) {
>  export done 1
> } else {
>  export done 2
> }
> return```
> 
> This will eventually be compiled to:
> ```set exec mask for if-branch
> export done 1
> set exec mask for else-branch
> export done 2
> restore exec mask
> return```
> 
> Multiple export done instructions will be executed which is invalid.
> The expected cases for this are:
> 
> 1. Divergent exits with their own exports, e.g.
> ```if (condition) {
>   export done 1
>   return
> } else {
>   export done 2
>   return
> }```
> 
> 2. Or divergent exits without exports or a uniformly reached export, e.g.
> ```export done
> if (condition) {
>   return
> } else {
>   return
> }```
OK. Thanks for explaining. I find it hard to get my head round the possible cases that we do or don't have to handle. Do you cope with:
```
if (uniform condition)
  export done 1
else
  export done 2
if (divergent condition)
  return 1
else
  return 2
```
?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:266
+    for (unsigned Idx = 0; Idx < Channels; ++Idx) {
+      if (isChannelEnabled(Idx, EnBits, IsCompr))
+        Phis[Idx] = B.CreatePHI(ExportType, ReturningBlocks.size(),
----------------
critson wrote:
> foad wrote:
> > Adding "else Phis[Idx] = Undef" here would simplify the creation of the new export intrinsic below.
> Sure, I had to change the type of Phis and add some casts.
Sorry, I didn't realise it would need a cast.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102830/new/

https://reviews.llvm.org/D102830



More information about the llvm-commits mailing list