[PATCH] D102830: [AMDGPU] Avoid null export insertion when unifying exit blocks

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 20 06:21:58 PDT 2021


foad added a comment.

Mostly minor comments inline but I'm concerned about the "multiple predecessors with different done exports" case.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:140
+static IntrinsicInst *findExportDone(BasicBlock *BB, ConstantInt *BoolTrue) {
+  for (auto I = BB->rbegin(), E = BB->rend(); I != E; ++I) {
+    if (IntrinsicInst *Intrin = llvm::dyn_cast<IntrinsicInst>(&*I)) {
----------------
Could be "for (auto &I : reverse(BB)) ..."


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:143
+      if (Intrin->getIntrinsicID() == Intrinsic::amdgcn_exp) {
+        if (Intrin->getArgOperand(6) == BoolTrue)
+          return Intrin;
----------------
Why compare with BoolTrue, rather than examine the value? Is this just optimisation to make the test as fast as possible?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:154
 
-BasicBlock *AMDGPUUnifyDivergentExitNodes::unifyReturnBlockSet(
-    Function &F, DomTreeUpdater &DTU, ArrayRef<BasicBlock *> ReturningBlocks,
-    bool InsertExport, StringRef Name) {
-  // Otherwise, we need to insert a new basic block into the function, add a PHI
-  // nodes (if the function returns values), and convert all of the return
-  // instructions into unconditional branches.
-  BasicBlock *NewRetBlock = BasicBlock::Create(F.getContext(), Name, &F);
-  IRBuilder<> B(NewRetBlock);
+// Search a block and its predecessors to find a "done" export
+static IntrinsicInst *findExportDone(BasicBlock *BB,
----------------
Isn't it possible that different predecessors have different "done" exports? What is this function supposed to do in that case?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:158
+                                     ConstantInt *BoolTrue) {
+  SmallVector<BasicBlock *, 8> Stack;
+
----------------
Don't need the ", 8" (unless you have special knowledge that 8 really is the optimal value).


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:177
+static bool isChannelEnabled(unsigned Channel, uint64_t EnBits, bool IsCompr) {
+  return (!IsCompr && (EnBits & (1 << Channel))) ||
+         (IsCompr && (EnBits & (0x3 << (2 * Channel))));
----------------
I think `return IsCompr ? (...) : (...)` would be clearer.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:223
+    bool ComprVal = Intrin->getIntrinsicID() == Intrinsic::amdgcn_exp_compr;
+    if (Exports.size()) {
+      MatchingExports = MatchingExports && (Target == TargetVal) &&
----------------
Clearer to use .empty() than to use .size() as a boolean, here and 12 lines below.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:266
+    for (unsigned Idx = 0; Idx < Channels; ++Idx) {
+      if (isChannelEnabled(Idx, EnBits, IsCompr))
+        Phis[Idx] = B.CreatePHI(ExportType, ReturningBlocks.size(),
----------------
Adding "else Phis[Idx] = Undef" here would simplify the creation of the new export intrinsic below.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:315
+    bool UpdateExports, StringRef Name) {
+  // Otherwise, we need to insert a new basic block into the function, add a PHI
+  // nodes (if the function returns values), and convert all of the return
----------------
What does "otherwise" refer back to?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp:436
         // assume the frontend won't do, this export will have the same exec
-        // mask as the last "real" export, and therefore the valid mask will be
+        // mark as the last "real" export, and therefore the valid mask will be
         // overwritten with the same value and will still be correct. Also,
----------------
Wasn't this supposed to say "exec mask"?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102830/new/

https://reviews.llvm.org/D102830



More information about the llvm-commits mailing list