[PATCH] D147408: [AMDGPU] Iterative scan implementation for atomic optimizer.

Mon May 29 19:21:44 PDT 2023

ruiling added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:752
+  if (ValDivergent && ScanImpl == ScanOptions::Iterative) {
+    Instruction *Terminator = EntryBB->getTerminator();
+    B.SetInsertPoint(ComputeEnd);
----------------
I think you also need to update the dominator tree through `DTU`, since we are inserting two extra blocks. The branch setup is also hard to follow: we insert a branch to `ComputeLoop` in the middle of the entry block, and then split the entry block again to insert the single-lane block. It might be clearer to split the entry block first and only then insert the branch to `ComputeLoop`.
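
A toy sketch of the split-first ordering, to make the suggestion concrete. This is a standalone model and not LLVM code: `ToyCFG`, `EdgeUpdate`, `insertEdge`, `deleteEdge`, and the block names `SingleLane` and `Tail` are hypothetical, and the CFG shape (entry, then `ComputeLoop` with a back-edge, then `ComputeEnd`, then the single-lane block) is an assumption about what the patch builds. It only shows that splitting first and then rerouting the entry block through the loop gives a short, explicit list of edge insertions and deletions, which is the kind of list a dominator tree updater would need to be told about.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are NOT LLVM classes.
using Block = std::string;

struct EdgeUpdate {
  bool IsInsert; // true = edge added, false = edge removed
  Block From, To;
};

struct ToyCFG {
  std::map<Block, std::set<Block>> Succs;
  std::vector<EdgeUpdate> Updates; // every CFG edge change, in order

  void insertEdge(const Block &From, const Block &To) {
    Succs[From].insert(To);
    Succs[To]; // make sure the target block exists in the map
    Updates.push_back({true, From, To});
  }
  void deleteEdge(const Block &From, const Block &To) {
    Succs[From].erase(To);
    Updates.push_back({false, From, To});
  }
};

// Classic iterative dominator sets, rooted at the block named "Entry".
std::map<Block, std::set<Block>> computeDominators(const ToyCFG &G) {
  std::set<Block> All;
  for (const auto &KV : G.Succs)
    All.insert(KV.first);
  std::map<Block, std::set<Block>> Dom;
  for (const Block &B : All)
    Dom[B] = All;
  Dom["Entry"] = {"Entry"};
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (const Block &B : All) {
      if (B == "Entry")
        continue;
      std::set<Block> New = All;
      for (const auto &KV : G.Succs) {
        if (!KV.second.count(B))
          continue; // KV.first is not a predecessor of B
        std::set<Block> Tmp;
        for (const Block &D : Dom[KV.first])
          if (New.count(D))
            Tmp.insert(D);
        New = Tmp; // intersect with the predecessor's dominators
      }
      New.insert(B);
      if (New != Dom[B]) {
        Dom[B] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}

bool dominates(const ToyCFG &G, const Block &A, const Block &B) {
  return computeDominators(G).at(B).count(A) > 0;
}

ToyCFG buildSplitFirst() {
  ToyCFG G;
  // Step 1: split the entry block around the single-lane block.
  G.insertEdge("Entry", "SingleLane");
  G.insertEdge("Entry", "Tail");
  G.insertEdge("SingleLane", "Tail");
  // Step 2: route Entry through the loop. Entry's old conditional
  // branch now belongs to ComputeEnd (assumed shape, see lead-in).
  G.deleteEdge("Entry", "SingleLane");
  G.deleteEdge("Entry", "Tail");
  G.insertEdge("Entry", "ComputeLoop");
  G.insertEdge("ComputeLoop", "ComputeLoop"); // loop back-edge
  G.insertEdge("ComputeLoop", "ComputeEnd");
  G.insertEdge("ComputeEnd", "SingleLane");
  G.insertEdge("ComputeEnd", "Tail");
  return G;
}
```

In this model, `ComputeEnd` ends up dominating both `SingleLane` and `Tail`, so a dominator tree computed before step 2 would be stale. The `Updates` log makes every edge change from step 2 visible in one place, which is the main readability benefit of doing the split first.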

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147408/new/

https://reviews.llvm.org/D147408