[PATCH] D64393: [AMDGPU] Fix DPP combiner check for exec modification

Wed Jul 10 13:04:23 PDT 2019

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:6088
                                       const MachineInstr &DefMI,
                                       const MachineInstr *UseMI) {
   assert(MRI.isSSA() && "Must be run on SSA");
----------------
This should now be mandatory, and a reference

================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:6103-6106
+  auto E = UseMI->getIterator();
+  for (auto I = std::next(DefMI.getIterator()); I != E; ++I) {
+    if (I->isDebugInstr())
+      continue;
----------------
This should probably be a backwards scan, but that can be a separate change

================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:6144-6145

-  // Stop scan at the use if known.
-  auto E = UseMI ? UseMI->getIterator() : DefBB->end();
-  for (auto I = std::next(DefMI.getIterator()); I != E; ++I) {
+  // Stop scan when we have seen all the uses.
+  for (auto I = std::next(DefMI.getIterator()); !Uses.empty(); ++I) {
     if (I->isDebugInstr())
----------------
Why is this erasing the use as it iterates?

================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.h:1014
 bool execMayBeModifiedBeforeUse(const MachineRegisterInfo &MRI,
                                 unsigned VReg,
                                 const MachineInstr &DefMI,
----------------
Might as well update this to use Register

================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.h:1022
+bool execMayBeModifiedBeforeAnyUse(const MachineRegisterInfo &MRI,
+                                   unsigned VReg,
+                                   const MachineInstr &DefMI);
----------------
Should use Register

================
Comment at: llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll:203-204
+define amdgpu_kernel void @add_i32_varying_dpp() {
+  %lane = call i32 @llvm.amdgcn.workitem.id.x()
+  %old = atomicrmw add i32 addrspace(3)* @local_var32, i32 %lane acq_rel
+  ret void
----------------
foad wrote:
> arsenm wrote:
> > Can you add some cases with multiple uses independent of the atomic optimizer?
> See existing dpp_seq test in dpp_combine.mir.
They weren't catching this issue

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64393/new/

https://reviews.llvm.org/D64393