[PATCH] D101187: [MachineCSE] Prevent CSE of non-local convergent instrs

Fri Apr 23 18:22:55 PDT 2021

arsenm added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/no-cse-nonlocal-convergent-instrs.mir:53
+    %8:vgpr_32 = COPY %7
+    %9:vgpr_32 = DS_SWIZZLE_B32 %8, 100, 0, implicit $exec
+    %10:vgpr_32, %21:sreg_32 = V_ADD_CO_U32_e64 %9, %5, 0, implicit $exec
----------------
dsanders wrote:
> It's been pointed out to me off-list that CSE'ing to here isn't actually banned by isConvergent, it's just one of the cases we conservatively decline to CSE in the change. To be covered by isConvergent it'd have to be CSE'd into a more/differently predicated block (less is ok). Furthermore the other the cases where we wouldn't be conservative are already prevented by other checks in CSE. If we can find the field we actually mean this patch will only need a small change. I haven't been able to find it though, it doesn't seem to exist in the backend and that's probably what's gotten me confused (I don't think this is the first time either :-))
> 
> That actually reminded me of something else to double check: Does this CSE without the change too?
The definition of convergent is pretty broken. For AMDGPU in the MIR the control flow as represented by basic blocks no longer expresses the lane level CFG which we're concerned with for convergent ops. In the future when we have convergence tokens, it's not clear to me if we'll somehow preserve those through codegen. It's best to just not CSE convergent operations. It's really unlikely it would be worthwhile if it's even legal

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101187/new/

https://reviews.llvm.org/D101187