[PATCH] D101187: [MachineCSE] Prevent CSE of non-local convergent instrs

Fri Apr 23 18:17:33 PDT 2021

dsanders requested changes to this revision.
dsanders added inline comments.
This revision now requires changes to proceed.

================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/no-cse-nonlocal-convergent-instrs.mir:53
+    %8:vgpr_32 = COPY %7
+    %9:vgpr_32 = DS_SWIZZLE_B32 %8, 100, 0, implicit $exec
+    %10:vgpr_32, %21:sreg_32 = V_ADD_CO_U32_e64 %9, %5, 0, implicit $exec
----------------
It's been pointed out to me off-list that CSE'ing to here isn't actually banned by isConvergent; it's just one of the cases this change conservatively declines to CSE. To be covered by isConvergent, it would have to be CSE'd into a more or differently predicated block (less predicated is ok). Furthermore, the other cases where we wouldn't be conservative are already prevented by other checks in CSE. If we can find the field we actually mean, this patch will only need a small change. I haven't been able to find it though; it doesn't seem to exist in the backend, and that's probably what's gotten me confused (I don't think this is the first time either :-))
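
To make the distinction concrete, here is a toy model of the two rules. It is only a sketch under stated assumptions: `Block`, `Instr`, `controlDeps`, `conservativeAllowsCSE` and `preciseAllowsCSE` are hypothetical names invented for illustration, not LLVM types or anything in this patch, and representing "predication" as a bitmask of control conditions is my simplification.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; these are NOT
// llvm::MachineBasicBlock / llvm::MachineInstr.
struct Block {
  int id;
  // Assumption: bit i set means the block only runs when condition i holds.
  std::uint32_t controlDeps;
};

struct Instr {
  bool convergent;
  const Block *parent;
};

// Conservative rule: refuse to replace a convergent MI with an earlier
// identical instruction (CSMI) unless both live in the same block.
bool conservativeAllowsCSE(const Instr &MI, const Instr &CSMI) {
  return !(MI.convergent && MI.parent != CSMI.parent);
}

// What convergence itself forbids in this model: after CSE the value is
// computed at CSMI's location, so that location must not depend on any
// control condition that MI's location did not already depend on.
// Fewer conditions ("less predicated") is fine.
bool preciseAllowsCSE(const Instr &MI, const Instr &CSMI) {
  if (!MI.convergent)
    return true;
  std::uint32_t extra = CSMI.parent->controlDeps & ~MI.parent->controlDeps;
  return extra == 0;
}
```

In this model, reusing a definition from an unpredicated block for a duplicate in a guarded block is rejected by the conservative rule but accepted by the precise one, while reusing across two differently guarded blocks is rejected by both.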

That actually reminded me of something else to double-check: does this get CSE'd without the change too?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101187/new/

https://reviews.llvm.org/D101187