[PATCH] D141778: [DAGCombiner][X86] `mergeConsecutiveStores()`: support merging splat-stores of the same value

Sun Jan 15 07:25:48 PST 2023

lebedev.ri added inline comments.

================
Comment at: llvm/test/CodeGen/X86/legalize-shl-vec.ll:247-252
+; X64-NEXT:    movq %r8, %xmm0
+; X64-NEXT:    pshufd {{.*#+}} xmm0 = xmm0[0,1,0,1]
+; X64-NEXT:    psrad $31, %xmm0
+; X64-NEXT:    pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3]
+; X64-NEXT:    movdqa %xmm0, 16(%rdi)
+; X64-NEXT:    movdqa %xmm0, (%rdi)
----------------
pengfei wrote:
> Looks like regression here.
We've traded 4 scalar stores for two vector stores + a GPR->XMM transfer + two shuffles + a shift.
I wouldn't call it an obvious regression, since we get less contention in the CPU store unit,
but I agree that it's not really an improvement either.
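
For reference, here is a rough Python model of what the vector sequence in the check lines above computes before the two `movdqa` stores. It assumes the usual SSE2 semantics for `movq`, `pshufd` and `psrad`; the helper names (`movq_to_xmm`, `vector_sequence`, etc.) are made up for illustration and are not anything in LLVM. It is only a sketch to make the cost comparison concrete, not a statement about what the previous scalar code did.

```python
MASK32 = 0xFFFFFFFF

def movq_to_xmm(gpr64):
    # movq %r8, %xmm0: low 64 bits come from the GPR, upper 64 bits are zeroed.
    # The register is modelled as four 32-bit lanes, lane 0 being the lowest.
    return [gpr64 & MASK32, (gpr64 >> 32) & MASK32, 0, 0]

def pshufd(lanes, order):
    # pshufd: result lane i is source lane order[i].
    return [lanes[i] for i in order]

def psrad(lanes, count):
    # psrad: arithmetic right shift of every 32-bit lane.
    out = []
    for v in lanes:
        signed = v - (1 << 32) if v & 0x80000000 else v
        out.append((signed >> count) & MASK32)
    return out

def vector_sequence(r8):
    x = movq_to_xmm(r8)
    x = pshufd(x, [0, 1, 0, 1])  # broadcast the 64-bit value to both halves
    x = psrad(x, 31)             # every lane becomes its own sign fill
    x = pshufd(x, [1, 1, 3, 3])  # keep only the sign fill of the high dwords
    return x

if __name__ == "__main__":
    print([hex(v) for v in vector_sequence(0x8000000000000000)])
```

In this model every lane ends up holding the sign bit of `%r8` (bit 63) replicated 32 times, so the two `movdqa` stores write 32 bytes of that sign fill to `(%rdi)` and `16(%rdi)`.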

Can you help spot issues in the tests in `elementwise-store-of-scalar-splat.ll` / `subvectorwise-store-of-vector-splat.ll`?
If those do not have regressions, then we need to restrict some other fold.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141778/new/

https://reviews.llvm.org/D141778