[PATCH] D112464: [x86] limit vector increment fold to allow load folding

Mon Oct 25 13:18:07 PDT 2021

spatel added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:901
+      return X86::mayFoldLoad(N->getOperand(0)) && N->getOpcode() == ISD::ADD &&
+             Subtarget->hasAVX() && !N->getOperand(1).hasOneUse();
+    };
----------------
lebedev.ri wrote:
> I'm stuck on this one-use check. Is it really reasonable?
> We don't really know that the constant is used nearby, do we?
> There doesn't seem to be a test for it.
> https://godbolt.org/z/Ea9sad3Ya
Right, we don't know exactly where the other use of that constant is, or even whether it's part of another increment op, so this check is only a (weak) heuristic.

I'll add a test like the one you've shown. Without that check, we would alter codegen on it as shown below. It's a close call, but I think increasing the load uops on that is a regression, even if it is one fewer macro instruction.

```

diff --git a/llvm/test/CodeGen/X86/combine-sub.ll b/llvm/test/CodeGen/X86/combine-sub.ll
index a399c5175dd6..5090895c0ab8 100644
--- a/llvm/test/CodeGen/X86/combine-sub.ll
+++ b/llvm/test/CodeGen/X86/combine-sub.ll
@@ -290,9 +290,8 @@ define void @PR52032_oneuse_constant(<8 x i32>* %p) {
 ;
 ; AVX-LABEL: PR52032_oneuse_constant:
 ; AVX:       # %bb.0:
-; AVX-NEXT:    vmovdqu (%rdi), %ymm0
-; AVX-NEXT:    vpcmpeqd %ymm1, %ymm1, %ymm1
-; AVX-NEXT:    vpsubd %ymm1, %ymm0, %ymm0
+; AVX-NEXT:    vpbroadcastd {{.*#+}} ymm0 = [1,1,1,1,1,1,1,1]
+; AVX-NEXT:    vpaddd (%rdi), %ymm0, %ymm0
 ; AVX-NEXT:    vmovdqu %ymm0, (%rdi)
 ; AVX-NEXT:    vzeroupper
 ; AVX-NEXT:    retq

```
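
For reference, here is a rough standalone sketch of the predicate and of the arithmetic behind the two codegen forms. It is not the LLVM code: `IncrementInfo`, `keepAddForLoadFold`, `addSplatOne` and `subAllOnes` are hypothetical names, and the struct fields only stand in for the DAG queries quoted above. Per lane, adding 1 and subtracting all-ones (-1) give the same result, which is why both the `vpsubd` and `vpaddd` sequences are valid.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for the facts the quoted check reads from the node.
struct IncrementInfo {
  bool operand0IsFoldableLoad; // models X86::mayFoldLoad(N->getOperand(0))
  bool isAdd;                  // models N->getOpcode() == ISD::ADD
  bool hasAVX;                 // models Subtarget->hasAVX()
  unsigned constantUses;       // number of uses of the constant operand
};

// Mirrors the shape of the quoted predicate: true only when the load could
// fold into an AVX add and the constant is NOT single-use.
bool keepAddForLoadFold(const IncrementInfo &I) {
  return I.operand0IsFoldableLoad && I.isAdd && I.hasAVX &&
         I.constantUses != 1; // models !N->getOperand(1).hasOneUse()
}

using V8 = std::array<std::uint32_t, 8>; // models <8 x i32>

// x + splat(1), the form that can take a memory operand (vpaddd).
V8 addSplatOne(V8 V) {
  for (std::uint32_t &X : V)
    X += 1u;
  return V;
}

// x - splat(-1), the form built from an all-ones register (vpcmpeqd + vpsubd).
V8 subAllOnes(V8 V) {
  for (std::uint32_t &X : V)
    X -= 0xFFFFFFFFu; // unsigned wraparound makes this equal to X + 1
  return V;
}
```

With a single-use constant, as in the test above, the sketch's predicate is false, which corresponds to the existing `vpcmpeqd`/`vpsubd` output being kept.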


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112464/new/

https://reviews.llvm.org/D112464