[llvm] [Transform][LoadStoreVectorizer] allow redundant in Chain (PR #163019)

Drew Kersnar via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 17 13:01:39 PDT 2025


================
@@ -964,22 +977,24 @@ bool Vectorizer::vectorizeChain(Chain &C) {
 
     // Build the vector to store.
     Value *Vec = PoisonValue::get(VecTy);
-    unsigned VecIdx = 0;
-    auto InsertElem = [&](Value *V) {
+    auto InsertElem = [&](Value *V, unsigned VecIdx) {
       if (V->getType() != VecElemTy)
         V = Builder.CreateBitOrPointerCast(V, VecElemTy);
-      Vec = Builder.CreateInsertElement(Vec, V, Builder.getInt32(VecIdx++));
+      Vec = Builder.CreateInsertElement(Vec, V, Builder.getInt32(VecIdx));
     };
     for (const ChainElem &E : C) {
       auto *I = cast<StoreInst>(E.Inst);
+      int EOffset = (E.OffsetFromLeader - C[0].OffsetFromLeader).getSExtValue();
+      int VecIdx = 8 * EOffset / DL.getTypeSizeInBits(VecElemTy);
       if (FixedVectorType *VT =
               dyn_cast<FixedVectorType>(getLoadStoreType(I))) {
         for (int J = 0, JE = VT->getNumElements(); J < JE; ++J) {
           InsertElem(Builder.CreateExtractElement(I->getValueOperand(),
-                                                  Builder.getInt32(J)));
+                                                  Builder.getInt32(J)),
+                     VecIdx++);
         }
       } else {
-        InsertElem(I->getValueOperand());
+        InsertElem(I->getValueOperand(), VecIdx);
       }
     }
----------------
dakersnar wrote:

I tried to think through edge cases that might break this change, and I think it is correct, but I want to confirm my understanding. With this feature, a store chain can contain two stores to the same location that store different values. My read of your implementation is that each store gets its own InsertElement targeting the same vector element, so the value from the later of the two stores ends up in the vectorized store. Is that correct?

I also have an example in which a <2 x i32> vector store is vectorized together with an i32 scalar store, which this pass can do.

 store <2 x i32> <i32 -1, i32 -1>, ptr %p, align 8
 store i32 0, ptr %p, align 4
 
Is the following the output we expect with your change?

 store <2 x i32> <i32 0, i32 -1>, ptr %p, align 8
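
To make the question concrete, here is a minimal standalone sketch of the offset-based indexing as I read it from the diff. It is not the pass itself: `ModelStore` and `buildMergedVector` are hypothetical names, elements are fixed at 32 bits, poison lanes are modeled as 0, and it assumes the chain is visited in program order, which is exactly the part I would like confirmed.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one store in a chain: its byte offset from the
// chain leader plus the 32-bit elements it writes (one for a scalar store).
struct ModelStore {
  int64_t OffsetBytes;
  std::vector<int32_t> Elems;
};

// Mirrors the patched loop: each store's starting lane is derived from its
// byte offset, so a store that overlaps an earlier one overwrites its lanes.
// Assumes a non-empty chain whose stores all fit within NumElems lanes.
std::vector<int32_t> buildMergedVector(const std::vector<ModelStore> &Chain,
                                       unsigned NumElems) {
  constexpr int64_t ElemBits = 32;
  std::vector<int32_t> Vec(NumElems, 0); // 0 stands in for poison here
  const int64_t LeaderOffset = Chain.front().OffsetBytes;
  for (const ModelStore &S : Chain) {
    int64_t VecIdx = 8 * (S.OffsetBytes - LeaderOffset) / ElemBits;
    for (int32_t V : S.Elems)
      Vec[VecIdx++] = V; // later stores in the chain win
  }
  return Vec;
}
```

Under those assumptions, feeding it the two stores above (the <2 x i32> of -1 at offset 0, then the i32 0 at offset 0) yields lanes {0, -1}, which matches the vectorized store I am asking about.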



https://github.com/llvm/llvm-project/pull/163019

