[PATCH] D15690: Gather and Scatter intrinsics in the Loop Vectorizer

Tue Jan 26 05:20:31 PST 2016

delena marked 8 inline comments as done.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:2470-2493
@@ -2437,10 +2469,26 @@
+      assert(CreateGatherScatter  &&  "The instruction should be scalarized");
+      if (Gep) {
+        SmallVector<VectorParts, 4> OpsV;
+        for (Value *Op : Gep->operands()) {
+          if (PSE.getSE()->isLoopInvariant(PSE.getSCEV(Op), OrigLoop))
+            OpsV.push_back(VectorParts(UF, Op));
+          else
+            OpsV.push_back(getVectorValue(Op));
+        }
+
+        for (unsigned Part = 0; Part < UF; ++Part) {
+          SmallVector<Value*, 4> Ops;
+          for (unsigned i = 1; i < Gep->getNumOperands(); i++)
+            Ops.push_back(OpsV[i][Part]);
+          Value *GEPBasePtr = OpsV[0][Part];
+          Value *NewGep = Builder.CreateGEP(nullptr, GEPBasePtr, Ops,
+                                            "VectorGep");
+          assert(NewGep->getType()->isVectorTy() && "Expected vector GEP");
+          NewGep = Builder.CreateBitCast(NewGep,
+                                         VectorType::get(Ptr->getType(), VF));
+          VectorGep.push_back(NewGep);
+        }
+      } else
+        VectorGep = getVectorValue(Ptr);
     }
 
----------------
Ayal wrote:
> spatel wrote:
> > The function is getting too long / indented. I would prefer to see it broken up with helper functions.
> Please add a comment that when vectorizing Gep across UF parts, we want to keep each loop-invariant base or index of Gep scalar.
Maybe after the review; restructuring the function now would make the diff inconvenient for reviewers.
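
To make the requested comment concrete, here is a minimal standalone sketch of the operand handling in the hunk above. It is not LLVM code: `Operand`, `Parts`, `widenOperand`, and `buildVectorGeps` are hypothetical names, and strings stand in for `Value *`. It only models the rule that a loop-invariant base or index is reused as a scalar in every one of the UF parts, while a loop-varying operand gets its own widened value per part.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a GEP operand; hypothetical, not an LLVM class.
struct Operand {
  std::string Name;
  bool LoopInvariant;
};

// One entry per unrolled part, playing the role VectorParts has in the patch.
using Parts = std::vector<std::string>;

// A loop-invariant operand stays scalar and is repeated for every part;
// a loop-varying operand gets a distinct widened value for each part.
Parts widenOperand(const Operand &Op, unsigned UF) {
  Parts Result;
  for (unsigned Part = 0; Part < UF; ++Part)
    Result.push_back(Op.LoopInvariant
                         ? Op.Name
                         : "vec(" + Op.Name + ").part" + std::to_string(Part));
  return Result;
}

// Describes one GEP per part: operand 0 is the base, the rest are indices.
std::vector<std::string> buildVectorGeps(const std::vector<Operand> &GepOps,
                                         unsigned UF) {
  assert(!GepOps.empty() && "a GEP always has a base pointer operand");
  std::vector<Parts> OpsV;
  for (const Operand &Op : GepOps)
    OpsV.push_back(widenOperand(Op, UF));

  std::vector<std::string> Geps;
  for (unsigned Part = 0; Part < UF; ++Part) {
    std::string Gep = "gep " + OpsV[0][Part];
    for (std::size_t I = 1; I < OpsV.size(); ++I)
      Gep += ", " + OpsV[I][Part];
    Geps.push_back(Gep);
  }
  return Geps;
}
```

With an invariant base `%base`, a varying index `%i`, and UF=2, both parts share the scalar `%base` and differ only in the widened index. A helper of roughly this shape is also one way the function could be broken up once the review is done.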

================
Comment at: ../test/Transforms/LoopVectorize/X86/gather_scatter.ll:20-23
@@ +19,6 @@
+;AVX512-LABEL: @foo1
+;AVX512:  llvm.masked.load
+;AVX512: llvm.masked.gather
+;AVX512: llvm.masked.store
+;AVX512: ret void
+
----------------
spatel wrote:
> Please check the full text for any masked ops that are created in this test and others. We do not want to miss any bugs/regressions resulting from changes in the data types or number of instructions produced.
I can add the types, but the vectorization factor at this point is smaller than I want:
I get VF=8 instead of 16.
The problem is in the cost model. I tried to fix the cost model, but that patch was rejected.
So, for now I'm adding v8f32 to the checks and hope that the cost model will be fixed later.
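
For reference, the arithmetic behind the expected VF of 16 can be sketched as follows, assuming 32-bit float elements (as v8f32 implies) and a 512-bit AVX-512 vector register. `lanesPerRegister` is a hypothetical helper for illustration, not part of the cost model.

```cpp
#include <cassert>

// Number of elements that fit into one vector register.
// Hypothetical helper; it only illustrates the arithmetic.
unsigned lanesPerRegister(unsigned RegisterBits, unsigned ElementBits) {
  assert(ElementBits != 0 && RegisterBits % ElementBits == 0);
  return RegisterBits / ElementBits;
}
```

A 512-bit register holds 512 / 32 = 16 floats, which is the VF I was aiming for. The VF=8 I get matches the lane count of a 256-bit register, although that alone does not explain why the cost model picks it.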


http://reviews.llvm.org/D15690