[PATCH] D30710: [LV] Vectorize GEPs

Matthew Simpson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 20 04:40:21 PDT 2017

mssimpso added a comment.

In https://reviews.llvm.org/D30710#705027, @mkuper wrote:

> Is there a test that actually stores a vector of pointers?

I'll add one and update the patch. Thanks, Michael!

Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4710
+      if (Legal->isUniform(GEP)) {
+        auto *Clone = cast<GetElementPtrInst>(GEP->clone());
+        for (unsigned I = 0; I < GEP->getNumOperands(); ++I)
delena wrote:
> mssimpso wrote:
> > mssimpso wrote:
> > > delena wrote:
> > > > GEP is uniform when the memory instruction (User) is uniform, right?
> > > > Why do you need to broadcast it?
> > > This is "loop-invariant" in the LoopAccessInfo::isUniform sense, not in the LoopVectorizationCostModel::isUniformAfterVectorization sense (we really should come up with some better names for these concepts). Sometimes we end up with GEPs contained in the loop body that have loop-invariant operands. I'm not sure why these GEPs aren't hoisted out of the loop before the vectorizer runs.
> > > 
> > > In theory, we should be able to hoist these GEPs out of the loop ourselves, but we have assumptions elsewhere that if an instruction existed in the original loop body, it will map to something inside the vectorized loop body. So I just clone and broadcast the original GEP inside the loop here. The change to the first-order-recurrence.ll test case is reflective of this.
> > I just thought about a different way to implement this. Instead of the uniform check, we could check if the value returned by the IRBuilder has a vector type, and if not, do the broadcast. If all the operands are loop-invariant in the code below, the IRBuilder will return a scalar GEP. This will probably be fewer lines of code anyway. Let me give it a shot.
> Do you have any real case with a loop-invariant GEP? Do you mean the last test case from first-order-recurrence.ll:
>   define void @PR29559() {
>   entry:
>     br label %scalar.body
>   scalar.body:
>     %i = phi i64 [ 0, %entry ], [ %i.next, %scalar.body ]
>     %tmp2 = phi float* [ undef, %entry ], [ %tmp3, %scalar.body ]
>     %tmp3 = getelementptr inbounds [3 x float], [3 x float]* undef, i64 0, i64 0
>     %i.next = add nuw nsw i64 %i, 1
>     %cond = icmp eq i64 %i.next, undef
>     br i1 %cond, label %for.end, label %scalar.body
>   for.end:
>     ret void
>   }
> The %tmp2 and %tmp3 should be scalar.
> In my understanding, if all operands of the GEP are loop invariant, the Load/Store is uniform. I just do not understand in which cases we'll need to broadcast the GEP.
I should add a test for the broadcast case to make this clear, shouldn't I? If we have a GEP like %tmp3 in the example you pasted that was stored to memory with a vector store, we would need a vector version of it, and it wouldn't be "uniform-after-vectorization". It's still "uniform" in the LAA sense because it has loop-invariant operands. Because of this, IRBuilder will give us a scalar GEP that we will then broadcast. I'll add the test and update.
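
To make the "scalar result, then broadcast" idea concrete, here is a rough toy model in plain C++. It is only a sketch and deliberately does not use the LLVM API: Value, createGEP, broadcast, and widenGEP are hypothetical stand-ins, and it assumes a vectorization factor greater than 1. It mimics the behavior described above, where building a GEP from all-scalar (loop-invariant) operands yields a scalar, which is then splatted across the lanes.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy stand-in for an IR value: one lane means scalar, several lanes mean vector.
struct Value {
  std::vector<long> Lanes;
  bool isVector() const { return Lanes.size() > 1; }
};

// Hypothetical analogue of a vector splat: copy a scalar into VF lanes.
Value broadcast(const Value &Scalar, std::size_t VF) {
  return Value{std::vector<long>(VF, Scalar.Lanes[0])};
}

// Hypothetical analogue of building a GEP (address = base + index).
// The result is a vector only if at least one operand is a vector;
// all-scalar operands produce a scalar result.
Value createGEP(const Value &Base, const Value &Index) {
  std::size_t N = std::max(Base.Lanes.size(), Index.Lanes.size());
  Value Result;
  for (std::size_t I = 0; I < N; ++I) {
    long B = Base.Lanes[Base.isVector() ? I : 0];
    long Idx = Index.Lanes[Index.isVector() ? I : 0];
    Result.Lanes.push_back(B + Idx);
  }
  return Result;
}

// Build the GEP first, then broadcast only if the result came back scalar.
// No separate "is this GEP uniform?" query is needed.
Value widenGEP(const Value &Base, const Value &Index, std::size_t VF) {
  Value Ptr = createGEP(Base, Index);
  return Ptr.isVector() ? Ptr : broadcast(Ptr, VF);
}
```

In this model, calling widenGEP with a scalar base and a scalar index returns VF identical lanes, while a vector index returns one distinct address per lane without any extra broadcast.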

