[PATCH] D21620: [LV] Don't widen trivial induction variables

Matthew Simpson via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 22 18:17:18 PDT 2016


mssimpso added inline comments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2189
@@ +2188,3 @@
+
+  // We can't create a vector with less than two elements.
+  assert(VF > 1 && "VF should be greater than one");
----------------
mkuper wrote:
> mssimpso wrote:
> > mkuper wrote:
> > > This looks a bit weird to me.Do you have a test-case where this works, but instcombine fails to simplify the regular step vector?
> > > Perhaps it would be better to fix instcombine, I can look into it.
> > > 
> > > In any case, that shouldn't block the rest of the patch - (a) not widening the phi when the uses are trivial, and (b) using the scalarized step vector instead of the regular one should be independently good, right?
> > > This looks a bit weird to me.Do you have a test-case where this works, but instcombine fails to simplify the regular step vector?
> > 
> > Sure. Here's a simple example with interleaved access vectorization enabled.
> > 
> > In the test case below, if we use the regular step vector, we end up creating unneeded splat inserts and extracts inside the loop. A scalar induction variable is definitely preferable here since we don't do anything exciting with it.
> > 
> > ```
> > 
> > ; opt -S < %s -loop-vectorize -force-vector-width=2 -force-vector-interleave=2 -enable-interleaved-mem-accesses -instcombine
> > target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
> > target triple = "aarch64--linux-gnu"
> > 
> > %pair = type { i64, i64 }
> > define void @interleaved(%pair *%p, i64 %y, i64 %n) {
> > entry:
> >   br label %for.body
> > 
> > for.body:
> >   %i  = phi i64 [ %i.next, %for.body ], [ 0, %entry ]
> >   %f1 = getelementptr inbounds %pair, %pair* %p, i64 %i, i32 1
> >   %r0 = load i64, i64* %f1, align 8
> >   %r3 = xor i64 %r0, %y
> >   store i64 %r3, i64* %f1, align 8
> >   %i.next = add nuw nsw i64 %i, 1
> >   %cond = icmp slt i64 %i.next, %n
> >   br i1 %cond, label %for.body, label %for.end
> > 
> > for.end:
> >   ret void
> > }
> > ```
> > 
> > I've seen other cases with what I've been working on that lead me to believe fixing this kind of thing in InstCombine is not going to be that fruitful. Things get very ugly when interleaved access vectorization is enabled with conditional stores vectorization, for example. I think it's better to do the right thing at the outset.
> Sorry, I wasn't very clear.
> 
> I didn't mean an example of when a scalar IV is preferable. You're completely right about that.
> I meant a case where, in addition to the scalar IV, we want the pre-scalarized step vector, because the vectorized one doesn't simplify. Or is this the same example?
No, you were clear. This is the same example -- it doesn't simplify. Without the pre-scalarized step vector, we're left with the splats, extracts, etc. I will add it as a test case to the patch.


http://reviews.llvm.org/D21620





More information about the llvm-commits mailing list