<div dir="ltr">Sure, I'll take a look, thanks!<div><br></div><div>Michael</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jun 3, 2016 at 4:12 AM, Arnaud De Grandmaison <span dir="ltr"><<a href="mailto:Arnaud.DeGrandmaison@arm.com" target="_blank">Arnaud.DeGrandmaison@arm.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Michael,<br>

<br>

We have seen significant performance regressions on some widely used industry benchmark.<br>

<br>

I've filed <a href="https://llvm.org/bugs/show_bug.cgi?id=27988" rel="noreferrer" target="_blank">https://llvm.org/bugs/show_bug.cgi?id=27988</a> with a reproducer.<br>

<br>

As described in the Bugzilla ticket, the generated code is poor on both x86_64 and aarch64: neither backend does a proper job with the IV.<br>

<br>

Could you take a look at it?<br>

<br>

Kind regards,<br>

Arnaud<br>

<div><div class="h5"><br>

> -----Original Message-----<br>

> From: llvm-commits [mailto:<a href="mailto:llvm-commits-bounces@lists.llvm.org">llvm-commits-bounces@lists.llvm.org</a>] On Behalf<br>

> Of Michael Kuperstein via llvm-commits<br>

> Sent: 01 June 2016 19:17<br>

> To: <a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>

> Subject: [llvm] r271410 - [LV] For some IVs, use vector phis instead of<br>

> widening in the loop body<br>

><br>

> Author: mkuper<br>

> Date: Wed Jun  1 12:16:46 2016<br>

> New Revision: 271410<br>

><br>

> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=271410&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=271410&view=rev</a><br>

> Log:<br>

> [LV] For some IVs, use vector phis instead of widening in the loop body<br>

><br>

> Previously, whenever we needed a vector IV, we would create it on the fly,<br>

> by splatting the scalar IV and adding a step vector. Instead, we can create a<br>

> real vector IV. This tends to save a couple of instructions per iteration.<br>

><br>

> This only changes the behavior for the most basic case - integer primary IVs<br>

> with a constant step.<br>

><br>

> Differential Revision: <a href="http://reviews.llvm.org/D20315" rel="noreferrer" target="_blank">http://reviews.llvm.org/D20315</a><br>

><br>

> Modified:<br>

>     llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp<br>

>     llvm/trunk/test/Transforms/LoopVectorize/PowerPC/vsx-tsvc-s173.ll<br>

>     llvm/trunk/test/Transforms/LoopVectorize/X86/gather_scatter.ll<br>

>     llvm/trunk/test/Transforms/LoopVectorize/cast-induction.ll<br>

>     llvm/trunk/test/Transforms/LoopVectorize/gcc-examples.ll<br>

>     llvm/trunk/test/Transforms/LoopVectorize/gep_with_bitcast.ll<br>

>     llvm/trunk/test/Transforms/LoopVectorize/global_alias.ll<br>

>     llvm/trunk/test/Transforms/LoopVectorize/induction.ll<br>

>     llvm/trunk/test/Transforms/LoopVectorize/induction_plus.ll<br>

><br>

> Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp<br>

> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=271410&r1=271409&r2=271410&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=271410&r1=271409&r2=271410&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp (original)<br>

</div></div>> +++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Wed Jun  1 12:16:46 2016<br>

<span class="">> @@ -422,6 +422,14 @@ protected:<br>

>    /// from SCEV or creates a new using SCEVExpander.<br>

>    virtual Value *getStepVector(Value *Val, int StartIdx, const SCEV *Step);<br>

><br>

> +  /// Create a vector induction variable based on an existing scalar one.<br>

> +  /// Currently only works for integer primary induction variables with<br>

> +  /// a constant step.<br>

> +  /// If TruncType is provided, instead of widening the original IV, we<br>

> +  /// widen a version of the IV truncated to TruncType.<br>

> +  void widenInductionVariable(const InductionDescriptor &II, VectorParts &Entry,<br>

> +                              IntegerType *TruncType = nullptr);<br>

> +<br>

>    /// When we go over instructions in the basic block we rely on previous<br>

>    /// values within the current basic block or on loop invariant values.<br>

>    /// When we widen (vectorize) values we place them in the map. If the values<br>

> @@ -2099,6 +2107,40 @@ Value *InnerLoopVectorizer::getStepVecto<br>

>    return getStepVector(Val, StartIdx, StepValue);<br>

>  }<br>

><br>

> +void InnerLoopVectorizer::widenInductionVariable(const InductionDescriptor &II,<br>

> +                                                 VectorParts &Entry,<br>

</span>> +                                                 IntegerType *TruncType) {<br>

<span class="">> +  Value *Start = II.getStartValue();<br>

> +  ConstantInt *Step = II.getConstIntStepValue();<br>

> +  assert(Step && "Can not widen an IV with a non-constant step");<br>

> +<br>

> +  // Construct the initial value of the vector IV in the vector loop preheader<br>

> +  auto CurrIP = Builder.saveIP();<br>

> +  Builder.SetInsertPoint(LoopVectorPreHeader->getTerminator());<br>

> +  if (TruncType) {<br>

> +    Step = ConstantInt::getSigned(TruncType, Step->getSExtValue());<br>

> +    Start = Builder.CreateCast(Instruction::Trunc, Start, TruncType);<br>

> +  }<br>

> +  Value *SplatStart = Builder.CreateVectorSplat(VF, Start);<br>

> +  Value *SteppedStart = getStepVector(SplatStart, 0, Step);<br>

> +  Builder.restoreIP(CurrIP);<br>

> +<br>

> +  Value *SplatVF =<br>

> +      ConstantVector::getSplat(VF, ConstantInt::get(Start->getType(), VF));<br>

> +  // We may need to add the step a number of times, depending on the unroll<br>

> +  // factor. The last of those goes into the PHI.<br>

> +  PHINode *VecInd = PHINode::Create(SteppedStart->getType(), 2, "vec.ind",<br>

> +                                    &*LoopVectorBody->getFirstInsertionPt());<br>

> +  Value *LastInduction = VecInd;<br>

> +  for (unsigned Part = 0; Part < UF; ++Part) {<br>

> +    Entry[Part] = LastInduction;<br>

> +    LastInduction = Builder.CreateAdd(LastInduction, SplatVF, "step.add");<br>

> +  }<br>

> +<br>

> +  VecInd->addIncoming(SteppedStart, LoopVectorPreHeader);<br>

> +  VecInd->addIncoming(LastInduction, LoopVectorBody);<br>

> +}<br>

> +<br>

</span><div><div class="h5">>  Value *InnerLoopVectorizer::getStepVector(Value *Val, int StartIdx,<br>

>                                            Value *Step) {<br>

>    assert(Val->getType()->isVectorTy() && "Must be a vector");<br>

> @@ -4056,19 +4098,25 @@ void InnerLoopVectorizer::widenPHIInstru<br>

>      llvm_unreachable("Unknown induction");<br>

>    case InductionDescriptor::IK_IntInduction: {<br>

>      assert(P->getType() == II.getStartValue()->getType() && "Types must match");<br>

> -    // Handle other induction variables that are now based on the<br>

> -    // canonical one.<br>

> -    Value *V = Induction;<br>

> -    if (P != OldInduction) {<br>

> -      V = Builder.CreateSExtOrTrunc(Induction, P->getType());<br>

> -      V = II.transform(Builder, V, PSE.getSE(), DL);<br>

> -      V->setName("offset.idx");<br>

> -    }<br>

> -    Value *Broadcasted = getBroadcastInstrs(V);<br>

> -    // After broadcasting the induction variable we need to make the vector<br>

> -    // consecutive by adding 0, 1, 2, etc.<br>

> -    for (unsigned part = 0; part < UF; ++part)<br>

> -      Entry[part] = getStepVector(Broadcasted, VF * part, II.getStep());<br>

> +    if (P != OldInduction || VF == 1) {<br>

> +      Value *V = Induction;<br>

> +      // Handle other induction variables that are now based on the<br>

> +      // canonical one.<br>

> +      if (P != OldInduction) {<br>

> +        V = Builder.CreateSExtOrTrunc(Induction, P->getType());<br>

> +        V = II.transform(Builder, V, PSE.getSE(), DL);<br>

> +        V->setName("offset.idx");<br>

> +      }<br>

> +      Value *Broadcasted = getBroadcastInstrs(V);<br>

> +      // After broadcasting the induction variable we need to make the vector<br>

> +      // consecutive by adding 0, 1, 2, etc.<br>

> +      for (unsigned part = 0; part < UF; ++part)<br>

> +        Entry[part] = getStepVector(Broadcasted, VF * part, II.getStep());<br>

> +    } else {<br>

> +      // Instead of re-creating the vector IV by splatting the scalar IV<br>

> +      // in each iteration, we can make a new independent vector IV.<br>

> +      widenInductionVariable(II, Entry);<br>

> +    }<br>

>      return;<br>

>    }<br>

>    case InductionDescriptor::IK_PtrInduction:<br>

> @@ -4239,15 +4287,23 @@ void InnerLoopVectorizer::vectorizeBlock<br>

>        if (CI->getOperand(0) == OldInduction &&<br>

>            it->getOpcode() == Instruction::Trunc) {<br>

>          InductionDescriptor II =<br>

> -          Legal->getInductionVars()->lookup(OldInduction);<br>

> +            Legal->getInductionVars()->lookup(OldInduction);<br>

>          if (auto StepValue = II.getConstIntStepValue()) {<br>

> -          StepValue = ConstantInt::getSigned(cast<IntegerType>(CI->getType()),<br>

> -                                             StepValue->getSExtValue());<br>

> -          Value *ScalarCast = Builder.CreateCast(CI->getOpcode(), Induction,<br>

> -                                                 CI->getType());<br>

> -          Value *Broadcasted = getBroadcastInstrs(ScalarCast);<br>

> -          for (unsigned Part = 0; Part < UF; ++Part)<br>

> -            Entry[Part] = getStepVector(Broadcasted, VF * Part, StepValue);<br>

> +          IntegerType *TruncType = cast<IntegerType>(CI->getType());<br>

> +          if (VF == 1) {<br>

> +            StepValue =<br>

> +                ConstantInt::getSigned(TruncType, StepValue->getSExtValue());<br>

> +            Value *ScalarCast =<br>

> +                Builder.CreateCast(CI->getOpcode(), Induction, CI->getType());<br>

> +            Value *Broadcasted = getBroadcastInstrs(ScalarCast);<br>

> +            for (unsigned Part = 0; Part < UF; ++Part)<br>

> +              Entry[Part] = getStepVector(Broadcasted, VF * Part, StepValue);<br>

> +          } else {<br>

> +            // Truncating a vector induction variable on each iteration<br>

> +            // may be expensive. Instead, truncate the initial value, and create<br>

> +            // a new, truncated, vector IV based on that.<br>

> +            widenInductionVariable(II, Entry, TruncType);<br>

> +          }<br>

>            addMetadata(Entry, &*it);<br>

>            break;<br>

>          }<br>

><br>

> Modified: llvm/trunk/test/Transforms/LoopVectorize/PowerPC/vsx-tsvc-s173.ll<br>

> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/PowerPC/vsx-tsvc-s173.ll?rev=271410&r1=271409&r2=271410&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/PowerPC/vsx-tsvc-s173.ll?rev=271410&r1=271409&r2=271410&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/test/Transforms/LoopVectorize/PowerPC/vsx-tsvc-s173.ll (original)<br>

</div></div>> +++ llvm/trunk/test/Transforms/LoopVectorize/PowerPC/vsx-tsvc-s173.ll Wed Jun  1 12:16:46 2016<br>

<span class="">> @@ -43,7 +43,7 @@ for.end12:<br>

><br>

>  ; CHECK-LABEL: @s173<br>

>  ; CHECK: load <4 x float>, <4 x float>*<br>

> -; CHECK: add i64 %index, 16000<br>

> +; CHECK: add nsw i64 %.lhs, 16000<br>

>  ; CHECK: ret i32 0<br>

>  }<br>

><br>

><br>

> Modified: llvm/trunk/test/Transforms/LoopVectorize/X86/gather_scatter.ll<br>

> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/gather_scatter.ll?rev=271410&r1=271409&r2=271410&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/X86/gather_scatter.ll?rev=271410&r1=271409&r2=271410&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/test/Transforms/LoopVectorize/X86/gather_scatter.ll (original)<br>

</span>> +++ llvm/trunk/test/Transforms/LoopVectorize/X86/gather_scatter.ll Wed Jun  1 12:16:46 2016<br>

<span class="">> @@ -95,7 +95,7 @@ for.end:<br>

>  %struct.In = type { float, float }<br>

><br>

>  ;AVX512-LABEL: @foo2<br>

> -;AVX512: getelementptr %struct.In, %struct.In* %in, <16 x i64> %induction, i32 1<br>

</span>> +;AVX512: getelementptr %struct.In, %struct.In* %in, <16 x i64> %{{.*}}, i32 1<br>

<span class="">>  ;AVX512: llvm.masked.gather.v16f32<br>

>  ;AVX512: llvm.masked.store.v16f32<br>

>  ;AVX512: ret void<br>

> @@ -170,10 +170,10 @@ for.end:<br>

>  ;}<br>

><br>

>  ;AVX512-LABEL: @foo3<br>

> -;AVX512: getelementptr %struct.In, %struct.In* %in, <16 x i64> %induction, i32 1<br>

</span>> +;AVX512: getelementptr %struct.In, %struct.In* %in, <16 x i64> %{{.*}}, i32 1<br>

<span class="">>  ;AVX512: llvm.masked.gather.v16f32<br>

>  ;AVX512: fadd <16 x float><br>

> -;AVX512: getelementptr %struct.Out, %struct.Out* %out, <16 x i64> %induction, i32 1<br>

</span>> +;AVX512: getelementptr %struct.Out, %struct.Out* %out, <16 x i64> %{{.*}}, i32 1<br>

<span class="">>  ;AVX512: llvm.masked.scatter.v16f32<br>

>  ;AVX512: ret void<br>

><br>

><br>

> Modified: llvm/trunk/test/Transforms/LoopVectorize/cast-induction.ll<br>

> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/cast-induction.ll?rev=271410&r1=271409&r2=271410&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/cast-induction.ll?rev=271410&r1=271409&r2=271410&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/test/Transforms/LoopVectorize/cast-induction.ll (original)<br>

</span>> +++ llvm/trunk/test/Transforms/LoopVectorize/cast-induction.ll Wed Jun  1 12:16:46 2016<br>

<span class="">> @@ -8,7 +8,7 @@ target triple = "x86_64-apple-macosx10.8<br>

>  @a = common global [2048 x i32] zeroinitializer, align 16<br>

><br>

>  ;CHECK-LABEL: @example12(<br>

> -;CHECK: trunc i64<br>

> +;CHECK: %vec.ind1 = phi <4 x i32><br>

>  ;CHECK: store <4 x i32><br>

>  ;CHECK: ret void<br>

>  define void @example12() nounwind uwtable ssp {<br>

><br>

> Modified: llvm/trunk/test/Transforms/LoopVectorize/gcc-examples.ll<br>

> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/gcc-examples.ll?rev=271410&r1=271409&r2=271410&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/gcc-examples.ll?rev=271410&r1=271409&r2=271410&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/test/Transforms/LoopVectorize/gcc-examples.ll (original)<br>

</span>> +++ llvm/trunk/test/Transforms/LoopVectorize/gcc-examples.ll Wed Jun  1 12:16:46 2016<br>

<span class="">> @@ -368,7 +368,7 @@ define void @example11() nounwind uwtabl<br>

>  }<br>

><br>

>  ;CHECK-LABEL: @example12(<br>

> -;CHECK: trunc i64<br>

> +;CHECK: %vec.ind1 = phi <4 x i32><br>

>  ;CHECK: store <4 x i32><br>

>  ;CHECK: ret void<br>

>  define void @example12() nounwind uwtable ssp {<br>

><br>

> Modified: llvm/trunk/test/Transforms/LoopVectorize/gep_with_bitcast.ll<br>

> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/gep_with_bitcast.ll?rev=271410&r1=271409&r2=271410&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/gep_with_bitcast.ll?rev=271410&r1=271409&r2=271410&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/test/Transforms/LoopVectorize/gep_with_bitcast.ll (original)<br>

</span>> +++ llvm/trunk/test/Transforms/LoopVectorize/gep_with_bitcast.ll Wed Jun  1 12:16:46 2016<br>

<span class="">> @@ -12,10 +12,11 @@ target datalayout = "e-m:e-i64:64-i128:1<br>

><br>

>  ; CHECK-LABEL: @foo<br>

>  ; CHECK: vector.body<br>

> -; CHECK:  %0 = getelementptr inbounds double*, double** %in, i64 %index<br>

> -; CHECK:  %1 = bitcast double** %0 to <4 x i64>*<br>

> -; CHECK:  %wide.load = load <4 x i64>, <4 x i64>* %1, align 8<br>

> -; CHECK:  %2 = icmp eq <4 x i64> %wide.load, zeroinitializer<br>

> +; CHECK:  %0 = phi<br>

</span>> +; CHECK:  %2 = getelementptr inbounds double*, double** %in, i64 %0<br>

> +; CHECK:  %3 = bitcast double** %2 to <4 x i64>*<br>

> +; CHECK:  %wide.load = load <4 x i64>, <4 x i64>* %3, align 8<br>

> +; CHECK:  %4 = icmp eq <4 x i64> %wide.load, zeroinitializer<br>

<span class="">>  ; CHECK:  br i1<br>

><br>

>  define void @foo(double** noalias nocapture readonly %in, double** noalias nocapture readnone %out, i8* noalias nocapture %res) #0 {<br>

> @@ -37,4 +38,4 @@ for.body:<br>

><br>

>  for.end:<br>

>    ret void<br>

> -}<br>

> \ No newline at end of file<br>

> +}<br>

><br>

> Modified: llvm/trunk/test/Transforms/LoopVectorize/global_alias.ll<br>

> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/global_alias.ll?rev=271410&r1=271409&r2=271410&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/global_alias.ll?rev=271410&r1=271409&r2=271410&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/test/Transforms/LoopVectorize/global_alias.ll (original)<br>

</span>> +++ llvm/trunk/test/Transforms/LoopVectorize/global_alias.ll Wed Jun  1 12:16:46 2016<br>

<span class="">> @@ -12,7 +12,7 @@ target datalayout = "e-p:32:32:32-i1:8:8<br>

>  @PA = external global i32*<br>

><br>

><br>

> -;; === First, the tests that should always vectorize, wither statically or by adding run-time checks ===<br>

</span>> +;; === First, the tests that should always vectorize, whether statically or by adding run-time checks ===<br>

<div><div class="h5">><br>

><br>

>  ; /// Different objects, positive induction, constant distance<br>

> @@ -387,7 +387,7 @@ for.end:<br>

>  ;   return Foo.A[a];<br>

>  ; }<br>

>  ; CHECK-LABEL: define i32 @noAlias08(<br>

> -; CHECK: sub <4 x i32><br>

> +; CHECK: sub nuw nsw <4 x i32><br>

>  ; CHECK: ret<br>

><br>

>  define i32 @noAlias08(i32 %a) #0 {<br>

> @@ -439,7 +439,7 @@ for.end:<br>

>  ;   return Foo.A[a];<br>

>  ; }<br>

>  ; CHECK-LABEL: define i32 @noAlias09(<br>

> -; CHECK: sub <4 x i32><br>

> +; CHECK: sub nuw nsw <4 x i32><br>

>  ; CHECK: ret<br>

><br>

>  define i32 @noAlias09(i32 %a) #0 {<br>

> @@ -721,7 +721,7 @@ for.end:<br>

>  ;   return Foo.A[a];<br>

>  ; }<br>

>  ; CHECK-LABEL: define i32 @noAlias14(<br>

> -; CHECK: sub <4 x i32><br>

> +; CHECK: sub nuw nsw <4 x i32><br>

>  ; CHECK: ret<br>

><br>

>  define i32 @noAlias14(i32 %a) #0 {<br>

><br>

> Modified: llvm/trunk/test/Transforms/LoopVectorize/induction.ll<br>

> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/induction.ll?rev=271410&r1=271409&r2=271410&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/induction.ll?rev=271410&r1=271409&r2=271410&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/test/Transforms/LoopVectorize/induction.ll (original)<br>

</div></div>> +++ llvm/trunk/test/Transforms/LoopVectorize/induction.ll Wed Jun  1 12:16:46 2016<br>

<span class="">> @@ -1,4 +1,6 @@<br>

>  ; RUN: opt < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=2 -S | FileCheck %s<br>

> +; RUN: opt < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=2 -instcombine -S | FileCheck %s --check-prefix=IND<br>

</span>> +; RUN: opt < %s -loop-vectorize -force-vector-interleave=2 -force-vector-width=2 -instcombine -S | FileCheck %s --check-prefix=UNROLL<br>

<span class="">><br>

>  target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"<br>

><br>

> @@ -27,8 +29,6 @@ for.end:<br>

>    ret void<br>

>  }<br>

><br>

> -; RUN: opt < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=2 -instcombine -S | FileCheck %s --check-prefix=IND<br>

> -<br>

>  ; Make sure we remove unneeded vectorization of induction variables.<br>

>  ; In order for instcombine to cleanup the vectorized induction variables that we<br>

>  ; create in the loop vectorizer we need to perform some form of redundancy<br>

> @@ -241,3 +241,64 @@ entry:<br>

>   exit:<br>

>    ret void<br>

>  }<br>

> +<br>

</span>> +; Check that we generate vectorized IVs in the pre-header<br>

> +; instead of widening the scalar IV inside the loop, when<br>

> +; we know how to do that.<br>

> +; IND-LABEL: veciv<br>

> +; IND: vector.body:<br>

> +; IND: %index = phi i32 [ 0, %vector.ph ], [ %index.next, %vector.body ]<br>

> +; IND: %vec.ind = phi <2 x i32> [ <i32 0, i32 1>, %vector.ph ], [ %step.add, %vector.body ]<br>

> +; IND: %step.add = add <2 x i32> %vec.ind, <i32 2, i32 2><br>

> +; IND: %index.next = add i32 %index, 2<br>

> +; IND: %[[CMP:.*]] = icmp eq i32 %index.next<br>

> +; IND: br i1 %[[CMP]]<br>

> +; UNROLL-LABEL: veciv<br>

> +; UNROLL: vector.body:<br>

> +; UNROLL: %index = phi i32 [ 0, %vector.ph ], [ %index.next, %vector.body ]<br>

> +; UNROLL: %vec.ind = phi <2 x i32> [ <i32 0, i32 1>, %vector.ph ], [ %step.add1, %vector.body ]<br>

> +; UNROLL: %step.add = add <2 x i32> %vec.ind, <i32 2, i32 2><br>

> +; UNROLL: %step.add1 = add <2 x i32> %vec.ind, <i32 4, i32 4><br>

> +; UNROLL: %index.next = add i32 %index, 4<br>

> +; UNROLL: %[[CMP:.*]] = icmp eq i32 %index.next<br>

> +; UNROLL: br i1 %[[CMP]]<br>

<span class="">> +define void @veciv(i32* nocapture %a, i32 %start, i32 %k) {<br>

> +for.body.preheader:<br>

> +  br label %for.body<br>

> +<br>

> +for.body:<br>

</span>> +  %indvars.iv = phi i32 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]<br>

<span class="">> +  %arrayidx = getelementptr inbounds i32, i32* %a, i32 %indvars.iv<br>

> +  store i32 %indvars.iv, i32* %arrayidx, align 4<br>

> +  %indvars.iv.next = add nuw nsw i32 %indvars.iv, 1<br>

> +  %exitcond = icmp eq i32 %indvars.iv.next, %k<br>

> +  br i1 %exitcond, label %exit, label %for.body<br>

> +<br>

> +exit:<br>

> +  ret void<br>

> +}<br>

> +<br>

> +; IND-LABEL: trunciv<br>

> +; IND: vector.body:<br>

> +; IND: %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]<br>

</span>> +; IND: %[[VECIND:.*]] = phi <2 x i32> [ <i32 0, i32 1>, %vector.ph ], [ %[[STEPADD:.*]], %vector.body ]<br>

> +; IND: %[[STEPADD]] = add <2 x i32> %[[VECIND]], <i32 2, i32 2><br>

> +; IND: %index.next = add i64 %index, 2<br>

> +; IND: %[[CMP:.*]] = icmp eq i64 %index.next<br>

> +; IND: br i1 %[[CMP]]<br>

> +define void @trunciv(i32* nocapture %a, i32 %start, i64 %k) {<br>

<span class="">> +for.body.preheader:<br>

> +  br label %for.body<br>

> +<br>

> +for.body:<br>

</span>> +  %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]<br>

<span class="">> +  %trunc.iv = trunc i64 %indvars.iv to i32<br>

> +  %arrayidx = getelementptr inbounds i32, i32* %a, i32 %trunc.iv<br>

> +  store i32 %trunc.iv, i32* %arrayidx, align 4<br>

> +  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1<br>

> +  %exitcond = icmp eq i64 %indvars.iv.next, %k<br>

> +  br i1 %exitcond, label %exit, label %for.body<br>

> +<br>

> +exit:<br>

> +  ret void<br>

> +}<br>

><br>

> Modified: llvm/trunk/test/Transforms/LoopVectorize/induction_plus.ll<br>

> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/induction_plus.ll?rev=271410&r1=271409&r2=271410&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/induction_plus.ll?rev=271410&r1=271409&r2=271410&view=diff</a><br>

> ==============================================================================<br>

> --- llvm/trunk/test/Transforms/LoopVectorize/induction_plus.ll (original)<br>

</span>> +++ llvm/trunk/test/Transforms/LoopVectorize/induction_plus.ll Wed Jun  1 12:16:46 2016<br>

<span class="">> @@ -1,4 +1,4 @@<br>

> -; RUN: opt < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -instcombine -S | FileCheck %s<br>

</span>> +; RUN: opt < %s -loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -S | FileCheck %s<br>

<span class="">><br>

>  target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"<br>

>  target triple = "x86_64-apple-macosx10.8.0"<br>

> @@ -6,8 +6,11 @@ target triple = "x86_64-apple-macosx10.8<br>

>  @array = common global [1024 x i32] zeroinitializer, align 16<br>

><br>

>  ;CHECK-LABEL: @array_at_plus_one(<br>

> -;CHECK: add i64 %index, 12<br>

> -;CHECK: trunc i64<br>

> +;CHECK: %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]<br>

> +;CHECK: %vec.ind = phi <4 x i64> [ <i64 0, i64 1, i64 2, i64 3>, %vector.ph ], [ %step.add, %vector.body ]<br>

</span>> +;CHECK: %vec.ind1 = phi <4 x i32> [ <i32 0, i32 1, i32 2, i32 3>, %vector.ph ], [ %step.add2, %vector.body ]<br>

<div class="HOEnZb"><div class="h5">> +;CHECK: add <4 x i64> %vec.ind, <i64 4, i64 4, i64 4, i64 4><br>

> +;CHECK: add nsw <4 x i64> %vec.ind, <i64 12, i64 12, i64 12, i64 12><br>

>  ;CHECK: ret i32<br>

>  define i32 @array_at_plus_one(i32 %n) nounwind uwtable ssp {<br>

>    %1 = icmp sgt i32 %n, 0<br>

><br>

><br>

> _______________________________________________<br>

> llvm-commits mailing list<br>

> <a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>

</div></div></blockquote></div><br></div>
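<div dir="ltr">For readers of the archive: the commit log quoted above contrasts two ways of producing a vector IV. The following is only a rough sketch in plain C++, with no LLVM APIs involved; the names <code>Lanes</code>, <code>widenBySplat</code>, <code>VectorIV</code> and <code>strategiesAgree</code> are hypothetical and do not appear in the patch. It models a vector as VF integer lanes and checks that re-creating the vector from the scalar IV on every iteration yields the same lanes as carrying a vector value across iterations, assuming an integer IV with a constant step and ignoring the unroll factor (UF).</div>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One "vector register", modelled as VF lanes of 64-bit integers.
// (Hypothetical helper type for illustration only.)
using Lanes = std::vector<int64_t>;

// Old strategy: on every iteration, broadcast (splat) the scalar IV and add
// the step vector <0, 1, ..., VF-1> * Step.
Lanes widenBySplat(int64_t ScalarIV, unsigned VF, int64_t Step) {
  Lanes V(VF, ScalarIV);  // splat
  for (unsigned L = 0; L < VF; ++L)
    V[L] += static_cast<int64_t>(L) * Step;  // add the step vector
  return V;
}

// New strategy: build the stepped start value once (the "preheader" work),
// then advance the whole vector by a splat of VF * Step each iteration,
// the way a vector phi plus a single vector add would.
struct VectorIV {
  Lanes Value;      // current value of the vector phi
  Lanes Increment;  // splat of VF * Step

  VectorIV(int64_t Start, unsigned VF, int64_t Step)
      : Value(widenBySplat(Start, VF, Step)),
        Increment(VF, static_cast<int64_t>(VF) * Step) {}

  void advance() {
    for (std::size_t L = 0; L < Value.size(); ++L)
      Value[L] += Increment[L];
  }
};

// Returns true if both strategies produce identical lanes on every iteration.
bool strategiesAgree(int64_t Start, unsigned VF, int64_t Step,
                     unsigned Iterations) {
  VectorIV Phi(Start, VF, Step);
  int64_t Scalar = Start;
  for (unsigned I = 0; I < Iterations; ++I) {
    if (Phi.Value != widenBySplat(Scalar, VF, Step))
      return false;
    Phi.advance();
    Scalar += static_cast<int64_t>(VF) * Step;
  }
  return true;
}
```

<div dir="ltr">In this model the second strategy does one vector add per iteration instead of a splat plus an add, which is the kind of saving the commit log describes. Whether a given backend then handles the extra loop-carried vector value well is a separate question, and that is what the bug report above is about.</div>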