<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Hi Alex,</p>
    <p>Thanks for the report. I hope to fix it soon.<br>
    </p>
    <pre class="moz-signature" cols="72">-------------
Best regards,
Alexey Bataev</pre>
    <div class="moz-cite-prefix">On 08.01.2018 20:01, Alex L wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAKS3GBvhobdsXyKpLFZdBzxKLiOco6TguJ8ycxeOPhPVhZPwJg@mail.gmail.com">
      <div dir="ltr">Hi Alexey,
        <div><br>
        </div>
        <div>It looks like this commit caused a new regression in LLVM 6
          (reached unreachable). I filed <a
href="https://bugs.llvm.org/show_bug.cgi?id=35865"
            moz-do-not-send="true">https://bugs.llvm.org/show_bug.cgi?id=35865</a>.
          Do you mind taking a look at it?</div>
        <div><br>
        </div>
        <div>Thanks,</div>
        <div>Alex</div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On 7 August 2017 at 08:25, Alexey
          Bataev via llvm-commits <span dir="ltr"><<a
              href="mailto:llvm-commits@lists.llvm.org" target="_blank"
              moz-do-not-send="true">llvm-commits@lists.llvm.org</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">Author:
            abataev<br>
            Date: Mon Aug  7 08:25:49 2017<br>
            New Revision: 310260<br>
            <br>
            URL: <a
href="http://llvm.org/viewvc/llvm-project?rev=310260&view=rev"
              rel="noreferrer" target="_blank" moz-do-not-send="true">http://llvm.org/viewvc/llvm-<wbr>project?rev=310260&view=rev</a><br>
            Log:<br>
            [SLP] General improvements of SLP vectorization process.<br>
            <br>
            The patch tries to improve the two-pass vectorization analysis
            in the SLP vectorizer. What it does:<br>
            <br>
            1. Defines key nodes, which are the vectorization roots.
            Previously, vectorization started if a StoreInst or ReturnInst
            was found. Now, vectorization starts for all
            Instructions with no users and void types (Terminators,
            StoreInst) + CallInsts.<br>
            2. CmpInsts, InsertElementInsts and InsertValueInsts are
            stored in an array. This array is processed only after the
            vectorization of the first key node that follows these
            instructions is finished. Vectorization goes in reverse order
            to try to vectorize as much code as possible.<br>
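            <br>
            The two steps above can be sketched outside of LLVM roughly as
            follows. This is only an illustration under simplifying
            assumptions: Inst, Kind, isKeyNode, isPostponed and visitOrder
            are hypothetical stand-ins, not LLVM types or functions, and the
            sketch only records the order in which roots would be tried. It
            does not vectorize anything and does not restart the scan after
            a change, as the real pass does.<br>

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction; NOT llvm::Instruction.
enum class Kind { Cmp, InsertElement, InsertValue, Store, Call, Ret, Other };

struct Inst {
  std::string Name;
  Kind K;
  bool HasUsers;
};

// Assumption: a key node is an instruction without users that is either
// void-typed (modelled here as Store/Ret) or a call.
bool isKeyNode(const Inst &I) {
  if (I.HasUsers)
    return false;
  return I.K == Kind::Store || I.K == Kind::Ret || I.K == Kind::Call;
}

// Compares, insertelements and insertvalues are not handled immediately;
// they are queued until the next key node has been processed.
bool isPostponed(const Inst &I) {
  return I.K == Kind::Cmp || I.K == Kind::InsertElement ||
         I.K == Kind::InsertValue;
}

// Returns the names of the roots in the order they would be tried.
std::vector<std::string> visitOrder(const std::vector<Inst> &Block) {
  std::vector<std::string> Order;
  std::vector<const Inst *> Postponed;
  for (const Inst &I : Block) {
    if (isKeyNode(I)) {
      // First the key node itself (its operands are the roots).
      Order.push_back(I.Name);
      // Then the queued instructions, from the last one to the first.
      for (auto It = Postponed.rbegin(); It != Postponed.rend(); ++It)
        Order.push_back((*It)->Name);
      Postponed.clear();
    } else if (isPostponed(I)) {
      Postponed.push_back(&I);
    }
  }
  return Order;
}
```

            For a block cmp0, ins0, ins1, store0, cmp1, ret (where only
            store0 and ret have no users), visitOrder yields store0, ins1,
            ins0, cmp0, ret, cmp1: each key node is tried first, then the
            instructions queued before it in reverse order.<br>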
            <br>
            Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon<br>
            <br>
            Subscribers: ashahid, anemet, RKSimon, mssimpso,
            llvm-commits<br>
            <br>
            Differential Revision: <a
href="https://reviews.llvm.org/D29826"
              rel="noreferrer" target="_blank" moz-do-not-send="true">https://reviews.llvm.org/<wbr>D29826</a><br>
            <br>
            Modified:<br>
                llvm/trunk/include/llvm/<wbr>Transforms/Vectorize/<wbr>SLPVectorizer.h<br>
                llvm/trunk/lib/Transforms/<wbr>Vectorize/SLPVectorizer.cpp<br>
                llvm/trunk/test/Transforms/<wbr>SLPVectorizer/AArch64/gather-<wbr>root.ll<br>
                llvm/trunk/test/Transforms/<wbr>SLPVectorizer/X86/horizontal.<wbr>ll<br>
                llvm/trunk/test/Transforms/<wbr>SLPVectorizer/X86/insert-<wbr>element-build-vector.ll<br>
            <br>
            Modified: llvm/trunk/include/llvm/<wbr>Transforms/Vectorize/<wbr>SLPVectorizer.h<br>
            URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Vectorize/SLPVectorizer.h?rev=310260&r1=310259&r2=310260&view=diff"
              rel="noreferrer" target="_blank" moz-do-not-send="true">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/include/<wbr>llvm/Transforms/Vectorize/<wbr>SLPVectorizer.h?rev=310260&r1=<wbr>310259&r2=310260&view=diff</a><br>
            ==============================<wbr>==============================<wbr>==================<br>
            --- llvm/trunk/include/llvm/<wbr>Transforms/Vectorize/<wbr>SLPVectorizer.h
            (original)<br>
            +++ llvm/trunk/include/llvm/<wbr>Transforms/Vectorize/<wbr>SLPVectorizer.h
            Mon Aug  7 08:25:49 2017<br>
            @@ -100,6 +100,19 @@ private:<br>
                                             slpvectorizer::BoUpSLP
            &R,<br>
                                             TargetTransformInfo *TTI);<br>
            <br>
            +  /// Try to vectorize trees that start at insertvalue
            instructions.<br>
            +  bool vectorizeInsertValueInst(<wbr>InsertValueInst *IVI,
            BasicBlock *BB,<br>
            +                                slpvectorizer::BoUpSLP
            &R);<br>
            +  /// Try to vectorize trees that start at insertelement
            instructions.<br>
            +  bool vectorizeInsertElementInst(<wbr>InsertElementInst
            *IEI, BasicBlock *BB,<br>
            +                                  slpvectorizer::BoUpSLP
            &R);<br>
            +  /// Try to vectorize trees that start at compare
            instructions.<br>
            +  bool vectorizeCmpInst(CmpInst *CI, BasicBlock *BB,
            slpvectorizer::BoUpSLP &R);<br>
            +  /// Tries to vectorize constructs started from CmpInst,
            InsertValueInst or<br>
            +  /// InsertElementInst instructions.<br>
            +  bool vectorizeSimpleInstructions(<wbr>SmallVectorImpl<WeakVH>
            &Instructions,<br>
            +                                   BasicBlock *BB,
            slpvectorizer::BoUpSLP &R);<br>
            +<br>
               /// \brief Scan the basic block and look for patterns
            that are likely to start<br>
               /// a vectorization chain.<br>
               bool vectorizeChainsInBlock(<wbr>BasicBlock *BB,
            slpvectorizer::BoUpSLP &R);<br>
            <br>
            Modified: llvm/trunk/lib/Transforms/<wbr>Vectorize/SLPVectorizer.cpp<br>
            URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp?rev=310260&r1=310259&r2=310260&view=diff"
              rel="noreferrer" target="_blank" moz-do-not-send="true">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/lib/<wbr>Transforms/Vectorize/<wbr>SLPVectorizer.cpp?rev=310260&<wbr>r1=310259&r2=310260&view=diff</a><br>
            ==============================<wbr>==============================<wbr>==================<br>
            --- llvm/trunk/lib/Transforms/<wbr>Vectorize/SLPVectorizer.cpp
            (original)<br>
            +++ llvm/trunk/lib/Transforms/<wbr>Vectorize/SLPVectorizer.cpp
            Mon Aug  7 08:25:49 2017<br>
            @@ -4387,7 +4387,7 @@ bool SLPVectorizerPass::<wbr>tryToVectorize(I<br>
               if (!I)<br>
                 return false;<br>
            <br>
            -  if (!isa<BinaryOperator>(I))<br>
            +  if (!isa<BinaryOperator>(I) &&
            !isa<CmpInst>(I))<br>
                 return false;<br>
            <br>
               Value *P = I->getParent();<br>
            @@ -4925,39 +4925,30 @@ private:<br>
             ///  %rb = insertelement <4 x float> %ra, float %s1,
            i32 1<br>
             ///  %rc = insertelement <4 x float> %rb, float %s2,
            i32 2<br>
             ///  %rd = insertelement <4 x float> %rc, float %s3,
            i32 3<br>
            +///  starting from the last insertelement instruction.<br>
             ///<br>
             /// Returns true if it matches<br>
             ///<br>
            -static bool findBuildVector(<wbr>InsertElementInst
            *FirstInsertElem,<br>
            +static bool findBuildVector(<wbr>InsertElementInst
            *LastInsertElem,<br>
                                         SmallVectorImpl<Value *>
            &BuildVector,<br>
                                         SmallVectorImpl<Value *>
            &BuildVectorOpds) {<br>
            -  if (!isa<UndefValue>(<wbr>FirstInsertElem->getOperand(0)<wbr>))<br>
            -    return false;<br>
            -<br>
            -  InsertElementInst *IE = FirstInsertElem;<br>
            -  while (true) {<br>
            -    BuildVector.push_back(IE);<br>
            -    BuildVectorOpds.push_back(IE-><wbr>getOperand(1));<br>
            -<br>
            -    if (IE->use_empty())<br>
            -      return false;<br>
            -<br>
            -    InsertElementInst *NextUse =
            dyn_cast<InsertElementInst>(<wbr>IE->user_back());<br>
            -    if (!NextUse)<br>
            -      return true;<br>
            -<br>
            -    // If this isn't the final use, make sure the next
            insertelement is the only<br>
            -    // use. It's OK if the final constructed vector is used
            multiple times<br>
            -    if (!IE->hasOneUse())<br>
            +  Value *V = nullptr;<br>
            +  do {<br>
            +    BuildVector.push_back(<wbr>LastInsertElem);<br>
            +    BuildVectorOpds.push_back(<wbr>LastInsertElem->getOperand(1))<wbr>;<br>
            +    V = LastInsertElem->getOperand(0);<br>
            +    if (isa<UndefValue>(V))<br>
            +      break;<br>
            +    LastInsertElem = dyn_cast<InsertElementInst>(V)<wbr>;<br>
            +    if (!LastInsertElem || !LastInsertElem->hasOneUse())<br>
                   return false;<br>
            -<br>
            -    IE = NextUse;<br>
            -  }<br>
            -<br>
            -  return false;<br>
            +  } while (true);<br>
            +  std::reverse(BuildVector.<wbr>begin(),
            BuildVector.end());<br>
            +  std::reverse(BuildVectorOpds.<wbr>begin(),
            BuildVectorOpds.end());<br>
            +  return true;<br>
             }<br>
            <br>
            -/// \brief Like findBuildVector, but looks backwards for
            construction of aggregate.<br>
            +/// \brief Like findBuildVector, but looks for construction
            of aggregate.<br>
             ///<br>
             /// \return true if it matches.<br>
             static bool findBuildAggregate(<wbr>InsertValueInst *IV,<br>
            @@ -5142,6 +5133,64 @@ bool SLPVectorizerPass::<wbr>vectorizeRootIns<br>
                                                             
             ExtraVectorization);<br>
             }<br>
            <br>
            +bool SLPVectorizerPass::<wbr>vectorizeInsertValueInst(<wbr>InsertValueInst
            *IVI,<br>
            +                                                 BasicBlock
            *BB, BoUpSLP &R) {<br>
            +  const DataLayout &DL = BB->getModule()-><wbr>getDataLayout();<br>
            +  if (!R.canMapToVector(IVI-><wbr>getType(), DL))<br>
            +    return false;<br>
            +<br>
            +  SmallVector<Value *, 16> BuildVector;<br>
            +  SmallVector<Value *, 16> BuildVectorOpds;<br>
            +  if (!findBuildAggregate(IVI, BuildVector,
            BuildVectorOpds))<br>
            +    return false;<br>
            +<br>
            +  DEBUG(dbgs() << "SLP: array mappable to vector: "
            << *IVI << "\n");<br>
            +  return tryToVectorizeList(<wbr>BuildVectorOpds, R,
            BuildVector, false);<br>
            +}<br>
            +<br>
            +bool SLPVectorizerPass::<wbr>vectorizeInsertElementInst(<wbr>InsertElementInst
            *IEI,<br>
            +                                                 
             BasicBlock *BB, BoUpSLP &R) {<br>
            +  SmallVector<Value *, 16> BuildVector;<br>
            +  SmallVector<Value *, 16> BuildVectorOpds;<br>
            +  if (!findBuildVector(IEI, BuildVector, BuildVectorOpds))<br>
            +    return false;<br>
            +<br>
            +  // Vectorize starting with the build vector operands
            ignoring the BuildVector<br>
            +  // instructions for the purpose of scheduling and user
            extraction.<br>
            +  return tryToVectorizeList(<wbr>BuildVectorOpds, R,
            BuildVector);<br>
            +}<br>
            +<br>
            +bool SLPVectorizerPass::<wbr>vectorizeCmpInst(CmpInst *CI,
            BasicBlock *BB,<br>
            +                                         BoUpSLP &R) {<br>
            +  if (tryToVectorizePair(CI-><wbr>getOperand(0),
            CI->getOperand(1), R))<br>
            +    return true;<br>
            +<br>
            +  bool OpsChanged = false;<br>
            +  for (int Idx = 0; Idx < 2; ++Idx) {<br>
            +    OpsChanged |=<br>
            +        vectorizeRootInstruction(<wbr>nullptr,
            CI->getOperand(Idx), BB, R, TTI);<br>
            +  }<br>
            +  return OpsChanged;<br>
            +}<br>
            +<br>
            +bool SLPVectorizerPass::<wbr>vectorizeSimpleInstructions(<br>
            +    SmallVectorImpl<WeakVH> &Instructions,
            BasicBlock *BB, BoUpSLP &R) {<br>
            +  bool OpsChanged = false;<br>
            +  for (auto &VH : reverse(Instructions)) {<br>
            +    auto *I = dyn_cast_or_null<Instruction>(<wbr>VH);<br>
            +    if (!I)<br>
            +      continue;<br>
            +    if (auto *LastInsertValue =
            dyn_cast<InsertValueInst>(I))<br>
            +      OpsChanged |= vectorizeInsertValueInst(<wbr>LastInsertValue,
            BB, R);<br>
            +    else if (auto *LastInsertElem =
            dyn_cast<InsertElementInst>(I)<wbr>)<br>
            +      OpsChanged |= vectorizeInsertElementInst(<wbr>LastInsertElem,
            BB, R);<br>
            +    else if (auto *CI = dyn_cast<CmpInst>(I))<br>
            +      OpsChanged |= vectorizeCmpInst(CI, BB, R);<br>
            +  }<br>
            +  Instructions.clear();<br>
            +  return OpsChanged;<br>
            +}<br>
            +<br>
             bool SLPVectorizerPass::<wbr>vectorizeChainsInBlock(<wbr>BasicBlock
            *BB, BoUpSLP &R) {<br>
               bool Changed = false;<br>
               SmallVector<Value *, 4> Incoming;<br>
            @@ -5201,10 +5250,21 @@ bool SLPVectorizerPass::<wbr>vectorizeChainsI<br>
            <br>
               VisitedInstrs.clear();<br>
            <br>
            +  SmallVector<WeakVH, 8> PostProcessInstructions;<br>
            +  SmallDenseSet<Instruction *, 4> KeyNodes;<br>
               for (BasicBlock::iterator it = BB->begin(), e =
            BB->end(); it != e; it++) {<br>
                 // We may go through BB multiple times so skip the one
            we have checked.<br>
            -    if (!VisitedInstrs.insert(&*it).<wbr>second)<br>
            +    if (!VisitedInstrs.insert(&*it).<wbr>second) {<br>
            +      if (it->use_empty() &&
            KeyNodes.count(&*it) > 0 &&<br>
            +          vectorizeSimpleInstructions(<wbr>PostProcessInstructions,
            BB, R)) {<br>
            +        // We would like to start over since some
            instructions are deleted<br>
            +        // and the iterator may become invalid value.<br>
            +        Changed = true;<br>
            +        it = BB->begin();<br>
            +        e = BB->end();<br>
            +      }<br>
                   continue;<br>
            +    }<br>
            <br>
                 if (isa<DbgInfoIntrinsic>(it))<br>
                   continue;<br>
            @@ -5226,96 +5286,37 @@ bool SLPVectorizerPass::<wbr>vectorizeChainsI<br>
                   continue;<br>
                 }<br>
            <br>
            -    if (<wbr>ShouldStartVectorizeHorAtStore<wbr>) {<br>
            -      if (StoreInst *SI = dyn_cast<StoreInst>(it)) {<br>
            -        // Try to match and vectorize a horizontal
            reduction.<br>
            -        if (vectorizeRootInstruction(<wbr>nullptr,
            SI->getValueOperand(), BB, R,<br>
            -                                     TTI)) {<br>
            -          Changed = true;<br>
            -          it = BB->begin();<br>
            -          e = BB->end();<br>
            -          continue;<br>
            +    // Ran into an instruction without users, like
            terminator, or function call<br>
            +    // with ignored return value, store. Ignore unused
            instructions (basing on<br>
            +    // instruction type, except for CallInst and
            InvokeInst).<br>
            +    if (it->use_empty() &&
            (it->getType()->isVoidTy() || isa<CallInst>(it)
            ||<br>
            +                            isa<InvokeInst>(it))) {<br>
            +      KeyNodes.insert(&*it);<br>
            +      bool OpsChanged = false;<br>
            +      if (<wbr>ShouldStartVectorizeHorAtStore ||
            !isa<StoreInst>(it)) {<br>
            +        for (auto *V : it->operand_values()) {<br>
            +          // Try to match and vectorize a horizontal
            reduction.<br>
            +          OpsChanged |= vectorizeRootInstruction(<wbr>nullptr,
            V, BB, R, TTI);<br>
                     }<br>
                   }<br>
            -    }<br>
            -<br>
            -    // Try to vectorize horizontal reductions feeding into
            a return.<br>
            -    if (ReturnInst *RI = dyn_cast<ReturnInst>(it)) {<br>
            -      if (RI->getNumOperands() != 0) {<br>
            -        // Try to match and vectorize a horizontal
            reduction.<br>
            -        if (vectorizeRootInstruction(<wbr>nullptr,
            RI->getOperand(0), BB, R, TTI)) {<br>
            -          Changed = true;<br>
            -          it = BB->begin();<br>
            -          e = BB->end();<br>
            -          continue;<br>
            -        }<br>
            -      }<br>
            -    }<br>
            -<br>
            -    // Try to vectorize trees that start at compare
            instructions.<br>
            -    if (CmpInst *CI = dyn_cast<CmpInst>(it)) {<br>
            -      if (tryToVectorizePair(CI-><wbr>getOperand(0),
            CI->getOperand(1), R)) {<br>
            -        Changed = true;<br>
            +      // Start vectorization of post-process list of
            instructions from the<br>
            +      // top-tree instructions to try to vectorize as many
            instructions as<br>
            +      // possible.<br>
            +      OpsChanged |= vectorizeSimpleInstructions(<wbr>PostProcessInstructions,
            BB, R);<br>
            +      if (OpsChanged) {<br>
                     // We would like to start over since some
            instructions are deleted<br>
                     // and the iterator may become invalid value.<br>
            -        it = BB->begin();<br>
            -        e = BB->end();<br>
            -        continue;<br>
            -      }<br>
            -<br>
            -      for (int I = 0; I < 2; ++I) {<br>
            -        if (vectorizeRootInstruction(<wbr>nullptr,
            CI->getOperand(I), BB, R, TTI)) {<br>
            -          Changed = true;<br>
            -          // We would like to start over since some
            instructions are deleted<br>
            -          // and the iterator may become invalid value.<br>
            -          it = BB->begin();<br>
            -          e = BB->end();<br>
            -          break;<br>
            -        }<br>
            -      }<br>
            -      continue;<br>
            -    }<br>
            -<br>
            -    // Try to vectorize trees that start at insertelement
            instructions.<br>
            -    if (InsertElementInst *FirstInsertElem =
            dyn_cast<InsertElementInst>(<wbr>it)) {<br>
            -      SmallVector<Value *, 16> BuildVector;<br>
            -      SmallVector<Value *, 16> BuildVectorOpds;<br>
            -      if (!findBuildVector(<wbr>FirstInsertElem,
            BuildVector, BuildVectorOpds))<br>
            -        continue;<br>
            -<br>
            -      // Vectorize starting with the build vector operands
            ignoring the<br>
            -      // BuildVector instructions for the purpose of
            scheduling and user<br>
            -      // extraction.<br>
            -      if (tryToVectorizeList(<wbr>BuildVectorOpds, R,
            BuildVector)) {<br>
                     Changed = true;<br>
                     it = BB->begin();<br>
                     e = BB->end();<br>
            +        continue;<br>
                   }<br>
            -<br>
            -      continue;<br>
                 }<br>
            <br>
            -    // Try to vectorize trees that start at insertvalue
            instructions feeding into<br>
            -    // a store.<br>
            -    if (StoreInst *SI = dyn_cast<StoreInst>(it)) {<br>
            -      if (InsertValueInst *LastInsertValue =
            dyn_cast<InsertValueInst>(SI-><wbr>getValueOperand()))
            {<br>
            -        const DataLayout &DL = BB->getModule()-><wbr>getDataLayout();<br>
            -        if (R.canMapToVector(SI-><wbr>getValueOperand()->getType(),
            DL)) {<br>
            -          SmallVector<Value *, 16> BuildVector;<br>
            -          SmallVector<Value *, 16> BuildVectorOpds;<br>
            -          if (!findBuildAggregate(<wbr>LastInsertValue,
            BuildVector, BuildVectorOpds))<br>
            -            continue;<br>
            +    if (isa<InsertElementInst>(it) ||
            isa<CmpInst>(it) ||<br>
            +        isa<InsertValueInst>(it))<br>
            +      PostProcessInstructions.push_<wbr>back(&*it);<br>
            <br>
            -          DEBUG(dbgs() << "SLP: store of array
            mappable to vector: " << *SI << "\n");<br>
            -          if (tryToVectorizeList(<wbr>BuildVectorOpds, R,
            BuildVector, false)) {<br>
            -            Changed = true;<br>
            -            it = BB->begin();<br>
            -            e = BB->end();<br>
            -          }<br>
            -          continue;<br>
            -        }<br>
            -      }<br>
            -    }<br>
               }<br>
            <br>
               return Changed;<br>
            <br>
            Modified: llvm/trunk/test/Transforms/<wbr>SLPVectorizer/AArch64/gather-<wbr>root.ll<br>
            URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/AArch64/gather-root.ll?rev=310260&r1=310259&r2=310260&view=diff"
              rel="noreferrer" target="_blank" moz-do-not-send="true">http://llvm.org/viewvc/llvm-<wbr>project/llvm/trunk/test/<wbr>Transforms/SLPVectorizer/<wbr>AArch64/gather-root.ll?rev=<wbr>310260&r1=310259&r2=310260&<wbr>view=diff</a><br>
            ==============================<wbr>==============================<wbr>==================<br>
            --- llvm/trunk/test/Transforms/<wbr>SLPVectorizer/AArch64/gather-<wbr>root.ll
            (original)<br>
            +++ llvm/trunk/test/Transforms/<wbr>SLPVectorizer/AArch64/gather-<wbr>root.ll
            Mon Aug  7 08:25:49 2017<br>
            @@ -31,10 +31,8 @@ define void @PR28330(i32 %n) {<br>
             ;<br>
             ; GATHER-LABEL: @PR28330(<br>
             ; GATHER-NEXT:  entry:<br>
            -; GATHER-NEXT:    [[TMP0:%.*]] = load i8, i8* getelementptr
            inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64 1), align 1<br>
            -; GATHER-NEXT:    [[TMP1:%.*]] = icmp eq i8 [[TMP0]], 0<br>
            -; GATHER-NEXT:    [[TMP2:%.*]] = load i8, i8* getelementptr
            inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64 2), align 2<br>
            -; GATHER-NEXT:    [[TMP3:%.*]] = icmp eq i8 [[TMP2]], 0<br>
            +; GATHER-NEXT:    [[TMP0:%.*]] = load <2 x i8>, <2
            x i8>* bitcast (i8* getelementptr inbounds ([80 x i8],
            [80 x i8]* @a, i64 0, i64 1) to <2 x i8>*), align 1<br>
            +; GATHER-NEXT:    [[TMP1:%.*]] = icmp eq <2 x i8>
            [[TMP0]], zeroinitializer<br>
             ; GATHER-NEXT:    [[TMP4:%.*]] = load i8, i8* getelementptr
            inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64 3), align 1<br>
             ; GATHER-NEXT:    [[TMP5:%.*]] = icmp eq i8 [[TMP4]], 0<br>
             ; GATHER-NEXT:    [[TMP6:%.*]] = load i8, i8* getelementptr
            inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64 4), align 4<br>
            @@ -50,10 +48,11 @@ define void @PR28330(i32 %n) {<br>
             ; GATHER-NEXT:    br label [[FOR_BODY:%.*]]<br>
             ; GATHER:       for.body:<br>
             ; GATHER-NEXT:    [[TMP17:%.*]] = phi i32 [
            [[BIN_EXTRA:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]<br>
            -; GATHER-NEXT:    [[TMP19:%.*]] = select i1 [[TMP1]], i32
            -720, i32 -80<br>
            -; GATHER-NEXT:    [[TMP20:%.*]] = add i32 [[TMP17]],
            [[TMP19]]<br>
            -; GATHER-NEXT:    [[TMP21:%.*]] = select i1 [[TMP3]], i32
            -720, i32 -80<br>
            -; GATHER-NEXT:    [[TMP22:%.*]] = add i32 [[TMP20]],
            [[TMP21]]<br>
            +; GATHER-NEXT:    [[TMP2:%.*]] = select <2 x i1>
            [[TMP1]], <2 x i32> <i32 -720, i32 -720>, <2
            x i32> <i32 -80, i32 -80><br>
            +; GATHER-NEXT:    [[TMP3:%.*]] = extractelement <2 x
            i32> [[TMP2]], i32 0<br>
            +; GATHER-NEXT:    [[TMP20:%.*]] = add i32 [[TMP17]],
            [[TMP3]]<br>
            +; GATHER-NEXT:    [[TMP4:%.*]] = extractelement <2 x
            i32> [[TMP2]], i32 1<br>
            +; GATHER-NEXT:    [[TMP22:%.*]] = add i32 [[TMP20]],
            [[TMP4]]<br>
             ; GATHER-NEXT:    [[TMP23:%.*]] = select i1 [[TMP5]], i32
            -720, i32 -80<br>
             ; GATHER-NEXT:    [[TMP24:%.*]] = add i32 [[TMP22]],
            [[TMP23]]<br>
             ; GATHER-NEXT:    [[TMP25:%.*]] = select i1 [[TMP7]], i32
            -720, i32 -80<br>
            @@ -65,16 +64,16 @@ define void @PR28330(i32 %n) {<br>
             ; GATHER-NEXT:    [[TMP31:%.*]] = select i1 [[TMP13]], i32
            -720, i32 -80<br>
             ; GATHER-NEXT:    [[TMP32:%.*]] = add i32 [[TMP30]],
            [[TMP31]]<br>
             ; GATHER-NEXT:    [[TMP33:%.*]] = select i1 [[TMP15]], i32
            -720, i32 -80<br>
            -; GATHER-NEXT:    [[TMP0:%.*]] = insertelement <8 x
            i32> undef, i32 [[TMP19]], i32 0<br>
            -; GATHER-NEXT:    [[TMP1:%.*]] = insertelement <8 x
            i32> [[TMP0]], i32 [[TMP21]], i32 1<br>
            -; GATHER-NEXT:    [[TMP2:%.*]] = insertelement <8 x
            i32> [[TMP1]], i32 [[TMP23]], i32 2<br>
            -; GATHER-NEXT:    [[TMP3:%.*]] = insertelement <8 x
            i32> [[TMP2]], i32 [[TMP25]], i32 3<br>
            -; GATHER-NEXT:    [[TMP4:%.*]] = insertelement <8 x
            i32> [[TMP3]], i32 [[TMP27]], i32 4<br>
            -; GATHER-NEXT:    [[TMP5:%.*]] = insertelement <8 x
            i32> [[TMP4]], i32 [[TMP29]], i32 5<br>
            -; GATHER-NEXT:    [[TMP6:%.*]] = insertelement <8 x
            i32> [[TMP5]], i32 [[TMP31]], i32 6<br>
            -; GATHER-NEXT:    [[TMP7:%.*]] = insertelement <8 x
            i32> [[TMP6]], i32 [[TMP33]], i32 7<br>
            -; GATHER-NEXT:    [[TMP8:%.*]] = call i32
            @llvm.experimental.vector.<wbr>reduce.add.i32.v8i32(<8 x
            i32> [[TMP7]])<br>
            -; GATHER-NEXT:    [[BIN_EXTRA]] = add i32 [[TMP8]],
            [[TMP17]]<br>
            +; GATHER-NEXT:    [[TMP5:%.*]] = insertelement <8 x
            i32> undef, i32 [[TMP3]], i32 0<br>
            +; GATHER-NEXT:    [[TMP6:%.*]] = insertelement <8 x
            i32> [[TMP5]], i32 [[TMP4]], i32 1<br>
            +; GATHER-NEXT:    [[TMP7:%.*]] = insertelement <8 x
            i32> [[TMP6]], i32 [[TMP23]], i32 2<br>
            +; GATHER-NEXT:    [[TMP8:%.*]] = insertelement <8 x
            i32> [[TMP7]], i32 [[TMP25]], i32 3<br>
            +; GATHER-NEXT:    [[TMP9:%.*]] = insertelement <8 x
            i32> [[TMP8]], i32 [[TMP27]], i32 4<br>
            +; GATHER-NEXT:    [[TMP10:%.*]] = insertelement <8 x
            i32> [[TMP9]], i32 [[TMP29]], i32 5<br>
            +; GATHER-NEXT:    [[TMP11:%.*]] = insertelement <8 x
            i32> [[TMP10]], i32 [[TMP31]], i32 6<br>
            +; GATHER-NEXT:    [[TMP12:%.*]] = insertelement <8 x
            i32> [[TMP11]], i32 [[TMP33]], i32 7<br>
            +; GATHER-NEXT:    [[TMP13:%.*]] = call i32
            @llvm.experimental.vector.<wbr>reduce.add.i32.v8i32(<8 x
            i32> [[TMP12]])<br>
            +; GATHER-NEXT:    [[BIN_EXTRA]] = add i32 [[TMP13]],
            [[TMP17]]<br>
             ; GATHER-NEXT:    [[TMP34:%.*]] = add i32 [[TMP32]],
            [[TMP33]]<br>
             ; GATHER-NEXT:    br label [[FOR_BODY]]<br>
             ;<br>
            @@ -180,10 +179,8 @@ define void @PR32038(i32 %n) {<br>
             ;<br>
             ; GATHER-LABEL: @PR32038(<br>
             ; GATHER-NEXT:  entry:<br>
            -; GATHER-NEXT:    [[TMP0:%.*]] = load i8, i8* getelementptr
            inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64 1), align 1<br>
            -; GATHER-NEXT:    [[TMP1:%.*]] = icmp eq i8 [[TMP0]], 0<br>
            -; GATHER-NEXT:    [[TMP2:%.*]] = load i8, i8* getelementptr
            inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64 2), align 2<br>
            -; GATHER-NEXT:    [[TMP3:%.*]] = icmp eq i8 [[TMP2]], 0<br>
            +; GATHER-NEXT:    [[TMP0:%.*]] = load <2 x i8>, <2
            x i8>* bitcast (i8* getelementptr inbounds ([80 x i8],
            [80 x i8]* @a, i64 0, i64 1) to <2 x i8>*), align 1<br>
            +; GATHER-NEXT:    [[TMP1:%.*]] = icmp eq <2 x i8>
            [[TMP0]], zeroinitializer<br>
             ; GATHER-NEXT:    [[TMP4:%.*]] = load i8, i8* getelementptr
            inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64 3), align 1<br>
             ; GATHER-NEXT:    [[TMP5:%.*]] = icmp eq i8 [[TMP4]], 0<br>
             ; GATHER-NEXT:    [[TMP6:%.*]] = load i8, i8* getelementptr
            inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64 4), align 4<br>
            @@ -199,10 +196,11 @@ define void @PR32038(i32 %n) {<br>
             ; GATHER-NEXT:    br label [[FOR_BODY:%.*]]<br>
             ; GATHER:       for.body:<br>
             ; GATHER-NEXT:    [[TMP17:%.*]] = phi i32 [
            [[BIN_EXTRA:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]<br>
            -; GATHER-NEXT:    [[TMP19:%.*]] = select i1 [[TMP1]], i32
            -720, i32 -80<br>
            -; GATHER-NEXT:    [[TMP20:%.*]] = add i32 -5, [[TMP19]]<br>
            -; GATHER-NEXT:    [[TMP21:%.*]] = select i1 [[TMP3]], i32
            -720, i32 -80<br>
            -; GATHER-NEXT:    [[TMP22:%.*]] = add i32 [[TMP20]],
            [[TMP21]]<br>
            +; GATHER-NEXT:    [[TMP2:%.*]] = select <2 x i1>
            [[TMP1]], <2 x i32> <i32 -720, i32 -720>, <2
            x i32> <i32 -80, i32 -80><br>
            +; GATHER-NEXT:    [[TMP3:%.*]] = extractelement <2 x
            i32> [[TMP2]], i32 0<br>
            +; GATHER-NEXT:    [[TMP20:%.*]] = add i32 -5, [[TMP3]]<br>
            +; GATHER-NEXT:    [[TMP4:%.*]] = extractelement <2 x
            i32> [[TMP2]], i32 1<br>
            +; GATHER-NEXT:    [[TMP22:%.*]] = add i32 [[TMP20]],
            [[TMP4]]<br>
             ; GATHER-NEXT:    [[TMP23:%.*]] = select i1 [[TMP5]], i32
            -720, i32 -80<br>
             ; GATHER-NEXT:    [[TMP24:%.*]] = add i32 [[TMP22]],
            [[TMP23]]<br>
             ; GATHER-NEXT:    [[TMP25:%.*]] = select i1 [[TMP7]], i32
            -720, i32 -80<br>
            @@ -214,29 +212,27 @@ define void @PR32038(i32 %n) {<br>
             ; GATHER-NEXT:    [[TMP31:%.*]] = select i1 [[TMP13]], i32
            -720, i32 -80<br>
             ; GATHER-NEXT:    [[TMP32:%.*]] = add i32 [[TMP30]],
            [[TMP31]]<br>
             ; GATHER-NEXT:    [[TMP33:%.*]] = select i1 [[TMP15]], i32
            -720, i32 -80<br>
            -; GATHER-NEXT:    [[TMP0:%.*]] = insertelement <8 x
            i32> undef, i32 [[TMP19]], i32 0<br>
            -; GATHER-NEXT:    [[TMP1:%.*]] = insertelement <8 x
            i32> [[TMP0]], i32 [[TMP21]], i32 1<br>
            -; GATHER-NEXT:    [[TMP2:%.*]] = insertelement <8 x
            i32> [[TMP1]], i32 [[TMP23]], i32 2<br>
            -; GATHER-NEXT:    [[TMP3:%.*]] = insertelement <8 x
            i32> [[TMP2]], i32 [[TMP25]], i32 3<br>
            -; GATHER-NEXT:    [[TMP4:%.*]] = insertelement <8 x
            i32> [[TMP3]], i32 [[TMP27]], i32 4<br>
            -; GATHER-NEXT:    [[TMP5:%.*]] = insertelement <8 x
            i32> [[TMP4]], i32 [[TMP29]], i32 5<br>
            -; GATHER-NEXT:    [[TMP6:%.*]] = insertelement <8 x
            i32> [[TMP5]], i32 [[TMP31]], i32 6<br>
            -; GATHER-NEXT:    [[TMP7:%.*]] = insertelement <8 x
            i32> [[TMP6]], i32 [[TMP33]], i32 7<br>
            -; GATHER-NEXT:    [[TMP8:%.*]] = call i32
            @llvm.experimental.vector.reduce.add.i32.v8i32(<8 x
            i32> [[TMP7]])<br>
            -; GATHER-NEXT:    [[BIN_EXTRA]] = add i32 [[TMP8]], -5<br>
            +; GATHER-NEXT:    [[TMP5:%.*]] = insertelement <8 x
            i32> undef, i32 [[TMP3]], i32 0<br>
            +; GATHER-NEXT:    [[TMP6:%.*]] = insertelement <8 x
            i32> [[TMP5]], i32 [[TMP4]], i32 1<br>
            +; GATHER-NEXT:    [[TMP7:%.*]] = insertelement <8 x
            i32> [[TMP6]], i32 [[TMP23]], i32 2<br>
            +; GATHER-NEXT:    [[TMP8:%.*]] = insertelement <8 x
            i32> [[TMP7]], i32 [[TMP25]], i32 3<br>
            +; GATHER-NEXT:    [[TMP9:%.*]] = insertelement <8 x
            i32> [[TMP8]], i32 [[TMP27]], i32 4<br>
            +; GATHER-NEXT:    [[TMP10:%.*]] = insertelement <8 x
            i32> [[TMP9]], i32 [[TMP29]], i32 5<br>
            +; GATHER-NEXT:    [[TMP11:%.*]] = insertelement <8 x
            i32> [[TMP10]], i32 [[TMP31]], i32 6<br>
            +; GATHER-NEXT:    [[TMP12:%.*]] = insertelement <8 x
            i32> [[TMP11]], i32 [[TMP33]], i32 7<br>
            +; GATHER-NEXT:    [[TMP13:%.*]] = call i32
            @llvm.experimental.vector.reduce.add.i32.v8i32(<8 x
            i32> [[TMP12]])<br>
            +; GATHER-NEXT:    [[BIN_EXTRA]] = add i32 [[TMP13]], -5<br>
             ; GATHER-NEXT:    [[TMP34:%.*]] = add i32 [[TMP32]],
            [[TMP33]]<br>
             ; GATHER-NEXT:    br label [[FOR_BODY]]<br>
             ;<br>
             ; MAX-COST-LABEL: @PR32038(<br>
             ; MAX-COST-NEXT:  entry:<br>
            -; MAX-COST-NEXT:    [[TMP0:%.*]] = load i8, i8*
            getelementptr inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64
            1), align 1<br>
            -; MAX-COST-NEXT:    [[TMP1:%.*]] = icmp eq i8 [[TMP0]], 0<br>
            -; MAX-COST-NEXT:    [[TMP2:%.*]] = load i8, i8*
            getelementptr inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64
            2), align 2<br>
            -; MAX-COST-NEXT:    [[TMP3:%.*]] = icmp eq i8 [[TMP2]], 0<br>
            +; MAX-COST-NEXT:    [[TMP0:%.*]] = load <2 x i8>,
            <2 x i8>* bitcast (i8* getelementptr inbounds ([80 x
            i8], [80 x i8]* @a, i64 0, i64 1) to <2 x i8>*), align
            1<br>
            +; MAX-COST-NEXT:    [[TMP1:%.*]] = icmp eq <2 x i8>
            [[TMP0]], zeroinitializer<br>
             ; MAX-COST-NEXT:    [[TMP4:%.*]] = load i8, i8*
            getelementptr inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64
            3), align 1<br>
            -; MAX-COST-NEXT:    [[TMP5:%.*]] = icmp eq i8 [[TMP4]], 0<br>
            +; MAX-COST-NEXT:    [[TMPP5:%.*]] = icmp eq i8 [[TMP4]], 0<br>
             ; MAX-COST-NEXT:    [[TMP6:%.*]] = load i8, i8*
            getelementptr inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64
            4), align 4<br>
            -; MAX-COST-NEXT:    [[TMP7:%.*]] = icmp eq i8 [[TMP6]], 0<br>
            +; MAX-COST-NEXT:    [[TMPP7:%.*]] = icmp eq i8 [[TMP6]], 0<br>
             ; MAX-COST-NEXT:    [[TMP8:%.*]] = load i8, i8*
            getelementptr inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64
            5), align 1<br>
             ; MAX-COST-NEXT:    [[TMP9:%.*]] = icmp eq i8 [[TMP8]], 0<br>
             ; MAX-COST-NEXT:    [[TMP10:%.*]] = load i8, i8*
            getelementptr inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64
            6), align 2<br>
            @@ -245,14 +241,16 @@ define void @PR32038(i32 %n) {<br>
             ; MAX-COST-NEXT:    [[TMP13:%.*]] = icmp eq i8 [[TMP12]], 0<br>
             ; MAX-COST-NEXT:    [[TMP14:%.*]] = load i8, i8*
            getelementptr inbounds ([80 x i8], [80 x i8]* @a, i64 0, i64
            8), align 8<br>
             ; MAX-COST-NEXT:    [[TMP15:%.*]] = icmp eq i8 [[TMP14]], 0<br>
            -; MAX-COST-NEXT:    [[TMP0:%.*]] = insertelement <4 x
            i1> undef, i1 [[TMP1]], i32 0<br>
            -; MAX-COST-NEXT:    [[TMP1:%.*]] = insertelement <4 x
            i1> [[TMP0]], i1 [[TMP3]], i32 1<br>
            -; MAX-COST-NEXT:    [[TMP2:%.*]] = insertelement <4 x
            i1> [[TMP1]], i1 [[TMP5]], i32 2<br>
            -; MAX-COST-NEXT:    [[TMP3:%.*]] = insertelement <4 x
            i1> [[TMP2]], i1 [[TMP7]], i32 3<br>
             ; MAX-COST-NEXT:    br label [[FOR_BODY:%.*]]<br>
             ; MAX-COST:       for.body:<br>
             ; MAX-COST-NEXT:    [[TMP17:%.*]] = phi i32 [
            [[TMP34:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]<br>
            -; MAX-COST-NEXT:    [[TMP4:%.*]] = select <4 x i1>
            [[TMP3]], <4 x i32> <i32 -720, i32 -720, i32 -720,
            i32 -720>, <4 x i32> <i32 -80, i32 -80, i32 -80,
            i32 -80><br>
            +; MAX-COST-NEXT:    [[TMP2:%.*]] = extractelement <2 x
            i1> [[TMP1]], i32 0<br>
            +; MAX-COST-NEXT:    [[TMP3:%.*]] = insertelement <4 x
            i1> undef, i1 [[TMP2]], i32 0<br>
            +; MAX-COST-NEXT:    [[TMP4:%.*]] = extractelement <2 x
            i1> [[TMP1]], i32 1<br>
            +; MAX-COST-NEXT:    [[TMP5:%.*]] = insertelement <4 x
            i1> [[TMP3]], i1 [[TMP4]], i32 1<br>
            +; MAX-COST-NEXT:    [[TMP6:%.*]] = insertelement <4 x
            i1> [[TMP5]], i1 [[TMPP5]], i32 2<br>
            +; MAX-COST-NEXT:    [[TMP7:%.*]] = insertelement <4 x
            i1> [[TMP6]], i1 [[TMPP7]], i32 3<br>
            +; MAX-COST-NEXT:    [[TMP8:%.*]] = select <4 x i1>
            [[TMP7]], <4 x i32> <i32 -720, i32 -720, i32 -720,
            i32 -720>, <4 x i32> <i32 -80, i32 -80, i32 -80,
            i32 -80><br>
             ; MAX-COST-NEXT:    [[TMP20:%.*]] = add i32 -5, undef<br>
             ; MAX-COST-NEXT:    [[TMP22:%.*]] = add i32 [[TMP20]],
            undef<br>
             ; MAX-COST-NEXT:    [[TMP24:%.*]] = add i32 [[TMP22]],
            undef<br>
            @@ -260,10 +258,10 @@ define void @PR32038(i32 %n) {<br>
             ; MAX-COST-NEXT:    [[TMP27:%.*]] = select i1 [[TMP9]], i32
            -720, i32 -80<br>
             ; MAX-COST-NEXT:    [[TMP28:%.*]] = add i32 [[TMP26]],
            [[TMP27]]<br>
             ; MAX-COST-NEXT:    [[TMP29:%.*]] = select i1 [[TMP11]],
            i32 -720, i32 -80<br>
            -; MAX-COST-NEXT:    [[TMP5:%.*]] = call i32
            @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x
            i32> [[TMP4]])<br>
            -; MAX-COST-NEXT:    [[TMP6:%.*]] = add i32 [[TMP5]],
            [[TMP27]]<br>
            -; MAX-COST-NEXT:    [[TMP7:%.*]] = add i32 [[TMP6]],
            [[TMP29]]<br>
            -; MAX-COST-NEXT:    [[BIN_EXTRA:%.*]] = add i32 [[TMP7]],
            -5<br>
            +; MAX-COST-NEXT:    [[TMP9:%.*]] = call i32
            @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x
            i32> [[TMP8]])<br>
            +; MAX-COST-NEXT:    [[TMP10:%.*]] = add i32 [[TMP9]],
            [[TMP27]]<br>
            +; MAX-COST-NEXT:    [[TMP11:%.*]] = add i32 [[TMP10]],
            [[TMP29]]<br>
            +; MAX-COST-NEXT:    [[BIN_EXTRA:%.*]] = add i32 [[TMP11]],
            -5<br>
             ; MAX-COST-NEXT:    [[TMP30:%.*]] = add i32 [[TMP28]],
            [[TMP29]]<br>
             ; MAX-COST-NEXT:    [[TMP31:%.*]] = select i1 [[TMP13]],
            i32 -720, i32 -80<br>
             ; MAX-COST-NEXT:    [[TMP32:%.*]] = add i32 [[BIN_EXTRA]],
            [[TMP31]]<br>
            <br>
            Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/horizontal.ll<br>
            URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/horizontal.ll?rev=310260&r1=310259&r2=310260&view=diff"
              rel="noreferrer" target="_blank" moz-do-not-send="true">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/horizontal.ll?rev=310260&r1=310259&r2=310260&view=diff</a><br>
            ==============================================================================<br>
            --- llvm/trunk/test/Transforms/SLPVectorizer/X86/horizontal.ll
            (original)<br>
            +++ llvm/trunk/test/Transforms/SLPVectorizer/X86/horizontal.ll
            Mon Aug  7 08:25:49 2017<br>
            @@ -817,22 +817,22 @@ declare i32 @foobar(i32)<br>
             define void @i32_red_call(i32 %val) {<br>
             ; CHECK-LABEL: @i32_red_call(<br>
             ; CHECK-NEXT:  entry:<br>
            -; CHECK-NEXT:    [[TMP0:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 0), align 16<br>
            -; CHECK-NEXT:    [[TMP1:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 1), align 4<br>
            -; CHECK-NEXT:    [[ADD:%.*]] = add nsw i32 [[TMP1]],
            [[TMP0]]<br>
            -; CHECK-NEXT:    [[TMP2:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 2), align 8<br>
            -; CHECK-NEXT:    [[ADD_1:%.*]] = add nsw i32 [[TMP2]],
            [[ADD]]<br>
            -; CHECK-NEXT:    [[TMP3:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 3), align 4<br>
            -; CHECK-NEXT:    [[ADD_2:%.*]] = add nsw i32 [[TMP3]],
            [[ADD_1]]<br>
            -; CHECK-NEXT:    [[TMP4:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 4), align 16<br>
            -; CHECK-NEXT:    [[ADD_3:%.*]] = add nsw i32 [[TMP4]],
            [[ADD_2]]<br>
            -; CHECK-NEXT:    [[TMP5:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 5), align 4<br>
            -; CHECK-NEXT:    [[ADD_4:%.*]] = add nsw i32 [[TMP5]],
            [[ADD_3]]<br>
            -; CHECK-NEXT:    [[TMP6:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 6), align 8<br>
            -; CHECK-NEXT:    [[ADD_5:%.*]] = add nsw i32 [[TMP6]],
            [[ADD_4]]<br>
            -; CHECK-NEXT:    [[TMP7:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 7), align 4<br>
            -; CHECK-NEXT:    [[ADD_6:%.*]] = add nsw i32 [[TMP7]],
            [[ADD_5]]<br>
            -; CHECK-NEXT:    [[RES:%.*]] = call i32 @foobar(i32
            [[ADD_6]])<br>
            +; CHECK-NEXT:    [[TMP0:%.*]] = load <8 x i32>, <8
            x i32>* bitcast ([32 x i32]* @arr_i32 to <8 x
            i32>*), align 16<br>
            +; CHECK-NEXT:    [[ADD:%.*]] = add nsw i32 undef, undef<br>
            +; CHECK-NEXT:    [[ADD_1:%.*]] = add nsw i32 undef, [[ADD]]<br>
            +; CHECK-NEXT:    [[ADD_2:%.*]] = add nsw i32 undef,
            [[ADD_1]]<br>
            +; CHECK-NEXT:    [[ADD_3:%.*]] = add nsw i32 undef,
            [[ADD_2]]<br>
            +; CHECK-NEXT:    [[ADD_4:%.*]] = add nsw i32 undef,
            [[ADD_3]]<br>
            +; CHECK-NEXT:    [[ADD_5:%.*]] = add nsw i32 undef,
            [[ADD_4]]<br>
            +; CHECK-NEXT:    [[RDX_SHUF:%.*]] = shufflevector <8 x
            i32> [[TMP0]], <8 x i32> undef, <8 x i32>
            <i32 4, i32 5, i32 6, i32 7, i32 undef, i32 undef, i32
            undef, i32 undef><br>
            +; CHECK-NEXT:    [[BIN_RDX:%.*]] = add nsw <8 x i32>
            [[TMP0]], [[RDX_SHUF]]<br>
            +; CHECK-NEXT:    [[RDX_SHUF1:%.*]] = shufflevector <8 x
            i32> [[BIN_RDX]], <8 x i32> undef, <8 x i32>
            <i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32
            undef, i32 undef, i32 undef><br>
            +; CHECK-NEXT:    [[BIN_RDX2:%.*]] = add nsw <8 x i32>
            [[BIN_RDX]], [[RDX_SHUF1]]<br>
            +; CHECK-NEXT:    [[RDX_SHUF3:%.*]] = shufflevector <8 x
            i32> [[BIN_RDX2]], <8 x i32> undef, <8 x i32>
            <i32 1, i32 undef, i32 undef, i32 undef, i32 undef, i32
            undef, i32 undef, i32 undef><br>
            +; CHECK-NEXT:    [[BIN_RDX4:%.*]] = add nsw <8 x i32>
            [[BIN_RDX2]], [[RDX_SHUF3]]<br>
            +; CHECK-NEXT:    [[TMP1:%.*]] = extractelement <8 x
            i32> [[BIN_RDX4]], i32 0<br>
            +; CHECK-NEXT:    [[ADD_6:%.*]] = add nsw i32 undef,
            [[ADD_5]]<br>
            +; CHECK-NEXT:    [[RES:%.*]] = call i32 @foobar(i32
            [[TMP1]])<br>
             ; CHECK-NEXT:    ret void<br>
             ;<br>
             entry:<br>
            @@ -858,22 +858,22 @@ entry:<br>
             define void @i32_red_invoke(i32 %val) personality i32
            (...)* @__gxx_personality_v0 {<br>
             ; CHECK-LABEL: @i32_red_invoke(<br>
             ; CHECK-NEXT:  entry:<br>
            -; CHECK-NEXT:    [[TMP0:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 0), align 16<br>
            -; CHECK-NEXT:    [[TMP1:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 1), align 4<br>
            -; CHECK-NEXT:    [[ADD:%.*]] = add nsw i32 [[TMP1]],
            [[TMP0]]<br>
            -; CHECK-NEXT:    [[TMP2:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 2), align 8<br>
            -; CHECK-NEXT:    [[ADD_1:%.*]] = add nsw i32 [[TMP2]],
            [[ADD]]<br>
            -; CHECK-NEXT:    [[TMP3:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 3), align 4<br>
            -; CHECK-NEXT:    [[ADD_2:%.*]] = add nsw i32 [[TMP3]],
            [[ADD_1]]<br>
            -; CHECK-NEXT:    [[TMP4:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 4), align 16<br>
            -; CHECK-NEXT:    [[ADD_3:%.*]] = add nsw i32 [[TMP4]],
            [[ADD_2]]<br>
            -; CHECK-NEXT:    [[TMP5:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 5), align 4<br>
            -; CHECK-NEXT:    [[ADD_4:%.*]] = add nsw i32 [[TMP5]],
            [[ADD_3]]<br>
            -; CHECK-NEXT:    [[TMP6:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 6), align 8<br>
            -; CHECK-NEXT:    [[ADD_5:%.*]] = add nsw i32 [[TMP6]],
            [[ADD_4]]<br>
            -; CHECK-NEXT:    [[TMP7:%.*]] = load i32, i32*
            getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32,
            i64 0, i64 7), align 4<br>
            -; CHECK-NEXT:    [[ADD_6:%.*]] = add nsw i32 [[TMP7]],
            [[ADD_5]]<br>
            -; CHECK-NEXT:    [[RES:%.*]] = invoke i32 @foobar(i32
            [[ADD_6]])<br>
            +; CHECK-NEXT:    [[TMP0:%.*]] = load <8 x i32>, <8
            x i32>* bitcast ([32 x i32]* @arr_i32 to <8 x
            i32>*), align 16<br>
            +; CHECK-NEXT:    [[ADD:%.*]] = add nsw i32 undef, undef<br>
            +; CHECK-NEXT:    [[ADD_1:%.*]] = add nsw i32 undef, [[ADD]]<br>
            +; CHECK-NEXT:    [[ADD_2:%.*]] = add nsw i32 undef,
            [[ADD_1]]<br>
            +; CHECK-NEXT:    [[ADD_3:%.*]] = add nsw i32 undef,
            [[ADD_2]]<br>
            +; CHECK-NEXT:    [[ADD_4:%.*]] = add nsw i32 undef,
            [[ADD_3]]<br>
            +; CHECK-NEXT:    [[ADD_5:%.*]] = add nsw i32 undef,
            [[ADD_4]]<br>
            +; CHECK-NEXT:    [[RDX_SHUF:%.*]] = shufflevector <8 x
            i32> [[TMP0]], <8 x i32> undef, <8 x i32>
            <i32 4, i32 5, i32 6, i32 7, i32 undef, i32 undef, i32
            undef, i32 undef><br>
            +; CHECK-NEXT:    [[BIN_RDX:%.*]] = add nsw <8 x i32>
            [[TMP0]], [[RDX_SHUF]]<br>
            +; CHECK-NEXT:    [[RDX_SHUF1:%.*]] = shufflevector <8 x
            i32> [[BIN_RDX]], <8 x i32> undef, <8 x i32>
            <i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32
            undef, i32 undef, i32 undef><br>
            +; CHECK-NEXT:    [[BIN_RDX2:%.*]] = add nsw <8 x i32>
            [[BIN_RDX]], [[RDX_SHUF1]]<br>
            +; CHECK-NEXT:    [[RDX_SHUF3:%.*]] = shufflevector <8 x
            i32> [[BIN_RDX2]], <8 x i32> undef, <8 x i32>
            <i32 1, i32 undef, i32 undef, i32 undef, i32 undef, i32
            undef, i32 undef, i32 undef><br>
            +; CHECK-NEXT:    [[BIN_RDX4:%.*]] = add nsw <8 x i32>
            [[BIN_RDX2]], [[RDX_SHUF3]]<br>
            +; CHECK-NEXT:    [[TMP1:%.*]] = extractelement <8 x
            i32> [[BIN_RDX4]], i32 0<br>
            +; CHECK-NEXT:    [[ADD_6:%.*]] = add nsw i32 undef,
            [[ADD_5]]<br>
            +; CHECK-NEXT:    [[RES:%.*]] = invoke i32 @foobar(i32
            [[TMP1]])<br>
             ; CHECK-NEXT:    to label [[NORMAL:%.*]] unwind label
            [[EXCEPTION:%.*]]<br>
             ; CHECK:       exception:<br>
             ; CHECK-NEXT:    [[CLEANUP:%.*]] = landingpad i8<br>
            <br>
            Modified: llvm/trunk/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll<br>
            URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll?rev=310260&r1=310259&r2=310260&view=diff"
              rel="noreferrer" target="_blank" moz-do-not-send="true">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll?rev=310260&r1=310259&r2=310260&view=diff</a><br>
            ==============================================================================<br>
            --- llvm/trunk/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
            (original)<br>
            +++ llvm/trunk/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
            Mon Aug  7 08:25:49 2017<br>
            @@ -303,24 +303,30 @@ define <4 x float>
            @simple_select_no_use<br>
             ; CHECK-NEXT:    [[B1:%.*]] = extractelement <4 x
            float> %b, i32 1<br>
             ; CHECK-NEXT:    [[B2:%.*]] = extractelement <4 x
            float> %b, i32 2<br>
             ; CHECK-NEXT:    [[B3:%.*]] = extractelement <4 x
            float> %b, i32 3<br>
            -; CHECK-NEXT:    [[CMP0:%.*]] = icmp ne i32 [[C0]], 0<br>
            -; CHECK-NEXT:    [[CMP1:%.*]] = icmp ne i32 [[C1]], 0<br>
            -; CHECK-NEXT:    [[TMP1:%.*]] = insertelement <2 x
            i32> undef, i32 [[C2]], i32 0<br>
            -; CHECK-NEXT:    [[TMP2:%.*]] = insertelement <2 x
            i32> [[TMP1]], i32 [[C3]], i32 1<br>
            +; CHECK-NEXT:    [[TMP1:%.*]] = insertelement <2 x
            i32> undef, i32 [[C0]], i32 0<br>
            +; CHECK-NEXT:    [[TMP2:%.*]] = insertelement <2 x
            i32> [[TMP1]], i32 [[C1]], i32 1<br>
             ; CHECK-NEXT:    [[TMP3:%.*]] = icmp ne <2 x i32>
            [[TMP2]], zeroinitializer<br>
            -; CHECK-NEXT:    [[S0:%.*]] = select i1 [[CMP0]], float
            [[A0]], float [[B0]]<br>
            -; CHECK-NEXT:    [[S1:%.*]] = select i1 [[CMP1]], float
            [[A1]], float [[B1]]<br>
            -; CHECK-NEXT:    [[TMP4:%.*]] = insertelement <2 x
            float> undef, float [[A2]], i32 0<br>
            -; CHECK-NEXT:    [[TMP5:%.*]] = insertelement <2 x
            float> [[TMP4]], float [[A3]], i32 1<br>
            -; CHECK-NEXT:    [[TMP6:%.*]] = insertelement <2 x
            float> undef, float [[B2]], i32 0<br>
            -; CHECK-NEXT:    [[TMP7:%.*]] = insertelement <2 x
            float> [[TMP6]], float [[B3]], i32 1<br>
            -; CHECK-NEXT:    [[TMP8:%.*]] = select <2 x i1>
            [[TMP3]], <2 x float> [[TMP5]], <2 x float>
            [[TMP7]]<br>
            -; CHECK-NEXT:    [[RA:%.*]] = insertelement <4 x
            float> undef, float [[S0]], i32 0<br>
            -; CHECK-NEXT:    [[RB:%.*]] = insertelement <4 x
            float> [[RA]], float [[S1]], i32 1<br>
            -; CHECK-NEXT:    [[TMP9:%.*]] = extractelement <2 x
            float> [[TMP8]], i32 0<br>
            -; CHECK-NEXT:    [[RC:%.*]] = insertelement <4 x
            float> undef, float [[TMP9]], i32 2<br>
            -; CHECK-NEXT:    [[TMP10:%.*]] = extractelement <2 x
            float> [[TMP8]], i32 1<br>
            -; CHECK-NEXT:    [[RD:%.*]] = insertelement <4 x
            float> [[RC]], float [[TMP10]], i32 3<br>
            +; CHECK-NEXT:    [[TMP4:%.*]] = insertelement <2 x
            i32> undef, i32 [[C2]], i32 0<br>
            +; CHECK-NEXT:    [[TMP5:%.*]] = insertelement <2 x
            i32> [[TMP4]], i32 [[C3]], i32 1<br>
            +; CHECK-NEXT:    [[TMP6:%.*]] = icmp ne <2 x i32>
            [[TMP5]], zeroinitializer<br>
            +; CHECK-NEXT:    [[TMP7:%.*]] = insertelement <2 x
            float> undef, float [[A0]], i32 0<br>
            +; CHECK-NEXT:    [[TMP8:%.*]] = insertelement <2 x
            float> [[TMP7]], float [[A1]], i32 1<br>
            +; CHECK-NEXT:    [[TMP9:%.*]] = insertelement <2 x
            float> undef, float [[B0]], i32 0<br>
            +; CHECK-NEXT:    [[TMP10:%.*]] = insertelement <2 x
            float> [[TMP9]], float [[B1]], i32 1<br>
            +; CHECK-NEXT:    [[TMP11:%.*]] = select <2 x i1>
            [[TMP3]], <2 x float> [[TMP8]], <2 x float>
            [[TMP10]]<br>
            +; CHECK-NEXT:    [[TMP12:%.*]] = insertelement <2 x
            float> undef, float [[A2]], i32 0<br>
            +; CHECK-NEXT:    [[TMP13:%.*]] = insertelement <2 x
            float> [[TMP12]], float [[A3]], i32 1<br>
            +; CHECK-NEXT:    [[TMP14:%.*]] = insertelement <2 x
            float> undef, float [[B2]], i32 0<br>
            +; CHECK-NEXT:    [[TMP15:%.*]] = insertelement <2 x
            float> [[TMP14]], float [[B3]], i32 1<br>
            +; CHECK-NEXT:    [[TMP16:%.*]] = select <2 x i1>
            [[TMP6]], <2 x float> [[TMP13]], <2 x float>
            [[TMP15]]<br>
            +; CHECK-NEXT:    [[TMP17:%.*]] = extractelement <2 x
            float> [[TMP11]], i32 0<br>
            +; CHECK-NEXT:    [[RA:%.*]] = insertelement <4 x
            float> undef, float [[TMP17]], i32 0<br>
            +; CHECK-NEXT:    [[TMP18:%.*]] = extractelement <2 x
            float> [[TMP11]], i32 1<br>
            +; CHECK-NEXT:    [[RB:%.*]] = insertelement <4 x
            float> [[RA]], float [[TMP18]], i32 1<br>
            +; CHECK-NEXT:    [[TMP19:%.*]] = extractelement <2 x
            float> [[TMP16]], i32 0<br>
            +; CHECK-NEXT:    [[RC:%.*]] = insertelement <4 x
            float> undef, float [[TMP19]], i32 2<br>
            +; CHECK-NEXT:    [[TMP20:%.*]] = extractelement <2 x
            float> [[TMP16]], i32 1<br>
            +; CHECK-NEXT:    [[RD:%.*]] = insertelement <4 x
            float> [[RC]], float [[TMP20]], i32 3<br>
             ; CHECK-NEXT:    ret <4 x float> [[RD]]<br>
             ;<br>
             ; ZEROTHRESH-LABEL: @simple_select_no_users(<br>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>