<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jan 23, 2017, at 3:48 PM, Sanjay Patel via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class=""><div class="">All targets are likely affected in some way by the icmp+shl fold introduced with r292492. It's a basic pattern that occurs in lots of code. Did you see any perf wins on your targets with this commit?<br class=""><br class=""></div>Sadly, it is also likely that many (all?) targets are negatively impacted on the particular test (SingleSource/Benchmarks/Shootout/sieve) that you have pointed out here, because the IR is now decidedly worse.<br class=""><br class="">IMO, we should not revert the commit merely because it exposed shortcomings in the optimizer. It's an "obvious" fold/canonicalization, and the related 'nuw' variant of this fold has existed in trunk since:<br class=""><a href="https://reviews.llvm.org/rL285729" class="">https://reviews.llvm.org/rL285729</a><br class=""><br class=""></div>We need to dissect which analyses/folds are missing in order to restore the IR to the better form that existed before, but this is probably going to be a long process because we treat min/max like an optimization fence. 
<br class=""></div></div></blockquote><div><br class=""></div><div>If recovering from this is going to be a long process, it looks like something that should be reverted in the 4.0 branch (unless I missed that a correctness fix is involved?).</div><div><br class=""></div><div>-- </div><div>Mehdi</div><div><br class=""></div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><br class=""><div class=""><br class=""></div></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Mon, Jan 23, 2017 at 11:13 AM, Evgeny Astigeevich <span dir="ltr" class="">&lt;<a href="mailto:Evgeny.Astigeevich@arm.com" target="_blank" class="">Evgeny.Astigeevich@arm.com</a>&gt;</span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div link="blue" vlink="purple" lang="EN-US" class="">
<div class="m_5791940775744498920WordSection1"><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">I can confirm that there is no change in the IR when the hack is disabled in the sources.<u class=""></u><u class=""></u></span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">David wrote that these instructions are created by SCEV.<u class=""></u><u class=""></u></span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">Are other targets affected by the changes, e.g. X86?<u class=""></u><u class=""></u></span></p><span class=""><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""><u class=""></u> <u class=""></u></span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">Kind regards,<br class="">
Evgeny Astigeevich<br class="">
Senior Compiler Engineer<br class="">
Compilation Tools<br class="">
ARM</span><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""><u class=""></u><u class=""></u></span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""><u class=""></u> <u class=""></u></span></p>
</span><div style="border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt" class="">
<div class="">
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm" class=""><p class="MsoNormal"><b class=""><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" class="">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" class=""> Sanjay Patel [mailto:<a href="mailto:spatel@rotateright.com" target="_blank" class="">spatel@rotateright.com</a><wbr class="">]
<br class="">
<b class="">Sent:</b> Sunday, January 22, 2017 10:45 PM</span></p><div class=""><div class="h5"><br class="">
<b class="">To:</b> Evgeny Astigeevich<br class="">
<b class="">Cc:</b> llvm-dev; nd<br class="">
<b class="">Subject:</b> Re: [InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines<u class=""></u><u class=""></u></div></div><div class=""><br class="webkit-block-placeholder"></div>
</div>
</div><div class=""><div class="h5"><p class="MsoNormal"><u class=""></u> <u class=""></u></p>
<div class=""><p class="MsoNormal">I tried an experiment to remove the integer min/max bailouts from InstCombine, and it doesn't appear to change the IR in the attachment, so I doubt there's going to be any improvement.<br class="">
<br class="">
If I haven't messed up this example, this is amazing:<br class="">
<a href="https://godbolt.org/g/yzoxeY" target="_blank" class="">https://godbolt.org/g/yzoxeY</a><u class=""></u><u class=""></u></p>
</div>
<div class=""><p class="MsoNormal"><u class=""></u> <u class=""></u></p>
<div class=""><p class="MsoNormal">On Sun, Jan 22, 2017 at 1:06 PM, Evgeny Astigeevich &lt;<a href="mailto:Evgeny.Astigeevich@arm.com" target="_blank" class="">Evgeny.Astigeevich@arm.com</a>&gt; wrote:<u class=""></u><u class=""></u></p>
<div class="">
<div class=""><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">Thank you for the information.</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">I’ll build clang without the hack and re-run the benchmark tomorrow.</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""> </span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">-Evgeny</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""> </span><u class=""></u><u class=""></u></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt" class="">
<div class="">
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm" class=""><p class="MsoNormal"><b class=""><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" class="">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" class=""> Sanjay Patel [mailto:<a href="mailto:spatel@rotateright.com" target="_blank" class="">spatel@rotateright.com</a><wbr class="">]
<br class="">
<b class="">Sent:</b> Sunday, January 22, 2017 8:00 PM<br class="">
<b class="">To:</b> Evgeny Astigeevich<br class="">
<b class="">Cc:</b> llvm-dev; nd</span><u class=""></u><u class=""></u></p>
<div class="">
<div class=""><p class="MsoNormal"><br class="">
<b class="">Subject:</b> Re: [InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines<u class=""></u><u class=""></u></p>
</div>
</div>
</div>
</div>
<div class="">
<div class=""><p class="MsoNormal"> <u class=""></u><u class=""></u></p>
<div class=""><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">> Do you mean to remove the hack in InstCombiner::visitICmpInst()?</span><u class=""></u><u class=""></u></p><p class="MsoNormal"> <u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">Yes. Although (this just came up in D28625 too) we might need to remove multiple versions of that
in order to unlock optimization:</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""><a href="https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCompares.cpp#L4338" target="_blank" class="">https://github.com/llvm-<wbr class="">mirror/llvm/blob/master/lib/<wbr class="">Transforms/InstCombine/<wbr class="">InstCombineCompares.cpp#L4338</a></span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""><a href="https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCasts.cpp#L470" target="_blank" class="">https://github.com/llvm-<wbr class="">mirror/llvm/blob/master/lib/<wbr class="">Transforms/InstCombine/<wbr class="">InstCombineCasts.cpp#L470</a></span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""><a href="https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstructionCombining.cpp#L803" target="_blank" class="">https://github.com/llvm-<wbr class="">mirror/llvm/blob/master/lib/<wbr class="">Transforms/InstCombine/<wbr class="">InstructionCombining.cpp#L803</a></span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""><a href="https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp#L409" target="_blank" class="">https://github.com/llvm-<wbr class="">mirror/llvm/blob/master/lib/<wbr class="">Transforms/InstCombine/<wbr class="">InstCombineSimplifyDemanded.<wbr class="">cpp#L409</a></span><u class=""></u><u class=""></u></p>
<div class=""><p class="MsoNormal" style="margin-bottom:12.0pt"><br class="">
Similar for FP:<br class="">
<span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""><a href="https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCompares.cpp#L4780" target="_blank" class="">https://github.com/llvm-<wbr class="">mirror/llvm/blob/master/lib/<wbr class="">Transforms/InstCombine/<wbr class="">InstCombineCompares.cpp#L4780</a></span><br class="">
<span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""><a href="https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCasts.cpp#L1376" target="_blank" class="">https://github.com/llvm-<wbr class="">mirror/llvm/blob/master/lib/<wbr class="">Transforms/InstCombine/<wbr class="">InstCombineCasts.cpp#L1376</a></span><u class=""></u><u class=""></u></p>
</div>
<div class=""><p class="MsoNormal"> <u class=""></u><u class=""></u></p>
<div class=""><p class="MsoNormal">On Sun, Jan 22, 2017 at 12:40 PM, Evgeny Astigeevich &lt;<a href="mailto:Evgeny.Astigeevich@arm.com" target="_blank" class="">Evgeny.Astigeevich@arm.com</a>&gt; wrote:<u class=""></u><u class=""></u></p>
<div class="">
<div class=""><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">Hi Sanjay,</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""> </span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">The benchmark source file:
<a href="http://www.llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Benchmarks/Shootout/sieve.c?view=markup" target="_blank" class="">
http://www.llvm.org/viewvc/<wbr class="">llvm-project/test-suite/trunk/<wbr class="">SingleSource/Benchmarks/<wbr class="">Shootout/sieve.c?view=markup</a></span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">Clang options used to produce the initial IR: clang -DNDEBUG -O3 -DNDEBUG -mcpu=cortex-a53 -fomit-frame-pointer
-O3 -DNDEBUG -w -Werror=date-time -c sieve.c -S -emit-llvm -mllvm -disable-llvm-optzns --target=aarch64-arm-linux</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">Opt options: opt -O3 -o /dev/null -print-before-all -print-after-all sieve.ll >& sieve.log</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""> </span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">I used the IR (in the attached sieve.zip) created with the r292487 version.</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">The attached sieve.zip also contains the output of ‘-print-before-all -print-after-all’ for r292487 and rL292492.</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""> </span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">>
</span>If it's possible, can you remove that check locally, rebuild,<u class=""></u><u class=""></u></p><p class="MsoNormal">> and try the benchmark again on your system? I'd love to know<u class=""></u><u class=""></u></p><p class="MsoNormal">> if that change alone would solve the problem.<u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""> </span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">Do you mean to remove the hack in InstCombiner::visitICmpInst()?</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""> </span><u class=""></u><u class=""></u></p><p class="MsoNormal" style="margin-bottom:12.0pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class="">Kind regards,<br class="">
Evgeny Astigeevich<br class="">
Senior Compiler Engineer<br class="">
Compilation Tools<br class="">
ARM</span><u class=""></u><u class=""></u></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d" class=""> </span><u class=""></u><u class=""></u></p>
<div style="border:none;border-left:solid windowtext 1.5pt;padding:0cm 0cm 0cm 4.0pt;border-color:-moz-use-text-color -moz-use-text-color -moz-use-text-color blue" class="">
<div class="">
<div style="border:none;border-top:solid windowtext 1.0pt;padding:3.0pt 0cm 0cm 0cm;border-color:-moz-use-text-color -moz-use-text-color" class=""><p class="MsoNormal"><b class=""><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" class="">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"" class=""> Sanjay Patel [mailto:<a href="mailto:spatel@rotateright.com" target="_blank" class="">spatel@rotateright.com</a><wbr class="">]
<br class="">
<b class="">Sent:</b> Friday, January 20, 2017 6:16 PM<br class="">
<b class="">To:</b> Evgeny Astigeevich<br class="">
<b class="">Cc:</b> llvm-dev; Renato Golin; <a href="mailto:t.p.northover@gmail.com" target="_blank" class="">
t.p.northover@gmail.com</a>; <a href="mailto:hfinkel@anl.gov" target="_blank" class="">hfinkel@anl.gov</a><br class="">
<b class="">Subject:</b> Re: [InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines</span><u class=""></u><u class=""></u></p>
</div>
</div>
<div class="">
<div class=""><p class="MsoNormal"> <u class=""></u><u class=""></u></p>
<div class="">
<div class="">
<div class="">
<div class="">
<div class="">
<div class="">
<div class="">
<div class=""><p class="MsoNormal">Thanks for letting me know about this problem!<br class="">
<br class="">
There's no 'shl nsw' visible in the earlier (r292487) code, so it would be better to see exactly what the IR looks like before that added transform fires.<u class=""></u><u class=""></u></p>
</div><p class="MsoNormal" style="margin-bottom:12.0pt"><br class="">
But I see a red flag:<br class="">
%smax = select i1 %11, i64 %10, i64 8193<u class=""></u><u class=""></u></p>
</div><p class="MsoNormal" style="margin-bottom:12.0pt">The new icmp transform allowed us to create an smax, but we have this hack in InstCombiner::visitICmpInst():<br class="">
<br class="">
// Test if the ICmpInst instruction is used exclusively by a select as<br class="">
// part of a minimum or maximum operation. If so, refrain from doing<br class="">
// any other folding. This helps out other analyses which understand<br class="">
// non-obfuscated minimum and maximum idioms, such as ScalarEvolution<br class="">
// and CodeGen. And in this case, at least one of the comparison<br class="">
// operands has at least one user besides the compare (the select),<br class="">
// which would often largely negate the benefit of folding anyway.<u class=""></u><u class=""></u></p>
</div><p class="MsoNormal" style="margin-bottom:12.0pt">...so that prevented folding the icmp into the earlier math.<u class=""></u><u class=""></u></p>
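<p class="MsoNormal">For reference, the icmp transform mentioned above is the one named in the D28406 title: icmp sgt (shl nsw X, C1), C0 --> icmp sgt X, C0 >> C1. The following is only a minimal Python sketch that brute-forces that identity at 8 bits; it is not LLVM code and says nothing about the extra conditions the real InstCombine implementation may check. The names BITS, fits and count_mismatches are hypothetical, and 'nsw' is modelled by skipping inputs whose shifted value no longer fits in a signed 8-bit integer.</p>
<pre>
```python
# Brute-force check of the identity from the D28406 title at 8 bits.
# Assumption: 'shl nsw' is modelled as "the shifted value still fits in i8".
BITS = 8
LO = -(2 ** (BITS - 1))
HI = 2 ** (BITS - 1) - 1


def fits(value):
    """True when value is representable as a signed BITS-wide integer."""
    return value == max(LO, min(HI, value))


def count_mismatches():
    """Count (X, C1, C0) triples where the folded compare disagrees."""
    mismatches = 0
    for shift in range(BITS):
        for x in range(LO, HI + 1):
            shifted = x * 2 ** shift
            if not fits(shifted):
                continue  # signed overflow: nsw makes this poison, so skip it
            for c0 in range(LO, HI + 1):
                before = shifted > c0      # icmp sgt (shl nsw X, C1), C0
                after = x > (c0 >> shift)  # icmp sgt X, C0 >> C1 (arithmetic shift)
                if before != after:
                    mismatches += 1
    return mismatches


print(count_mismatches())  # prints 0
```
</pre>
<p class="MsoNormal">Python's right shift on negative integers rounds toward negative infinity, which matches an arithmetic shift; that is the behaviour this sketch assumes for "C0 >> C1".</p>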
</div><p class="MsoNormal">I am actively working on trying to get rid of that bail-out by improving min/max value tracking and icmp/select folding. In fact, we might be able to remove it right now, but I
don't know the history of that code or what cases it was supposed to help.<u class=""></u><u class=""></u></p>
</div>
</div>
</div>
<div class="">
<div class="">
<div class="">
<div class=""><p class="MsoNormal"> <u class=""></u><u class=""></u></p>
</div>
<div class=""><p class="MsoNormal">If it's possible, can you remove that check locally, rebuild, and try the benchmark again on your system? I'd love to know if that change alone would solve the problem.<u class=""></u><u class=""></u></p>
</div>
<div class=""><p class="MsoNormal"> <u class=""></u><u class=""></u></p>
<div class="">
<div class="">
<div class="">
<div class="">
<div class=""><p class="MsoNormal"> <u class=""></u><u class=""></u></p>
<div class=""><p class="MsoNormal">On Fri, Jan 20, 2017 at 10:11 AM, Evgeny Astigeevich <<a href="mailto:Evgeny.Astigeevich@arm.com" target="_blank" class="">Evgeny.Astigeevich@arm.com</a>> wrote:<u class=""></u><u class=""></u></p><p class="MsoNormal" style="margin-bottom:12.0pt">Hi,<br class="">
<br class="">
We found that today's 17.30%/11.37% performance regressions in LNT SingleSource/Benchmarks/<wbr class="">Shootout/sieve on LNT-AArch64-A53-O3__clang_DEV_<wbr class="">_aarch64 and LNT-Thumb2v7-A15-O3__clang_<wbr class="">DEV__thumbv7 (<a href="http://llvm.org/perf/db_default/v4/nts/daily_report/2017/1/20?filter-machine-regex=aarch64%7Carm%7Cthumb%7Cgreen" target="_blank" class="">http://llvm.org/perf/db_<wbr class="">default/v4/nts/daily_report/<wbr class="">2017/1/20?filter-machine-<wbr class="">regex=aarch64%7Carm%7Cthumb%<wbr class="">7Cgreen</a>)
are caused by changes [rL292492] in InstCombine:<br class="">
<br class="">
<a href="https://reviews.llvm.org/D28406" target="_blank" class="">https://reviews.llvm.org/<wbr class="">D28406</a> "[InstCombine] icmp sgt (shl nsw X, C1), C0 --> icmp sgt X, C0 >> C1"<br class="">
<br class="">
The Loop Vectorizer generates code with more instructions:<br class="">
<br class="">
==== Loop Vectorizer from rL292492 ====<br class="">
for.body5: ; preds = %for.inc16.for.body5_crit_edge, %for.cond.preheader<br class="">
%indvar = phi i64 [ %indvar.next, %for.inc16.for.body5_crit_edge ], [ 0, %for.cond.preheader ]<br class="">
%1 = phi i8 [ %.pre, %for.inc16.for.body5_crit_edge ], [ 1, %for.cond.preheader ]<br class="">
%count.122 = phi i32 [ %count.2, %for.inc16.for.body5_crit_edge ], [ 0, %for.cond.preheader ]<br class="">
%i.119 = phi i64 [ %inc17, %for.inc16.for.body5_crit_edge ], [ 2, %for.cond.preheader ]<br class="">
%2 = add i64 %indvar, 2<br class="">
%3 = shl i64 %indvar, 1<br class="">
%4 = add i64 %3, 4<br class="">
%5 = add i64 %indvar, 2<br class="">
%6 = shl i64 %indvar, 1<br class="">
%7 = add i64 %6, 4<br class="">
%8 = add i64 %indvar, 2<br class="">
%9 = mul i64 %indvar, 3<br class="">
%10 = add i64 %9, 6<br class="">
%11 = icmp sgt i64 %10, 8193<br class="">
%smax = select i1 %11, i64 %10, i64 8193<br class="">
%12 = mul i64 %indvar, -2<br class="">
%13 = add i64 %12, -5<br class="">
%14 = add i64 %smax, %13<br class="">
%15 = add i64 %indvar, 2<br class="">
%16 = udiv i64 %14, %15<br class="">
%17 = add i64 %16, 1<br class="">
%tobool7 = icmp eq i8 %1, 0<br class="">
br i1 %tobool7, label %for.inc16, label %if.then<br class="">
==============================<wbr class="">==<br class="">
<br class="">
The code generated by the Loop Vectorizer before the changes:<br class="">
<br class="">
==== Loop Vectorizer from rL292487 ====<br class="">
for.body5: ; preds = %for.inc16.for.body5_crit_edge, %for.cond.preheader<br class="">
%indvar = phi i64 [ %indvar.next, %for.inc16.for.body5_crit_edge ], [ 0, %for.cond.preheader ]<br class="">
%1 = phi i8 [ %.pre, %for.inc16.for.body5_crit_edge ], [ 1, %for.cond.preheader ]<br class="">
%count.122 = phi i32 [ %count.2, %for.inc16.for.body5_crit_edge ], [ 0, %for.cond.preheader ]<br class="">
%i.119 = phi i64 [ %inc17, %for.inc16.for.body5_crit_edge ], [ 2, %for.cond.preheader ]<br class="">
%2 = add i64 %indvar, 2<br class="">
%3 = shl i64 %indvar, 1<br class="">
%4 = add i64 %3, 4<br class="">
%5 = add i64 %indvar, 2<br class="">
%6 = shl i64 %indvar, 1<br class="">
%7 = add i64 %6, 4<br class="">
%8 = add i64 %indvar, 2<br class="">
%9 = mul i64 %indvar, -2<br class="">
%10 = add i64 %9, 8188<br class="">
%11 = add i64 %indvar, 2<br class="">
%12 = udiv i64 %10, %11<br class="">
%13 = add i64 %12, 1<br class="">
%tobool7 = icmp eq i8 %1, 0<br class="">
br i1 %tobool7, label %for.inc16, label %if.then<br class="">
==============================<wbr class="">==<br class="">
<br class="">
I have not yet investigated why the behaviour of the Vectorizer changed.<br class="">
<br class="">
Kind regards,<br class="">
Evgeny Astigeevich<br class="">
Senior Compiler Engineer<br class="">
Compilation Tools<br class="">
ARM<u class=""></u><u class=""></u></p>
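<p class="MsoNormal">To make the difference between the two dumps above concrete, here is a minimal Python sketch that evaluates both expression chains using only the constants visible in the IR. It does not model SCEV or the vectorizer. Two things are assumptions rather than facts from this thread: that i = %indvar + 2 (taken from the phi start values), and that the value is only needed when 2*i stays at or below 8192. All function names are hypothetical.</p>
<pre>
```python
# Evaluate the expression chains from the two quoted Loop Vectorizer dumps.
# Assumptions: i = indvar + 2, and only indvar values with 2*i at or below
# 8192 matter. Numerators are non-negative in that range, so '//' acts as udiv.
LIMIT = 8192


def numerator_before(indvar):
    # r292487: %10 = add (mul %indvar, -2), 8188
    return -2 * indvar + 8188


def numerator_after(indvar):
    # r292492: %smax = smax(3 * %indvar + 6, 8193); %14 = %smax + (-2 * %indvar - 5)
    smax = max(3 * indvar + 6, 8193)
    return smax + (-2 * indvar - 5)


def final_value(numerator, indvar):
    # Both dumps end with: udiv by (%indvar + 2), then add 1
    return numerator // (indvar + 2) + 1


def first_difference():
    """Return the first indvar where the final values differ, or None."""
    for indvar in range(0, LIMIT // 2 - 1):
        before = final_value(numerator_before(indvar), indvar)
        after = final_value(numerator_after(indvar), indvar)
        if before != after:
            return indvar
    return None


print(first_difference())  # prints None
```
</pre>
<p class="MsoNormal">Under those assumptions the final values agree across the whole range, even though the numerators themselves diverge once 3 * %indvar + 6 exceeds 8193. The newer form simply spends an extra mul, add, icmp and select to get there.</p>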
</div><p class="MsoNormal"> <u class=""></u><u class=""></u></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div><p class="MsoNormal"> <u class=""></u><u class=""></u></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div><p class="MsoNormal"><u class=""></u> <u class=""></u></p>
</div>
</div></div></div>
</div>
</div>
</blockquote></div><br class=""></div>
_______________________________________________<br class="">LLVM Developers mailing list<br class=""><a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a><br class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev<br class=""></div></blockquote></div><br class=""></body></html>