<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">“ExpandShiftByConstant” was written with the assumption that zero shifts would always be optimized away. The code doesn’t check for it nor does it produce correct code when that happens. I added the assert to make sure we catch this case in the future when someone violates this implicit contract/assumption of this method.<div class=""><br class=""></div><div class="">—Juergen<br class=""><div class=""><div class=""><br class=""><div class=""><div><blockquote type="cite" class=""><div class="">On Mar 5, 2015, at 11:24 AM, Sanjoy Das <<a href="mailto:sanjoy@playingwithpointers.com" class="">sanjoy@playingwithpointers.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">+CC Juergen since he added the assert</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">This looks like a codegen issue.  We end up with a SHL with an Amt of</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">zero and trip an assert on that.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> $ ./bin/llc -mcpu=yonah fail.ll</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">will reproduce the crash.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">I don't have the context to debug this though -- why is the assert</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">expected to not trip?  What is the invariant that transforms over and</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">construction of SDNodes are supposed to preserve?</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">AFAICT, in the current case, the shift amount starts off being a</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">EXTRACT_VECTOR_ELT of a BUILD_VECTOR that gets constant folded to 0.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">The reproducing .ll file is attached.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">-- Sanjoy</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">On Wed, Mar 4, 2015 at 9:36 PM, Sanjoy Das</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""><</span><a href="mailto:sanjoy@playingwithpointers.com" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">sanjoy@playingwithpointers.com</a><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">> wrote:</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Hi Michael,<br class=""><br class="">I'm looking at this now.<br class=""><br class="">-- Sanjoy<br class=""><br class="">On Wed, Mar 4, 2015 at 9:21 PM, Michael Zolotukhin<br class=""><<a href="mailto:mzolotukhin@apple.com" class="">mzolotukhin@apple.com</a>> wrote:<br class=""><blockquote type="cite" class="">Hi Sanjoy,<br class=""><br class="">This change introduced (or exposed) some assertion failures. Could you<br class="">please take a look? I’ll try too, but you probably could find the rootcause<br class="">faster.<br class=""><br class="">Reproducer:<br class=""><blockquote type="cite" class="">cat fail.c<br class=""></blockquote>typedef unsigned long long uint64_t;<br class="">uint64_t foo(unsigned char *a, unsigned l) {<br class=""> uint64_t Val = 0;<br class=""> for (unsigned i = 0; i != l; ++i)<br class="">   Val |= (uint64_t)a[i] << i*8;<br class=""> return Val;<br class="">}<br class=""><br class=""><blockquote type="cite" class="">clang -target i686 -march=yonah -O3 -c -gdwarf-2 fail.c<br class=""></blockquote>Assertion failed: (Amt && "Expected zero shifts to be already optimized<br class="">away."), function ExpandShiftByConstant, file<br class="">…llvm.git/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp, line 1420.<br class="">0  clang-3.7                0x000000010d6b2e0e<br class="">llvm::sys::PrintStackTrace(__sFILE*) + 46<br class="">1  clang-3.7                0x000000010d6b41cb<br class="">PrintStackTraceSignalHandler(void*) + 27<br class="">2  clang-3.7                0x000000010d6b4615 SignalHandler(int) + 565<br class="">3  libsystem_platform.dylib 0x00007fff95515f1a _sigtramp + 26<br class="">4  clang-3.7                0x0000000110668d68 guard variable for<br class="">isAllowedIDChar(unsigned int, clang::LangOptions const&)::C99AllowedIDChars<br class="">+ 97488<br class="">5  clang-3.7                0x000000010d6b41fb raise + 27<br class="">6  clang-3.7                0x000000010d6b42b2 abort + 18<br class="">7  clang-3.7                0x000000010d6b4291 __assert_rtn + 129<br class="">8  clang-3.7                0x000000010dd3e943<br class="">llvm::DAGTypeLegalizer::ExpandShiftByConstant(llvm::SDNode*, unsigned int,<br class="">llvm::SDValue&, llvm::SDValue&) + 163<br class="">9  clang-3.7                0x000000010dd3b550<br class="">llvm::DAGTypeLegalizer::ExpandIntRes_Shift(llvm::SDNode*, llvm::SDValue&,<br class="">llvm::SDValue&) + 240<br class="">10 clang-3.7                0x000000010dd311a5<br class="">llvm::DAGTypeLegalizer::ExpandIntegerResult(llvm::SDNode*, unsigned int) +<br class="">3221<br class=""><br class="">It works for r229730 and fails for r229731.<br class=""><br class="">Thanks,<br class="">Michael<br class=""><br class="">On Feb 18, 2015, at 11:32 AM, Sanjoy Das <<a href="mailto:sanjoy@playingwithpointers.com" class="">sanjoy@playingwithpointers.com</a>><br class="">wrote:<br class=""><br class="">Author: sanjoy<br class="">Date: Wed Feb 18 13:32:25 2015<br class="">New Revision: 229731<br class=""><br class="">URL: <a href="http://llvm.org/viewvc/llvm-project?rev=229731&view=rev" class="">http://llvm.org/viewvc/llvm-project?rev=229731&view=rev</a><br class="">Log:<br class="">Partial fix for bug 22589<br class=""><br class="">Don't spend the entire iteration space in the scalar loop prologue if<br class="">computing the trip count overflows.  This change also gets rid of the<br class="">backedge check in the prologue loop and the extra check for<br class="">overflowing trip-count.<br class=""><br class="">Differential Revision: <a href="http://reviews.llvm.org/D7715" class="">http://reviews.llvm.org/D7715</a><br class=""><br class=""><br class="">Modified:<br class="">  llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp<br class="">  llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll<br class="">  llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll<br class="">  llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll<br class=""><br class="">Modified: llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp<br class="">URL:<br class=""><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp?rev=229731&r1=229730&r2=229731&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp?rev=229731&r1=229730&r2=229731&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp (original)<br class="">+++ llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp Wed Feb 18<br class="">13:32:25 2015<br class="">@@ -58,7 +58,7 @@ STATISTIC(NumRuntimeUnrolled,<br class="">/// - Branch around the original loop if the trip count is less<br class="">///   than the unroll factor.<br class="">///<br class="">-static void ConnectProlog(Loop *L, Value *TripCount, unsigned Count,<br class="">+static void ConnectProlog(Loop *L, Value *BECount, unsigned Count,<br class="">                         BasicBlock *LastPrologBB, BasicBlock *PrologEnd,<br class="">                         BasicBlock *OrigPH, BasicBlock *NewPH,<br class="">                         ValueToValueMapTy &VMap, AliasAnalysis *AA,<br class="">@@ -109,12 +109,19 @@ static void ConnectProlog(Loop *L, Value<br class="">   }<br class=""> }<br class=""><br class="">-  // Create a branch around the orignal loop, which is taken if the<br class="">-  // trip count is less than the unroll factor.<br class="">+  // Create a branch around the orignal loop, which is taken if there are<br class="">no<br class="">+  // iterations remaining to be executed after running the prologue.<br class=""> Instruction *InsertPt = PrologEnd->getTerminator();<br class="">+<br class="">+  assert(Count != 0 && "nonsensical Count!");<br class="">+<br class="">+  // If BECount <u (Count - 1) then (BECount + 1) & (Count - 1) == (BECount<br class="">+ 1)<br class="">+  // (since Count is a power of 2).  This means %xtraiter is (BECount + 1)<br class="">and<br class="">+  // and all of the iterations of this loop were executed by the prologue.<br class="">Note<br class="">+  // that if BECount <u (Count - 1) then (BECount + 1) cannot<br class="">unsigned-overflow.<br class=""> Instruction *BrLoopExit =<br class="">-    new ICmpInst(InsertPt, ICmpInst::ICMP_ULT, TripCount,<br class="">-                 ConstantInt::get(TripCount->getType(), Count));<br class="">+    new ICmpInst(InsertPt, ICmpInst::ICMP_ULT, BECount,<br class="">+                 ConstantInt::get(BECount->getType(), Count - 1));<br class=""> BasicBlock *Exit = L->getUniqueExitBlock();<br class=""> assert(Exit && "Loop must have a single exit block only");<br class=""> // Split the exit to maintain loop canonicalization guarantees<br class="">@@ -291,23 +298,28 @@ bool llvm::UnrollRuntimeLoopProlog(Loop<br class=""><br class=""> // Only unroll loops with a computable trip count and the trip count needs<br class=""> // to be an int value (allowing a pointer type is a TODO item)<br class="">-  const SCEV *BECount = SE->getBackedgeTakenCount(L);<br class="">-  if (isa<SCEVCouldNotCompute>(BECount) ||<br class="">!BECount->getType()->isIntegerTy())<br class="">+  const SCEV *BECountSC = SE->getBackedgeTakenCount(L);<br class="">+  if (isa<SCEVCouldNotCompute>(BECountSC) ||<br class="">+      !BECountSC->getType()->isIntegerTy())<br class="">   return false;<br class=""><br class="">-  // If BECount is INT_MAX, we can't compute trip-count without overflow.<br class="">-  if (BECount->isAllOnesValue())<br class="">-    return false;<br class="">+  unsigned BEWidth =<br class="">cast<IntegerType>(BECountSC->getType())->getBitWidth();<br class=""><br class=""> // Add 1 since the backedge count doesn't include the first loop iteration<br class=""> const SCEV *TripCountSC =<br class="">-    SE->getAddExpr(BECount, SE->getConstant(BECount->getType(), 1));<br class="">+    SE->getAddExpr(BECountSC, SE->getConstant(BECountSC->getType(), 1));<br class=""> if (isa<SCEVCouldNotCompute>(TripCountSC))<br class="">   return false;<br class=""><br class=""> // We only handle cases when the unroll factor is a power of 2.<br class=""> // Count is the loop unroll factor, the number of extra copies added + 1.<br class="">-  if ((Count & (Count-1)) != 0)<br class="">+  if (!isPowerOf2_32(Count))<br class="">+    return false;<br class="">+<br class="">+  // This constraint lets us deal with an overflowing trip count easily;<br class="">see the<br class="">+  // comment on ModVal below.  This check is equivalent to `Log2(Count) <<br class="">+  // BEWidth`.<br class="">+  if (static_cast<uint64_t>(Count) > (1ULL << BEWidth))<br class="">   return false;<br class=""><br class=""> // If this loop is nested, then the loop unroller changes the code in<br class="">@@ -333,16 +345,23 @@ bool llvm::UnrollRuntimeLoopProlog(Loop<br class=""> SCEVExpander Expander(*SE, "loop-unroll");<br class=""> Value *TripCount = Expander.expandCodeFor(TripCountSC,<br class="">TripCountSC->getType(),<br class="">                                           PreHeaderBR);<br class="">+  Value *BECount = Expander.expandCodeFor(BECountSC, BECountSC->getType(),<br class="">+                                          PreHeaderBR);<br class=""><br class=""> IRBuilder<> B(PreHeaderBR);<br class=""> Value *ModVal = B.CreateAnd(TripCount, Count - 1, "xtraiter");<br class=""><br class="">-  // Check if for no extra iterations, then jump to cloned/unrolled loop.<br class="">-  // We have to check that the trip count computation didn't overflow when<br class="">-  // adding one to the backedge taken count.<br class="">-  Value *LCmp = B.CreateIsNotNull(ModVal, "lcmp.mod");<br class="">-  Value *OverflowCheck = B.CreateIsNull(TripCount, "lcmp.overflow");<br class="">-  Value *BranchVal = B.CreateOr(OverflowCheck, LCmp, "lcmp.or");<br class="">+  // If ModVal is zero, we know that either<br class="">+  //  1. there are no iteration to be run in the prologue loop<br class="">+  // OR<br class="">+  //  2. the addition computing TripCount overflowed<br class="">+  //<br class="">+  // If (2) is true, we know that TripCount really is (1 << BEWidth) and so<br class="">the<br class="">+  // number of iterations that remain to be run in the original loop is a<br class="">+  // multiple Count == (1 << Log2(Count)) because Log2(Count) <= BEWidth<br class="">(we<br class="">+  // explicitly check this above).<br class="">+<br class="">+  Value *BranchVal = B.CreateIsNotNull(ModVal, "lcmp.mod");<br class=""><br class=""> // Branch to either the extra iterations or the cloned/unrolled loop<br class=""> // We will fix up the true branch label when adding loop body copies<br class="">@@ -365,10 +384,7 @@ bool llvm::UnrollRuntimeLoopProlog(Loop<br class=""> std::vector<BasicBlock *> NewBlocks;<br class=""> ValueToValueMapTy VMap;<br class=""><br class="">-  // If unroll count is 2 and we can't overflow in tripcount computation<br class="">(which<br class="">-  // is BECount + 1), then we don't need a loop for prologue, and we can<br class="">unroll<br class="">-  // it. We can be sure that we don't overflow only if tripcount is a<br class="">constant.<br class="">-  bool UnrollPrologue = (Count == 2 && isa<ConstantInt>(TripCount));<br class="">+  bool UnrollPrologue = Count == 2;<br class=""><br class=""> // Clone all the basic blocks in the loop. If Count is 2, we don't clone<br class=""> // the loop, otherwise we create a cloned loop to execute the extra<br class="">@@ -394,7 +410,7 @@ bool llvm::UnrollRuntimeLoopProlog(Loop<br class=""> // Connect the prolog code to the original loop and update the<br class=""> // PHI functions.<br class=""> BasicBlock *LastLoopBB = cast<BasicBlock>(VMap[Latch]);<br class="">-  ConnectProlog(L, TripCount, Count, LastLoopBB, PEnd, PH, NewPH, VMap,<br class="">+  ConnectProlog(L, BECount, Count, LastLoopBB, PEnd, PH, NewPH, VMap,<br class="">               /*AliasAnalysis*/ nullptr, DT, LI, LPM->getAsPass());<br class=""> NumRuntimeUnrolled++;<br class=""> return true;<br class=""><br class="">Modified: llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll<br class="">URL:<br class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll?rev=229731&r1=229730&r2=229731&view=diff<br class="">==============================================================================<br class="">--- llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll (original)<br class="">+++ llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll Wed Feb 18<br class="">13:32:25 2015<br class="">@@ -4,9 +4,7 @@<br class=""><br class="">; CHECK: %xtraiter = and i32 %n<br class="">; CHECK:  %lcmp.mod = icmp ne i32 %xtraiter, 0<br class="">-; CHECK:  %lcmp.overflow = icmp eq i32 %n, 0<br class="">-; CHECK:  %lcmp.or = or i1 %lcmp.overflow, %lcmp.mod<br class="">-; CHECK:  br i1 %lcmp.or, label %for.body.prol, label<br class="">%for.body.preheader.split<br class="">+; CHECK:  br i1 %lcmp.mod, label %for.body.prol, label<br class="">%for.body.preheader.split<br class=""><br class="">; CHECK: for.body.prol:<br class="">; CHECK: %indvars.iv.prol = phi i64 [ %indvars.iv.next.prol, %for.body.prol<br class="">], [ 0, %for.body.preheader ]<br class=""><br class="">Modified: llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll<br class="">URL:<br class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll?rev=229731&r1=229730&r2=229731&view=diff<br class="">==============================================================================<br class="">--- llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll (original)<br class="">+++ llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll Wed Feb 18<br class="">13:32:25 2015<br class="">@@ -3,7 +3,7 @@<br class="">; This tests that setting the unroll count works<br class=""><br class="">; CHECK: for.body.prol:<br class="">-; CHECK: br i1 %prol.iter.cmp, label %for.body.prol, label<br class="">%for.body.preheader.split<br class="">+; CHECK: br label %for.body.preheader.split<br class="">; CHECK: for.body:<br class="">; CHECK: br i1 %exitcond.1, label %for.end.loopexit.unr-lcssa, label<br class="">%for.body<br class="">; CHECK-NOT: br i1 %exitcond.4, label %for.end.loopexit{{.*}}, label<br class="">%for.body<br class=""><br class="">Modified: llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll<br class="">URL:<br class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll?rev=229731&r1=229730&r2=229731&view=diff<br class="">==============================================================================<br class="">--- llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll (original)<br class="">+++ llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll Wed Feb 18<br class="">13:32:25 2015<br class="">@@ -1,19 +1,28 @@<br class="">; RUN: opt < %s -S -unroll-runtime -unroll-count=2 -loop-unroll | FileCheck<br class="">%s<br class="">target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"<br class=""><br class="">-; When prologue is fully unrolled, the branch on its end is unconditional.<br class="">-; Unrolling it is illegal if we can't prove that trip-count+1 doesn't<br class="">overflow,<br class="">-; like in this example, where it comes from an argument.<br class="">-;<br class="">-; This test is based on an example from here:<br class="">-;<br class="">http://stackoverflow.com/questions/23838661/why-is-clang-optimizing-this-code-out<br class="">-;<br class="">+; This test case documents how runtime loop unrolling handles the case<br class="">+; when the backedge-count is -1.<br class="">+<br class="">+; If %N, the backedge-taken count, is -1 then %0 unsigned-overflows<br class="">+; and is 0.  %xtraiter too is 0, signifying that the total trip-count<br class="">+; is divisible by 2.  The prologue then branches to the unrolled loop<br class="">+; and executes the 2^32 iterations there, in groups of 2.<br class="">+<br class="">+<br class="">+; CHECK: entry:<br class="">+; CHECK-NEXT: %0 = add i32 %N, 1<br class="">+; CHECK-NEXT: %xtraiter = and i32 %0, 1<br class="">+; CHECK-NEXT: %lcmp.mod = icmp ne i32 %xtraiter, 0<br class="">+; CHECK-NEXT: br i1 %lcmp.mod, label %while.body.prol, label %entry.split<br class="">+<br class="">; CHECK: while.body.prol:<br class="">-; CHECK: br i1<br class="">+; CHECK: br label %entry.split<br class="">+<br class="">; CHECK: entry.split:<br class=""><br class="">; Function Attrs: nounwind readnone ssp uwtable<br class="">-define i32 @foo(i32 %N) #0 {<br class="">+define i32 @foo(i32 %N) {<br class="">entry:<br class=""> br label %while.body<br class=""><br class="">@@ -26,5 +35,3 @@ while.body:<br class="">while.end:                                        ; preds = %while.body<br class=""> ret i32 %i<br class="">}<br class="">-<br class="">-attributes #0 = { nounwind readnone ssp uwtable<br class="">"less-precise-fpmad"="false" "no-frame-pointer-elim"="true"<br class="">"no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false"<br class="">"no-nans-fp-math"="false" "stack-protector-buffer-size"="8"<br class="">"unsafe-fp-math"="false" "use-soft-float"="false" }<br class=""><br class=""><br class="">_______________________________________________<br class="">llvm-commits mailing list<br class="">llvm-commits@cs.uiuc.edu<br class="">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits<br class=""><br class=""><br class=""></blockquote></blockquote><span id="cid:FB725E85-C4F5-4AE5-83D8-E2B091D034A1@apple.com"><fail.ll></span></div></blockquote></div><br class=""></div></div></div></div></body></html>