<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Sanjoy,<div class=""><br class=""></div><div class="">This change introduced (or exposed) some assertion failures. Could you please take a look? I’ll try too, but you probably could find the rootcause faster.</div><div class=""><br class=""></div><div class="">Reproducer:</div><div class=""><span style="font-family: Menlo;" class="">> cat fail.c</span></div><div class=""><font face="Menlo" class="">typedef unsigned long long uint64_t;</font></div><div class=""><font face="Menlo" class="">uint64_t foo(unsigned char *a, unsigned l) {</font></div><div class=""><font face="Menlo" class=""> uint64_t Val = 0;</font></div><div class=""><font face="Menlo" class=""> for (unsigned i = 0; i != l; ++i)</font></div><div class=""><font face="Menlo" class=""> Val |= (uint64_t)a[i] << i*8;</font></div><div class=""><font face="Menlo" class=""> return Val;</font></div><div class=""><font face="Menlo" class="">}</font></div><div class=""><font face="Menlo" class=""><br class=""></font></div><div class=""><font face="Menlo" class="">> clang -target i686 -march=yonah -O3 -c -gdwarf-2 fail.c</font></div><div class=""><div class=""><font face="Menlo" class="">Assertion failed: (Amt && "Expected zero shifts to be already optimized away."), function ExpandShiftByConstant, file …llvm.git/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp, line 1420.</font></div><div class=""><font face="Menlo" class="">0 clang-3.7 0x000000010d6b2e0e llvm::sys::PrintStackTrace(__sFILE*) + 46</font></div><div class=""><font face="Menlo" class="">1 clang-3.7 0x000000010d6b41cb PrintStackTraceSignalHandler(void*) + 27</font></div><div class=""><font face="Menlo" class="">2 clang-3.7 0x000000010d6b4615 SignalHandler(int) + 565</font></div><div class=""><font face="Menlo" class="">3 libsystem_platform.dylib 0x00007fff95515f1a _sigtramp + 26</font></div><div class=""><font face="Menlo" class="">4 clang-3.7 0x0000000110668d68 guard variable for isAllowedIDChar(unsigned int, clang::LangOptions const&)::C99AllowedIDChars + 97488</font></div><div class=""><font face="Menlo" class="">5 clang-3.7 0x000000010d6b41fb raise + 27</font></div><div class=""><font face="Menlo" class="">6 clang-3.7 0x000000010d6b42b2 abort + 18</font></div><div class=""><font face="Menlo" class="">7 clang-3.7 0x000000010d6b4291 __assert_rtn + 129</font></div><div class=""><font face="Menlo" class="">8 clang-3.7 0x000000010dd3e943 llvm::DAGTypeLegalizer::ExpandShiftByConstant(llvm::SDNode*, unsigned int, llvm::SDValue&, llvm::SDValue&) + 163</font></div><div class=""><font face="Menlo" class="">9 clang-3.7 0x000000010dd3b550 llvm::DAGTypeLegalizer::ExpandIntRes_Shift(llvm::SDNode*, llvm::SDValue&, llvm::SDValue&) + 240</font></div><div class=""><font face="Menlo" class="">10 clang-3.7 0x000000010dd311a5 llvm::DAGTypeLegalizer::ExpandIntegerResult(llvm::SDNode*, unsigned int) + 3221</font></div></div><div class=""><br class=""></div><div class="">It works for r229730 and fails for r229731.</div><div class=""><br class=""></div><div class="">Thanks,</div><div class="">Michael</div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Feb 18, 2015, at 11:32 AM, Sanjoy Das <<a href="mailto:sanjoy@playingwithpointers.com" class="">sanjoy@playingwithpointers.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">Author: sanjoy<br class="">Date: Wed Feb 18 13:32:25 2015<br class="">New Revision: 229731<br class=""><br class="">URL: <a href="http://llvm.org/viewvc/llvm-project?rev=229731&view=rev" class="">http://llvm.org/viewvc/llvm-project?rev=229731&view=rev</a><br class="">Log:<br class="">Partial fix for bug 22589<br class=""><br class="">Don't spend the entire iteration space in the scalar loop prologue if<br class="">computing the trip count overflows. This change also gets rid of the<br class="">backedge check in the prologue loop and the extra check for<br class="">overflowing trip-count.<br class=""><br class="">Differential Revision: <a href="http://reviews.llvm.org/D7715" class="">http://reviews.llvm.org/D7715</a><br class=""><br class=""><br class="">Modified:<br class=""> llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp<br class=""> llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll<br class=""> llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll<br class=""> llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll<br class=""><br class="">Modified: llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp?rev=229731&r1=229730&r2=229731&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp?rev=229731&r1=229730&r2=229731&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp (original)<br class="">+++ llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp Wed Feb 18 13:32:25 2015<br class="">@@ -58,7 +58,7 @@ STATISTIC(NumRuntimeUnrolled,<br class=""> /// - Branch around the original loop if the trip count is less<br class=""> /// than the unroll factor.<br class=""> ///<br class="">-static void ConnectProlog(Loop *L, Value *TripCount, unsigned Count,<br class="">+static void ConnectProlog(Loop *L, Value *BECount, unsigned Count,<br class=""> BasicBlock *LastPrologBB, BasicBlock *PrologEnd,<br class=""> BasicBlock *OrigPH, BasicBlock *NewPH,<br class=""> ValueToValueMapTy &VMap, AliasAnalysis *AA,<br class="">@@ -109,12 +109,19 @@ static void ConnectProlog(Loop *L, Value<br class=""> }<br class=""> }<br class=""><br class="">- // Create a branch around the orignal loop, which is taken if the<br class="">- // trip count is less than the unroll factor.<br class="">+ // Create a branch around the orignal loop, which is taken if there are no<br class="">+ // iterations remaining to be executed after running the prologue.<br class=""> Instruction *InsertPt = PrologEnd->getTerminator();<br class="">+<br class="">+ assert(Count != 0 && "nonsensical Count!");<br class="">+<br class="">+ // If BECount <u (Count - 1) then (BECount + 1) & (Count - 1) == (BECount + 1)<br class="">+ // (since Count is a power of 2). This means %xtraiter is (BECount + 1) and<br class="">+ // and all of the iterations of this loop were executed by the prologue. Note<br class="">+ // that if BECount <u (Count - 1) then (BECount + 1) cannot unsigned-overflow.<br class=""> Instruction *BrLoopExit =<br class="">- new ICmpInst(InsertPt, ICmpInst::ICMP_ULT, TripCount,<br class="">- ConstantInt::get(TripCount->getType(), Count));<br class="">+ new ICmpInst(InsertPt, ICmpInst::ICMP_ULT, BECount,<br class="">+ ConstantInt::get(BECount->getType(), Count - 1));<br class=""> BasicBlock *Exit = L->getUniqueExitBlock();<br class=""> assert(Exit && "Loop must have a single exit block only");<br class=""> // Split the exit to maintain loop canonicalization guarantees<br class="">@@ -291,23 +298,28 @@ bool llvm::UnrollRuntimeLoopProlog(Loop<br class=""><br class=""> // Only unroll loops with a computable trip count and the trip count needs<br class=""> // to be an int value (allowing a pointer type is a TODO item)<br class="">- const SCEV *BECount = SE->getBackedgeTakenCount(L);<br class="">- if (isa<SCEVCouldNotCompute>(BECount) || !BECount->getType()->isIntegerTy())<br class="">+ const SCEV *BECountSC = SE->getBackedgeTakenCount(L);<br class="">+ if (isa<SCEVCouldNotCompute>(BECountSC) ||<br class="">+ !BECountSC->getType()->isIntegerTy())<br class=""> return false;<br class=""><br class="">- // If BECount is INT_MAX, we can't compute trip-count without overflow.<br class="">- if (BECount->isAllOnesValue())<br class="">- return false;<br class="">+ unsigned BEWidth = cast<IntegerType>(BECountSC->getType())->getBitWidth();<br class=""><br class=""> // Add 1 since the backedge count doesn't include the first loop iteration<br class=""> const SCEV *TripCountSC =<br class="">- SE->getAddExpr(BECount, SE->getConstant(BECount->getType(), 1));<br class="">+ SE->getAddExpr(BECountSC, SE->getConstant(BECountSC->getType(), 1));<br class=""> if (isa<SCEVCouldNotCompute>(TripCountSC))<br class=""> return false;<br class=""><br class=""> // We only handle cases when the unroll factor is a power of 2.<br class=""> // Count is the loop unroll factor, the number of extra copies added + 1.<br class="">- if ((Count & (Count-1)) != 0)<br class="">+ if (!isPowerOf2_32(Count))<br class="">+ return false;<br class="">+<br class="">+ // This constraint lets us deal with an overflowing trip count easily; see the<br class="">+ // comment on ModVal below. This check is equivalent to `Log2(Count) <<br class="">+ // BEWidth`.<br class="">+ if (static_cast<uint64_t>(Count) > (1ULL << BEWidth))<br class=""> return false;<br class=""><br class=""> // If this loop is nested, then the loop unroller changes the code in<br class="">@@ -333,16 +345,23 @@ bool llvm::UnrollRuntimeLoopProlog(Loop<br class=""> SCEVExpander Expander(*SE, "loop-unroll");<br class=""> Value *TripCount = Expander.expandCodeFor(TripCountSC, TripCountSC->getType(),<br class=""> PreHeaderBR);<br class="">+ Value *BECount = Expander.expandCodeFor(BECountSC, BECountSC->getType(),<br class="">+ PreHeaderBR);<br class=""><br class=""> IRBuilder<> B(PreHeaderBR);<br class=""> Value *ModVal = B.CreateAnd(TripCount, Count - 1, "xtraiter");<br class=""><br class="">- // Check if for no extra iterations, then jump to cloned/unrolled loop.<br class="">- // We have to check that the trip count computation didn't overflow when<br class="">- // adding one to the backedge taken count.<br class="">- Value *LCmp = B.CreateIsNotNull(ModVal, "lcmp.mod");<br class="">- Value *OverflowCheck = B.CreateIsNull(TripCount, "lcmp.overflow");<br class="">- Value *BranchVal = B.CreateOr(OverflowCheck, LCmp, "lcmp.or");<br class="">+ // If ModVal is zero, we know that either<br class="">+ // 1. there are no iteration to be run in the prologue loop<br class="">+ // OR<br class="">+ // 2. the addition computing TripCount overflowed<br class="">+ //<br class="">+ // If (2) is true, we know that TripCount really is (1 << BEWidth) and so the<br class="">+ // number of iterations that remain to be run in the original loop is a<br class="">+ // multiple Count == (1 << Log2(Count)) because Log2(Count) <= BEWidth (we<br class="">+ // explicitly check this above).<br class="">+<br class="">+ Value *BranchVal = B.CreateIsNotNull(ModVal, "lcmp.mod");<br class=""><br class=""> // Branch to either the extra iterations or the cloned/unrolled loop<br class=""> // We will fix up the true branch label when adding loop body copies<br class="">@@ -365,10 +384,7 @@ bool llvm::UnrollRuntimeLoopProlog(Loop<br class=""> std::vector<BasicBlock *> NewBlocks;<br class=""> ValueToValueMapTy VMap;<br class=""><br class="">- // If unroll count is 2 and we can't overflow in tripcount computation (which<br class="">- // is BECount + 1), then we don't need a loop for prologue, and we can unroll<br class="">- // it. We can be sure that we don't overflow only if tripcount is a constant.<br class="">- bool UnrollPrologue = (Count == 2 && isa<ConstantInt>(TripCount));<br class="">+ bool UnrollPrologue = Count == 2;<br class=""><br class=""> // Clone all the basic blocks in the loop. If Count is 2, we don't clone<br class=""> // the loop, otherwise we create a cloned loop to execute the extra<br class="">@@ -394,7 +410,7 @@ bool llvm::UnrollRuntimeLoopProlog(Loop<br class=""> // Connect the prolog code to the original loop and update the<br class=""> // PHI functions.<br class=""> BasicBlock *LastLoopBB = cast<BasicBlock>(VMap[Latch]);<br class="">- ConnectProlog(L, TripCount, Count, LastLoopBB, PEnd, PH, NewPH, VMap,<br class="">+ ConnectProlog(L, BECount, Count, LastLoopBB, PEnd, PH, NewPH, VMap,<br class=""> /*AliasAnalysis*/ nullptr, DT, LI, LPM->getAsPass());<br class=""> NumRuntimeUnrolled++;<br class=""> return true;<br class=""><br class="">Modified: llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll?rev=229731&r1=229730&r2=229731&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll?rev=229731&r1=229730&r2=229731&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll (original)<br class="">+++ llvm/trunk/test/Transforms/LoopUnroll/runtime-loop.ll Wed Feb 18 13:32:25 2015<br class="">@@ -4,9 +4,7 @@<br class=""><br class=""> ; CHECK: %xtraiter = and i32 %n<br class=""> ; CHECK: %lcmp.mod = icmp ne i32 %xtraiter, 0<br class="">-; CHECK: %lcmp.overflow = icmp eq i32 %n, 0<br class="">-; CHECK: %lcmp.or = or i1 %lcmp.overflow, %lcmp.mod<br class="">-; CHECK: br i1 %lcmp.or, label %for.body.prol, label %for.body.preheader.split<br class="">+; CHECK: br i1 %lcmp.mod, label %for.body.prol, label %for.body.preheader.split<br class=""><br class=""> ; CHECK: for.body.prol:<br class=""> ; CHECK: %indvars.iv.prol = phi i64 [ %indvars.iv.next.prol, %for.body.prol ], [ 0, %for.body.preheader ]<br class=""><br class="">Modified: llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll?rev=229731&r1=229730&r2=229731&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll?rev=229731&r1=229730&r2=229731&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll (original)<br class="">+++ llvm/trunk/test/Transforms/LoopUnroll/runtime-loop1.ll Wed Feb 18 13:32:25 2015<br class="">@@ -3,7 +3,7 @@<br class=""> ; This tests that setting the unroll count works<br class=""><br class=""> ; CHECK: for.body.prol:<br class="">-; CHECK: br i1 %prol.iter.cmp, label %for.body.prol, label %for.body.preheader.split<br class="">+; CHECK: br label %for.body.preheader.split<br class=""> ; CHECK: for.body:<br class=""> ; CHECK: br i1 %exitcond.1, label %for.end.loopexit.unr-lcssa, label %for.body<br class=""> ; CHECK-NOT: br i1 %exitcond.4, label %for.end.loopexit{{.*}}, label %for.body<br class=""><br class="">Modified: llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll?rev=229731&r1=229730&r2=229731&view=diff" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll?rev=229731&r1=229730&r2=229731&view=diff</a><br class="">==============================================================================<br class="">--- llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll (original)<br class="">+++ llvm/trunk/test/Transforms/LoopUnroll/tripcount-overflow.ll Wed Feb 18 13:32:25 2015<br class="">@@ -1,19 +1,28 @@<br class=""> ; RUN: opt < %s -S -unroll-runtime -unroll-count=2 -loop-unroll | FileCheck %s<br class=""> target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"<br class=""><br class="">-; When prologue is fully unrolled, the branch on its end is unconditional.<br class="">-; Unrolling it is illegal if we can't prove that trip-count+1 doesn't overflow,<br class="">-; like in this example, where it comes from an argument.<br class="">-;<br class="">-; This test is based on an example from here:<br class="">-; <a href="http://stackoverflow.com/questions/23838661/why-is-clang-optimizing-this-code-out" class="">http://stackoverflow.com/questions/23838661/why-is-clang-optimizing-this-code-out</a><br class="">-;<br class="">+; This test case documents how runtime loop unrolling handles the case<br class="">+; when the backedge-count is -1.<br class="">+<br class="">+; If %N, the backedge-taken count, is -1 then %0 unsigned-overflows<br class="">+; and is 0. %xtraiter too is 0, signifying that the total trip-count<br class="">+; is divisible by 2. The prologue then branches to the unrolled loop<br class="">+; and executes the 2^32 iterations there, in groups of 2.<br class="">+<br class="">+<br class="">+; CHECK: entry:<br class="">+; CHECK-NEXT: %0 = add i32 %N, 1<br class="">+; CHECK-NEXT: %xtraiter = and i32 %0, 1<br class="">+; CHECK-NEXT: %lcmp.mod = icmp ne i32 %xtraiter, 0<br class="">+; CHECK-NEXT: br i1 %lcmp.mod, label %while.body.prol, label %entry.split<br class="">+<br class=""> ; CHECK: while.body.prol:<br class="">-; CHECK: br i1<br class="">+; CHECK: br label %entry.split<br class="">+<br class=""> ; CHECK: entry.split:<br class=""><br class=""> ; Function Attrs: nounwind readnone ssp uwtable<br class="">-define i32 @foo(i32 %N) #0 {<br class="">+define i32 @foo(i32 %N) {<br class=""> entry:<br class=""> br label %while.body<br class=""><br class="">@@ -26,5 +35,3 @@ while.body:<br class=""> while.end: ; preds = %while.body<br class=""> ret i32 %i<br class=""> }<br class="">-<br class="">-attributes #0 = { nounwind readnone ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }<br class=""><br class=""><br class="">_______________________________________________<br class="">llvm-commits mailing list<br class=""><a href="mailto:llvm-commits@cs.uiuc.edu" class="">llvm-commits@cs.uiuc.edu</a><br class="">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits<br class=""></div></blockquote></div><br class=""></div></body></html>