[llvm] r233881 - Fix a bug indicated by -fsanitize=shift-exponent.

Fri Apr 3 20:22:29 PDT 2015

Ugh, looks like for BEWidth=64, this code was "working" purely by
accident.  Sorry for fat-fingering the comment!

The most straightforward way to fix this is to just test for what it
is supposed to test -- Justin is it possible for you to check if the
following patch fixes the performance problem?

diff --git a/lib/Transforms/Utils/LoopUnrollRuntime.cpp
b/lib/Transforms/Utils/LoopUnrollRuntime.cpp
index 46570a1..c8d4782 100644
--- a/lib/Transforms/Utils/LoopUnrollRuntime.cpp
+++ b/lib/Transforms/Utils/LoopUnrollRuntime.cpp
@@ -318,9 +318,8 @@ bool llvm::UnrollRuntimeLoopProlog(Loop *L,
unsigned Count, LoopInfo *LI,
     return false;

   // This constraint lets us deal with an overflowing trip count
easily; see the
-  // comment on ModVal below.  This check is equivalent to `Log2(Count) <
-  // BEWidth`.
-  if (BEWidth < 64 && static_cast<uint64_t>(Count) > (1ULL << BEWidth))
+  // comment on ModVal below.
+  if (Log2_32(Count) > BEWidth)
     return false;

   // If this loop is nested, then the loop unroller changes the code in

On Fri, Apr 3, 2015 at 5:58 PM, Alexey Samsonov <vonosmas at gmail.com> wrote:
> +Sanjoy who probably knows more about this code.
>
> On Fri, Apr 3, 2015 at 5:57 PM, Alexey Samsonov <vonosmas at gmail.com> wrote:
>>
>>
>> On Fri, Apr 3, 2015 at 5:46 PM, Justin Bogner <mail at justinbogner.com>
>> wrote:
>>>
>>> Alexey Samsonov <vonosmas at gmail.com> writes:
>>> > Hi Justin,
>>> >
>>> > On Fri, Apr 3, 2015 at 5:05 PM, Justin Bogner <mail at justinbogner.com>
>>> > wrote:
>>> >
>>> >     Alexey Samsonov <vonosmas at gmail.com> writes:
>>> >     > Author: samsonov
>>> >     > Date: Wed Apr  1 20:30:10 2015
>>> >     > New Revision: 233881
>>> >     >
>>> >     > URL: http://llvm.org/viewvc/llvm-project?rev=233881&view=rev
>>> >     > Log:
>>> >     > Fix a bug indicated by -fsanitize=shift-exponent.
>>> >
>>> >     I noticed a significant performance regression
>>> > Benchmarks/Shootout/sieve
>>> >     after this change. I suspect this isn't quite the right fix for the
>>> >     undefined behaviour.
>>> >
>>> >     You can see the performance jump in lnt here:
>>> >
>>> >
>>> > http://llvm-lnt.herokuapp.com/db_default/v4/nts/graph?plot.0=3.794.3&
>>> >     highlight_run=9976
>>> >
>>> >     There's a big regression between r233879 and r233882. This is the
>>> > only
>>> >     interesting change in that range.
>>> >
>>> >     > Modified:
>>> >     >     llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp
>>> >     >
>>> >     > Modified: llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp
>>> >     > URL:
>>> >     >
>>> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/
>>> >     LoopUnrollRuntime.cpp?rev=233881&r1=233880&r2=233881&view=diff
>>> >     >
>>> > ========================================================================
>>> >     ======
>>> >     > --- llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp
>>> > (original)
>>> >     > +++ llvm/trunk/lib/Transforms/Utils/LoopUnrollRuntime.cpp Wed Apr
>>> > 1
>>> >     20:30:10 2015
>>> >     > @@ -320,7 +320,7 @@ bool llvm::UnrollRuntimeLoopProlog(Loop
>>> >     >    // This constraint lets us deal with an overflowing trip count
>>> >     easily; see the
>>> >     >    // comment on ModVal below.  This check is equivalent to
>>> > `Log2(Count)
>>> >     <
>>> >     >    // BEWidth`.
>>> >     > -  if (static_cast<uint64_t>(Count) > (1ULL << BEWidth))
>>> >     > +  if (BEWidth < 64 && static_cast<uint64_t>(Count) > (1ULL <<
>>> > BEWidth))
>>> >
>>> >     To match what the comment says we're doing, I guess we want:
>>> >
>>> >       if (BEWidth >= 64 || static_cast<uint64_t>(Count) > (1ULL <<
>>> > BEWidth))
>>> >
>>> >     since Log2(Count) is guaranteed to be less than 64.
>>> >
>>> > ... but if BEWidth is 64, than Count can never be greater than 1ULL <<
>>> > BEWidth, and we should *not* return false.
>>> > Note that the comment doesn't match the current code.
>>>
>>> Oh, the code says greater, where the comment says less. Strange. I
>>> certainly meant to use less in my suggested change, but I misread and
>>> thought the current code was using less as well.
>>>
>>> In any case, it seems whatever we ended up doing when we hit the
>>> undefined behaviour generated faster code for the benchmark ;)
>>
>>
>> Heh, if BEWidth is 64 than (1ULL << BEWidth) will overflow and will likely
>> be just 1
>> (at least that's how gcc and clang behave on my machine), and we will
>> break from the
>> function and return false, instead of doing the actual unrolling below.
>>
>>>
>>>
>>> >
>>> >     >      return false;
>>> >     >
>>> >     >    // If this loop is nested, then the loop unroller changes the
>>> > code in
>>> >     >
>>> >     >
>>> >     > _______________________________________________
>>> >     > llvm-commits mailing list
>>> >     > llvm-commits at cs.uiuc.edu
>>> >     > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> >
>>> > --
>>> > Alexey Samsonov
>>> > vonosmas at gmail.com
>>
>>
>>
>>
>> --
>> Alexey Samsonov
>> vonosmas at gmail.com
>
>
>
>
> --
> Alexey Samsonov
> vonosmas at gmail.com