[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?

Thu Aug 16 17:23:24 PDT 2018

Well, like I said earlier, it's usually viable to check for a guard, 
along the lines of https://reviews.llvm.org/D28536 (or maybe implement 
some context-sensitive range analysis algorithm to use everywhere).

The alternative is to try to do something to preserve the nowrap flags 
we have on the input to the icmp, in 
ScalarEvolution::computeExitLimitFromICmp.

-Eli

On 8/16/2018 4:54 PM, Alexandre Isoard wrote:
> The loop exits iff {2,+,1}<nuw><%for.body> == (zext i32 %n to i64)
>
> The nuw marking on the "induction variable" should be sufficient to 
> deduce a max loop trip count of 2^32.
> But I do not know how we compute it (we build a database and it is 
> contrived to follow, at least to me).
>
> I saw that we annotate it with <nsw> (which is correct and can be (and 
> probably has been) deduced from the ranges) and following our 
> discussion, we can't annotate it with <nuw> as per limitation of our 
> unification.
>
> So, I am kind of in a bind here...
>
> On Thu, Aug 16, 2018 at 4:34 PM Friedman, Eli <efriedma at codeaurora.org 
> <mailto:efriedma at codeaurora.org>> wrote:
>
>     In general, the backedge-taken count is an unsigned value; it's
>     possible to write a loop with a trip count of 2^64 using a 64-bit
>     induction variable.  To prove your loop has a "small" trip count,
>     you have to use either the guard or the nsw/nuw markings on the
>     induction variable.
>
>     -Eli
>
>     On 8/16/2018 4:09 PM, Alexandre Isoard wrote:
>>     Ok.
>>
>>     To go back to the original issue, would it be meaningful to add a
>>     SCEVUMax(0, BTC) on the final BTC computed by SCEV?
>>
>>     So that it does not use "negative values"?
>>
>>     On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli
>>     <efriedma at codeaurora.org <mailto:efriedma at codeaurora.org>> wrote:
>>
>>         On 8/15/2018 2:27 PM, Alexandre Isoard wrote:
>>>         I'm not sure I understand the poison/undef/UB distinctions.
>>>
>>>         But on this example:
>>>
>>>             define i32 @func(i1 zeroext %b, i32 %x, i32 %y) {
>>>             entry:
>>>               %adds = add nsw i32 %x, %y
>>>               %addu = add nuw i32 %x, %y
>>>               %cond = select i1 %b, i32 %adds, i32 %addu
>>>               ret i32 %cond
>>>             }
>>>
>>>
>>>         It is important to not propagate the nsw/nuw between the two
>>>         SCEV expressions (which unification would do today, can I
>>>         consider that a bug or is it a feature?).
>>
>>         It's an intentional design choice.
>>
>>>         So we work-around it by not informing SCEV of the flags:
>>>
>>>             Printing analysis 'Scalar Evolution Analysis' for
>>>             function 'func':
>>>             Classifying expressions for: @func
>>>               %adds = add nsw i32 %x, %y
>>>               -->  (%x + %y) U: full-set S: full-set
>>>               %addu = add nuw i32 %x, %y
>>>               -->  (%x + %y) U: full-set S: full-set
>>>               %cond = select i1 %b, i32 %adds, i32 %addu
>>>               -->  %cond U: full-set S: full-set
>>>             Determining loop execution counts for: @func
>>>
>>>
>>>         Would there be problems if we properly considered nuw/nsw
>>>         flags when unifying SCEVs?
>>
>>         There would be other consequences.  For example, `(%x +
>>         %y)<nsw>` and `(%x + %y)<nuw>` wouldn't compare equal for
>>         other simplifications, and all the places that call
>>         setNoWrapFlags would have to be rewritten.  It's probably
>>         possible to come up with some workable design, but nobody has
>>         actually tried it, so it's not clear how much work it would
>>         be to implement, or whether it would improve the generated
>>         code overall.
>>
>>         -Eli
>>
>>>
>>>         On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli
>>>         <efriedma at codeaurora.org <mailto:efriedma at codeaurora.org>>
>>>         wrote:
>>>
>>>             On 8/15/2018 1:31 PM, Alexandre Isoard wrote:
>>>>             Is that why we do not deduce +<nsw> from "add nsw" either?
>>>
>>>             Essentially, yes.
>>>
>>>>             Is that an intrinsic limitation of creating a
>>>>             context-invariant expressions from a Value* or is that
>>>>             a limitation of our implementation (our unification not
>>>>             considering the nsw flags)?
>>>
>>>             It's a consequence of unification not considering nsw. 
>>>             (nsw on an instruction is naturally invariant because
>>>             violating nsw produces poison, not UB.)
>>>
>>>             -Eli
>>>
>>>>
>>>>             On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli
>>>>             <efriedma at codeaurora.org
>>>>             <mailto:efriedma at codeaurora.org>> wrote:
>>>>
>>>>                 On 8/15/2018 12:21 PM, Alexandre Isoard via
>>>>                 llvm-dev wrote:
>>>>>                 Hello,
>>>>>
>>>>>                 If I run clang on the following code:
>>>>>
>>>>>                     void func(unsigned n) {
>>>>>                       for (unsigned long x = 1; x < n; ++x)
>>>>>                         dummy(x);
>>>>>                     }
>>>>>
>>>>>
>>>>>                 I get the following llvm ir:
>>>>>
>>>>>                     define void @func(i32 %n) {
>>>>>                     entry:
>>>>>                       %conv = zext i32 %n to i64
>>>>>                       %cmp5 = icmp ugt i32 %n, 1
>>>>>                       br i1 %cmp5, label %for.body, label
>>>>>                     %for.cond.cleanup
>>>>>                     for.cond.cleanup:                        ;
>>>>>                     preds = %for.body, %entry
>>>>>                       ret void
>>>>>                     for.body:                        ; preds =
>>>>>                     %entry, %for.body
>>>>>                       %x.06 = phi i64 [ %inc, %for.body ], [ 1,
>>>>>                     %entry ]
>>>>>                       tail call void @dummy(i64 %x.06) #2
>>>>>                       %inc = add nuw nsw i64 %x.06, 1
>>>>>                       %exitcond = icmp eq i64 %inc, %conv
>>>>>                       br i1 %exitcond, label %for.cond.cleanup,
>>>>>                     label %for.body
>>>>>                     }
>>>>>
>>>>>
>>>>>                 Over which, SCEV will provide the following analysis:
>>>>>
>>>>>                     Printing analysis 'Scalar Evolution Analysis'
>>>>>                     for function 'func':
>>>>>                     Classifying expressions for: @func
>>>>>                       %conv = zext i32 %n to i64
>>>>>                       -->  (zext i32 %n to i64) U: [0,4294967296)
>>>>>                     S: [0,4294967296)
>>>>>                       %x.06 = phi i64 [ %inc, %for.body ], [ 1,
>>>>>                     %entry ]
>>>>>                       --> {1,+,1}<nuw><nsw><%for.body> U:
>>>>>                     [1,-9223372036854775808) S:
>>>>>                     [1,-9223372036854775808)Exits: (-1 + (zext i32
>>>>>                     %n to i64))LoopDispositions: { %for.body:
>>>>>                     Computable }
>>>>>                       %inc = add nuw nsw i64 %x.06, 1
>>>>>                       --> {2,+,1}<nuw><%for.body> U: [2,0) S:
>>>>>                     [2,0)Exits: (zext i32 %n to
>>>>>                     i64)LoopDispositions: { %for.body: Computable }
>>>>>                     Determining loop execution counts for: @func
>>>>>                     Loop %for.body: backedge-taken count is (-2 +
>>>>>                     (zext i32 %n to i64))<nsw>
>>>>>                     Loop %for.body: max backedge-taken count is -2
>>>>>                     Loop %for.body: Predicated backedge-taken
>>>>>                     count is (-2 + (zext i32 %n to i64))<nsw>
>>>>>                      Predicates:
>>>>>                     Loop %for.body: Trip multiple is 1
>>>>>
>>>>>
>>>>>                 Now, I was surprised by the max backedge-taken
>>>>>                 count being -2, and I suspect it is due to the
>>>>>                 backedge-taken count being marked as <nsw> instead
>>>>>                 of <nuw>.
>>>>>
>>>>>                 Is that on purpose, is that a bug, or is my
>>>>>                 analysis incorrect? I am not sure where to fix
>>>>>                 that issue.
>>>>
>>>>                 The backedge-taken count isn't nuw because nsw/nuw
>>>>                 markings aren't flow-sensitive: there isn't any way
>>>>                 to mark the trip count as nuw without marking every
>>>>                 computation of `(long)n-2` as nuw.
>>>>
>>>>                 There's some code in ScalarEvolution::howFarToZero
>>>>                 to try to refine the max backedge-taken count in
>>>>                 some cases, but it isn't very general.  See
>>>>                 https://reviews.llvm.org/D28536 .
>>>>
>>>>                 -Eli
>>>>
>>>>                 -- 
>>>>                 Employee of Qualcomm Innovation Center, Inc.
>>>>                 Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>
>>>>
>>>>
>>>>             -- 
>>>>             *Alexandre Isoard*
>>>
>>>
>>>             -- 
>>>             Employee of Qualcomm Innovation Center, Inc.
>>>             Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>
>>>
>>>
>>>         -- 
>>>         *Alexandre Isoard*
>>
>>
>>         -- 
>>         Employee of Qualcomm Innovation Center, Inc.
>>         Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>
>>
>>
>>     -- 
>>     *Alexandre Isoard*
>
>
>     -- 
>     Employee of Qualcomm Innovation Center, Inc.
>     Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>
>
>
> -- 
> *Alexandre Isoard*

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180816/d6e931a4/attachment.html>