[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?
Friedman, Eli via llvm-dev
llvm-dev at lists.llvm.org
Thu Aug 16 16:34:55 PDT 2018
In general, the backedge-taken count is an unsigned value; it's possible
to write a loop with a trip count of 2^64 using a 64-bit induction
variable. To prove your loop has a "small" trip count, you have to use
either the guard or the nsw/nuw markings on the induction variable.
-Eli
On 8/16/2018 4:09 PM, Alexandre Isoard wrote:
> Ok.
>
> To go back to the original issue, would it be meaningful to add a
> SCEVUMax(0, BTC) on the final BTC computed by SCEV?
>
> So that it does not use "negative values"?
>
> On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli <efriedma at codeaurora.org
> <mailto:efriedma at codeaurora.org>> wrote:
>
> On 8/15/2018 2:27 PM, Alexandre Isoard wrote:
>> I'm not sure I understand the poison/undef/UB distinctions.
>>
>> But on this example:
>>
>> define i32 @func(i1 zeroext %b, i32 %x, i32 %y) {
>> entry:
>> %adds = add nsw i32 %x, %y
>> %addu = add nuw i32 %x, %y
>> %cond = select i1 %b, i32 %adds, i32 %addu
>> ret i32 %cond
>> }
>>
>>
>> It is important to not propagate the nsw/nuw between the two SCEV
>> expressions (which unification would do today, can I consider
>> that a bug or is it a feature?).
>
> It's an intentional design choice.
>
>> So we work-around it by not informing SCEV of the flags:
>>
>> Printing analysis 'Scalar Evolution Analysis' for function
>> 'func':
>> Classifying expressions for: @func
>> %adds = add nsw i32 %x, %y
>> --> (%x + %y) U: full-set S: full-set
>> %addu = add nuw i32 %x, %y
>> --> (%x + %y) U: full-set S: full-set
>> %cond = select i1 %b, i32 %adds, i32 %addu
>> --> %cond U: full-set S: full-set
>> Determining loop execution counts for: @func
>>
>>
>> Would there be problems if we properly considered nuw/nsw flags
>> when unifying SCEVs?
>
> There would be other consequences. For example, `(%x + %y)<nsw>`
> and `(%x + %y)<nuw>` wouldn't compare equal for other
> simplifications, and all the places that call setNoWrapFlags would
> have to be rewritten. It's probably possible to come up with some
> workable design, but nobody has actually tried it, so it's not
> clear how much work it would be to implement, or whether it would
> improve the generated code overall.
>
> -Eli
>
>>
>> On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli
>> <efriedma at codeaurora.org <mailto:efriedma at codeaurora.org>> wrote:
>>
>> On 8/15/2018 1:31 PM, Alexandre Isoard wrote:
>>> Is that why we do not deduce +<nsw> from "add nsw" either?
>>
>> Essentially, yes.
>>
>>> Is that an intrinsic limitation of creating a
>>> context-invariant expressions from a Value* or is that a
>>> limitation of our implementation (our unification not
>>> considering the nsw flags)?
>>
>> It's a consequence of unification not considering nsw. (nsw
>> on an instruction is naturally invariant because violating
>> nsw produces poison, not UB.)
>>
>> -Eli
>>
>>>
>>> On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli
>>> <efriedma at codeaurora.org <mailto:efriedma at codeaurora.org>>
>>> wrote:
>>>
>>> On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote:
>>>> Hello,
>>>>
>>>> If I run clang on the following code:
>>>>
>>>> void func(unsigned n) {
>>>> for (unsigned long x = 1; x < n; ++x)
>>>> dummy(x);
>>>> }
>>>>
>>>>
>>>> I get the following llvm ir:
>>>>
>>>> define void @func(i32 %n) {
>>>> entry:
>>>> %conv = zext i32 %n to i64
>>>> %cmp5 = icmp ugt i32 %n, 1
>>>> br i1 %cmp5, label %for.body, label %for.cond.cleanup
>>>> for.cond.cleanup: ; preds =
>>>> %for.body, %entry
>>>> ret void
>>>> for.body: ; preds = %entry, %for.body
>>>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ]
>>>> tail call void @dummy(i64 %x.06) #2
>>>> %inc = add nuw nsw i64 %x.06, 1
>>>> %exitcond = icmp eq i64 %inc, %conv
>>>> br i1 %exitcond, label %for.cond.cleanup, label
>>>> %for.body
>>>> }
>>>>
>>>>
>>>> Over which, SCEV will provide the following analysis:
>>>>
>>>> Printing analysis 'Scalar Evolution Analysis' for
>>>> function 'func':
>>>> Classifying expressions for: @func
>>>> %conv = zext i32 %n to i64
>>>> --> (zext i32 %n to i64) U: [0,4294967296) S:
>>>> [0,4294967296)
>>>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ]
>>>> --> {1,+,1}<nuw><nsw><%for.body> U:
>>>> [1,-9223372036854775808) S:
>>>> [1,-9223372036854775808)Exits: (-1 + (zext i32 %n
>>>> to i64))LoopDispositions: { %for.body: Computable }
>>>> %inc = add nuw nsw i64 %x.06, 1
>>>> --> {2,+,1}<nuw><%for.body> U: [2,0) S:
>>>> [2,0)Exits: (zext i32 %n to i64)LoopDispositions: {
>>>> %for.body: Computable }
>>>> Determining loop execution counts for: @func
>>>> Loop %for.body: backedge-taken count is (-2 + (zext
>>>> i32 %n to i64))<nsw>
>>>> Loop %for.body: max backedge-taken count is -2
>>>> Loop %for.body: Predicated backedge-taken count is
>>>> (-2 + (zext i32 %n to i64))<nsw>
>>>> Predicates:
>>>> Loop %for.body: Trip multiple is 1
>>>>
>>>>
>>>> Now, I was surprised by the max backedge-taken count
>>>> being -2, and I suspect it is due to the backedge-taken
>>>> count being marked as <nsw> instead of <nuw>.
>>>>
>>>> Is that on purpose, is that a bug, or is my analysis
>>>> incorrect? I am not sure where to fix that issue.
>>>
>>> The backedge-taken count isn't nuw because nsw/nuw
>>> markings aren't flow-sensitive: there isn't any way to
>>> mark the trip count as nuw without marking every
>>> computation of `(long)n-2` as nuw.
>>>
>>> There's some code in ScalarEvolution::howFarToZero to
>>> try to refine the max backedge-taken count in some
>>> cases, but it isn't very general. See
>>> https://reviews.llvm.org/D28536 .
>>>
>>> -Eli
>>>
>>> --
>>> Employee of Qualcomm Innovation Center, Inc.
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>
>>>
>>>
>>> --
>>> *Alexandre Isoard*
>>
>>
>> --
>> Employee of Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>
>>
>>
>> --
>> *Alexandre Isoard*
>
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>
>
>
> --
> *Alexandre Isoard*
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180816/bd87ff06/attachment.html>
More information about the llvm-dev
mailing list