[LLVMdev] ScalarEvolution::createNodeForPHI

Thu Oct 3 11:06:26 PDT 2013

On 10/03/2013 06:55 PM, Andrew Trick wrote:
> 
> On Oct 3, 2013, at 7:15 AM, Michele Scandale <michele.scandale at gmail.com> wrote:
> 
>> On 10/03/2013 01:22 AM, Andrew Trick wrote:
>>>
>>> I’m not sure how to make sense of an NUW flag coming from a sub.
>>>
>>> NSW is just a misnomer for signed overflow. SCEV canonicalizes sub a,b to add a, (-b). Unfortunately, signed overflow is not the same thing for sub and add...
>>>
>>> sub nsw %a, %b != add nsw %a, (-1 * %b)
>>>
>>> sub nsw i32, -1, INT_MIN is true.
>>>
>>> add nsw i32, -1, (-1 * INT_MIN) is false.
>>>
>>> NSW flags just aren’t an effective way to tell SCEV about overflow. Ideally we could reason more generally about the loop’s undefined behavior. For example, I’d like to query the range of values that a variable can hold before the loop hits undefined behavior. I don’t know how to implement something like this yet. Ideas are welcome.
>>>
>>> Reading this code again, even the addition case looks too aggressive to me.
>>> See:
>>> llvm.org/PR17452 - SCEV createNodeForPHI is too optimistic about NSW.
>>>
>>
>> If the behavior is different between 'add nsw' and 'sub nsw' then the SCEV
>> canonicalization of 'sub' leads to incorrect results.
> 
> Creating a SCEV expression for a sub instruction drops the flags. It’s safe, but the flags are useless.
> 
>> In your example both result should be the same but the 'add' produces a poison
>> value. Is this what we really want?
> 
> By true/false in the example above I meant defined/undefined. Since we can't introduce undefined behavior when none previously existed, we drop the nsw flags during canonicalization. If we could prove that the second operand is > INT_MIN, then we could preserve the flags. But your example has variable stride.
> 
>> The case of an expression with mixed operator ('add' with and without nsw) flag
>> would require an explicit representation of the two operator making more complex
>> reassociation, simplification and reordering of the expression operands.
> 
> Yes, in general an attempt to preserve NSW flags would defeat the simple utility of SCEV. That’s why we drop the flags all over the place.
> 
>> Your proposed workaround of checking that the BEValue is an operation between
>> the PHINode Value and a loop invariant would be fine but it's still a workaround.
> 
> This fixes a correctness problem I noticed in the same bit of code (PR17452). My suggestion was to be conservative in preserving NSW. It’s not about missing optimizations.
> 
> I don’t have a concrete suggestion for harder problem of missing optimization. I can say that we need something better than embedding flags within SCEV expressions. PR16358 is a recent example of the general problem (there have been other unsolved cases that I can’t find now). Arnold made the point here that SCEV needs to infer the legal range of one expression from another.
> 
> My very abstract suggestion for future improvements are:
> - At the SCEV level, logic to deduce that some recurrences cannot wrap given a set of loop facts derived from the IR and represented independent from the SCEV expressions themselves.

I agree with this. I noticed that value ranges are computed independently from
the "context" of the instruction related to the SCEV: for loop analysis we may
exploit information from facts dominating the loop execution. Maybe it would be
easier to map to a given IR Value a set of SCEV depending on the DomTree node we
are considering (given a variable %x, an if-then-else on condition "%x slt 0",
in one branch the range for %x would be [1, INT_MAX] in the other [INT_MIN, 0]).

> - At the IR level, rely on some sort of trap-on-overflow intrinsic independent from the arithmetic operations that can be carried through canonicalization, reassociation etc.  This could be done today with existing overflow intrinsics, but modeling the trap would require a branch.

Why this? Do we need to handle also trapping arithmetic?

> I suggest filing a bug for your case and continuing discussion there. It may be related to PR12377: Loop trip count not calculated.

I've done a little rewrite of functions to compute trip count for LT and GT
comparisons. However that case is not handled because of the missing NSW flag in
the SCEV. So probably it would be better to proposed a patch with my
modification before.

-Michele