[llvm-dev] Improving SCEV's behavior around IR level no-wrap

Sun Aug 13 22:32:46 PDT 2017

Hi,

On Wed, Aug 9, 2017 at 12:27 PM, Chawla, Pankaj <pankaj.chawla at intel.com> wrote:
> I am not blocked by it. I was just wondering whether preserving no-wrap flags in SCEV can help produce better code using SCEVExpander.  What I mean is that if I construct SCEV out of incoming optimized IR and then use SCEVExpander to generate code, can I recover the optimized IR back using some combination of peephole optimizations (like instcombine) on the generated code? Or are we going to lose performance due to missing no-wrap flags?

I'm fairly certain that there are cases where round-tripping through
SCEV -> SCEVExpander will lose information for good in clang-generated
IR, since most of the NSW flags in clang-generated IR come language
axioms (signed integer overflow is UB) that can't be re-proved by any
analysis once they've been erased.

-- Sanjoy

>
> Unfortunately, I do not have a test case to demonstrate that we may lose performance in some cases.
>
> Thanks,
> Pankaj
>
>
> -----Original Message-----
> From: Andrew Trick [mailto:atrick at apple.com]
> Sent: Tuesday, August 08, 2017 8:30 PM
> To: Sanjoy Das <sanjoy at playingwithpointers.com>
> Cc: Chawla, Pankaj <pankaj.chawla at intel.com>; llvm-dev at lists.llvm.org
> Subject: Re: Improving SCEV's behavior around IR level no-wrap
>
>
>> On Aug 8, 2017, at 5:34 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>
>> Hi Pankaj,
>>
>> IIRC there was pushback on this proposal so I did not proceed further.
>> Are you blocked on this?
>>
>> [+CC Andy, who I remember had some objections.]
>>
>> — Sanjoy
>
> Off the top of my head, my concern is that expression comparison is no longer constant time, which I think is fundamental to SCEV.
>
> I may be able to dig through my notes next week, after vacation...
>
> -Andy
>
>> On Tue, Aug 8, 2017 at 3:06 PM, Chawla, Pankaj <pankaj.chawla at intel.com> wrote:
>>> Hi Sanjoy,
>>>
>>> Any update on this?
>>> Are there plans to implement this proposal?
>>>
>>> Thanks,
>>> Pankaj
>>>
>>>
>>> -----Original Message-----
>>> Date: Fri, 23 Sep 2016 02:09:19 -0700
>>> From: Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org>
>>> To: llvm-dev <llvm-dev at lists.llvm.org>, Andrew Trick
>>>        <atrick at apple.com>,  Dan Gohman <dan433584 at gmail.com>, Hal Finkel
>>>        <hfinkel at anl.gov>,  Chandler Carruth <chandlerc at gmail.com>, David
>>>        Majnemer <david.majnemer at gmail.com>,  Sebastian Pop
>>> <sebpop at gmail.com>
>>> Subject: [llvm-dev] Improving SCEV's behavior around IR level no-wrap
>>>        flags
>>> Message-ID:
>>>
>>> <CAMiUf7fs5xDnfaChLEcft+auNoVW_LksqLA48KJnH3rNgcMftQ at mail.gmail.com>
>>> Content-Type: text/plain; charset=UTF-8
>>>
>>> Hi all,
>>>
>>> This is about a project I've been prototyping on-and-off for a while that has finally reached a point where I can claim it to be "potentially viable".  I'd like to gather some input from the community before moving too far ahead.
>>>
>>>
>>> # The problem
>>>
>>> There is a representation issue within SCEV that prevents it from
>>> fully using information from nsw/nuw flags present in the IR.  This
>>> isn't just a theoretical issue, e.g. today LLVM won't unroll this
>>> loop:
>>>
>>> void f(int x, long* arr) {
>>>  for (int i = x + 1; i < x + 3; i++)
>>>    arr[i] = 40;
>>> }
>>>
>>> since SCEV is unable to exploit the no-overflow on x+1 and x+3 to prove that the loop only runs twice.
>>>
>>> The fundamental problem here is that SCEV expressions are unique'd but the nsw/nuw flags on SCEV expressions are not part of the key they're unique'd by.  Instead, nsw/nuw flags on SCEV expressions are expressed by mutating the SCEV expressions in place.
>>>
>>> This means "add %x, 1" and "add nsw %x, 1" both map to the _same_ SCEV expression (that is, literally the same SCEV* object), and we can't mutate the common SCEV expression to flag it as nsw since that will incorrectly denote "add %x, 1" as nsw too.
>>>
>>> In general, this means SCEV has to be very conservative about marking SCEV expressions as no-wrap.  In some cases (e.g. the loop above), this ends up being excessively conservative.
>>>
>>> One path forward is to have SCEV try to prove that if a certain
>>> operation produces poison, the program definitely has undefined
>>> behavior.  This can let us mutate the corresponding SCEV objects to
>>> pull the "nsw"-ness into SCEV.  For instance, if we have
>>>
>>>  %x = load ...
>>>  %t = add i32 nsw %x, 1
>>>  %addr = gep(%array, %t)
>>>  store i32 0, %addr
>>>  %t2 = add i32 %x, 1
>>>
>>> then transferring NSW to getSCEV(%t) is okay, since even though %t2 (which will be mapped to the same SCEV expression as %t) does not have "nsw" on the instruction, we know adding 1 to %x cannot overflow since the program would have UB otherwise.
>>>
>>> Bjarke Roune has implemented some of this. However, this is difficult
>>> to do for cases like the x+1 .. x+3 loop above without running a
>>> control flow analysis over the entire function.  And this approach
>>> does not work in the presence of function calls or general control
>>> flow, like
>>>
>>>  %x = load ...
>>>  %t = add i32 nsw %x, 1
>>>  call void @f()
>>>  %addr = gep(%array, %t)
>>>  store i32 0, %addr
>>>
>>> or
>>>
>>>  %x = load ...
>>>  %t = add i32 nsw %x, 1
>>>  if (<condition>)
>>>    return;
>>>  %addr = gep(%array, %t)
>>>  store i32 0, %addr
>>>
>>> since unless the side-effecting use of %t (the store) "strongly"[1] post dominates the def of %x, there is no guaranteed undefined behavior on a poisonous %t.  Things are even more complex if %x is not a load, but an expression SCEV an look through, like an add or a shift by a constant.
>>>
>>> *I think the current representation of nsw/nuw in SCEV expressions is
>>> not congruent with LLVM's specification of poison values, and that is
>>> blocking us from exploiting poison values as intended by LLVM's
>>> design.*
>>>
>>>
>>>
>>> # The proposed solution
>>>
>>> Since poison values are, well, _values_, I propose we model them as data within SCEV.  We treat nsw/nuw flags as "operands" since they contribute to the result of an SCEV expression just like normal inputs to the expression.
>>>
>>> This means we'd treat "add %x, %y" as a different SCEV expression than "add nsw %x, %y", since the latter sometimes produces poison while the former doesn't.  The latter would be known to not overflow, and SCEV would use that fact in the usual ways.
>>>
>>> With this change SCEV expressions will be pointer equal less often, and while relying on pointer equality for value equality will be correct, it will be somewhat pessimistic; and we'll have to implement and use some form of structural equality.
>>>
>>> In other words, some places that do
>>>
>>>  SCEV *X = ...
>>>  SCEV *Y = ...
>>>  if (X == Y)
>>>    ...
>>>
>>> will have to be changed to do
>>>
>>>  SCEV *X = ...
>>>  SCEV *Y = ...
>>>  if (X->equals(Y))
>>>    ...
>>>
>>> This has potential for compile-time regressions.  Hopefully they'll all be addressable.
>>>
>>> There are cases in which SCEV (via trip count analysis, say) can _prove_ that a certain expression does not overflow.  In those cases we will mutate the SCEV expression to indicate no-wrap; since the no-wrap flag is just a "cache" of a proof based on the structure of the SCEV expression, and _does_ apply to all SCEV expressions with the same shapes.
>>>
>>> Concretely, we'll endow relevant SCEV expression types with two sets distinct of flags:
>>>
>>> - AxiomaticFlags: These flags follow from nsw/nuw annotations in the
>>>   IR.  These will be part of the key the SCEV expression is unique'd
>>>   on.
>>> - ComputedFlags: These flags are derived from the structure of the
>>>   SCEV expression, and they're *not* a part of the key the SCEV
>>>   expression is unique'd on.
>>>
>>> For the purposes of consumption, there will be no difference between AxiomaticFlags and ComputedFlags.  Consumers will get a union of the two when they ask for the set of flags that apply to a specific SCEV expression.
>>>
>>> ComputedFlags will, in general, depend on AxiomaticFlags.  For instance if AxiomaticFlags is "nsw" for, say, {1,+,1}, we can add "nuw" to its ComputedFlags.  There is no need to further distinguish "{1,+,1}-axiomatic<nsw>" on the computed<nuw> dimension since "{1,+,1}-axiomatic<nsw>" will always be computed<nuw>.
>>>
>>>
>>>
>>>
>>> What do you think?  Does the overall picture here make sense?
>>>
>>> Alternate solutions are also more than welcome (especially if they're easier to implement!).
>>>
>>> Thanks,
>>> -- Sanjoy
>>>
>>> [1]: That is, it the store is guaranteed to execute once the load has
>>> been issued.
>>>
>>>
>>> ------------------------------
>>>
>>>
>