[llvm-dev] The nsw story revisited

Wed Jun 28 18:54:21 PDT 2017

Chandler,
              .
I mis-phrased the “nsw” as “llvm.assume”, I really mean it should be thought of
as a combination, not that we should actually represent it that way in the IR,
I think putting it in as an explicit assume would be terribly cumbersome,
just look at it, it is awful to try to read !,  and if I thought it should be literally inserted
I would not have use the non-C notation   (INT_MIN <=  X <= INT_MAX)    !!!
The current implementation as an attribute is best, note that later on I say it is
illegal to hoist the “attribute”, not the “assume”.  As far as leaving the nsw/assume
where the operation was hoisted from, that’s a choice.  As has been noted before
you can delete all the attributes from a program and it still compiles correctly,
The nsw/assume is just extra information that isn’t necessary for correct compilation.

This should be done now because there is absolutely no disagreement now
that this is the correct way to proceed.

As far as “modeling uninitialized memory”, why do you think that is important,
I don’t see this as ever having a benefit to optimization. If folks really think this
is important then my suggestion is that they create new analysis and transform
passes in which they can do what ever they want with respect to uninitialized
memory and other “undefined behaviors”, and “poison” should be implemented
as an analysis attribute, not as an IR instruction.  If and when this effort ever
shows any performance benefit we can discuss what to do about it, but my
bet is that this won’t ever be the case. This might be an interesting academic
research project, but that isn’t any reason to put “poison" into the LangRef.

My point is that “poison” serves no purpose, and must be deleted.  As for
“undef” it is not there to take “poison”s place, it is there to satisfy requirements
of SSA representation of scalar values, and it is illegal to copy-propagate it.
Thats all. It is not there to represent uninitialized memory, since that isn’t an SSA
problem, in fact I don’t believe it is a problem at all.  Uninitialized memory is
an interesting problem for static analyzers, but that isn’t the purpose of LLVM
as I understand it, and if it ever does become so then use the technique above,
use a “poison” attribute, not a “poison” instruction.

Think about it this way, if you were writing such a static analyses would you
replace a reference to uninitialized memory with a “poison” instruction,
no, because doing so would lose information that you want to relay back
to the user. Same thing in LLVM. A “poison” instruction is useless.

Thanks,
Peter Lawrence.

> On Jun 28, 2017, at 6:07 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> 
> On Wed, Jun 28, 2017 at 12:42 PM Peter Lawrence <peterl95124 at sbcglobal.net <mailto:peterl95124 at sbcglobal.net>> wrote:
> Hal,
> Chandler,
>                Can we get back on topic, IE if you’ve actually read the paper I’d like to 
> hear some feedback on its technical content
> 
> Sure, I was mostly mentioning prior discussion because I don't think this is going to add a lot to the discussion around `nsw` and the `llvm.assume` intrinsic back when Hal and I started discussing modeling assumptions as first class entities in the IR.
> 
>>>> 
>>>> I have been re-reading Dan Gohman's original post "the nsw story" [1]
>>>> and have come to the conclusion that Dan got it wrong in some respects.
>>>> 
>>>> He came up with "no signed wrap" ("nsw") which in his words means
>>>> "The purpose of the nsw flag is to indicate instructions which are
>>>> known to have no overflow".  The key operative word here is "known".
>>>> 
>>>> This means literally an operation with an additional "llvm.assume".
>>>> For example    (a +nsw b)    when a and b are i32   is really
>>>> 
>>>>   ((llvm.assume((INT_MIN <= ((i64)a+(i64)b) <= INT_MAX) , (a + b))
>>>> 
>>>> It is this "assume" that justifies the transformation that Dan does
>>>> to optimize sign-extension of i32 induction variables out of loops
>>>> on LP64 targets.  So far so good.
> 
> 
> I just want to point out that this isn't complete.
> 
> *Before* an `add nsw` instruction is hoisted, it could be thought of as a normal `add` plus `llvm.assume`. But the `llvm.assume` cannot be hoisted above predicates. So what this would end up doing is allowing the hoisting of the `add` by separating the invariant and keeping the invariant below the predicate. This is a fairly minor point, but I didn't want it to be unstated.
>>>> 
>>>> Where Dan goes wrong is in thinking that "+nsw" is an operation
>>>> rather than an operation with an assume, and therefore he feels
>>>> the need to define what the result of this operation is when it
>>>> overflows, which then seems to require a new "poison" instruction
>>>> because "undef" isn't good enough.
> 
> 
> A meta comment. I think it would be helpful for you to phrase your discussion in a less personalized way. Rather than talking about "So-and-so went wrong when they thought ...", I would instead merely discuss the situation as it is today. This helps keep the discussion from sliding into inappropriate areas, which has already been a concern, so it seems worth pointing out as a suggestion for future emails.
> 
> To the actual point, it seems backwards to say that we made a mistake in specifying `nsw` as an operation rather than an `assume`. At the time we specified `nsw` there was no such construct as `llvm.assume`. Perhaps introducing `llvm.assume` would have been better than introducing `nsw`, perhaps not. But it doesn't actually matter. This kind of historical critique doesn't seem like a useful way to frame the discussion.
> 
> The point is, we have `llvm.assume` now, and we can and should consider whether it is a better tool for the job. Hopefully I have understood the point you are trying to make?
> 
> My response is that I don't think this is clearly a superior tool for the job. We did actually discuss things like `nsw` when designing `llvm.assume`. One serious issue with `llvm.assume` is that it *cannot be speculated*. Even though we gain the ability to speculate the `add` by decoupling the invariant, we would retain a large amount of IR that could not be speculated and would cause us to retain control flow and be a significant barrier to optimization.
> 
> In fact, `llvm.assume` is *incredibly* expensive already. We talked at length when introducing it about how it should be relatively sparingly used and used with care to provide the most impactful hints to the optimizer rather than trying to encode everything that might possibly be assumed to be true.
> 
> On the flip side, `add nsw` is extremely common in the IR produced by many frontends. So it would seem like a poor fit to use a very expensive and heavyweight tool to represent something that is very common.
> 
> I'm happy to view `add nsw` as an `add` with an implicit `llvm.assume` inside of control flow where the result is "used" (for a complex definition of "used" that should be discussed on the `undef` thread and not here) if that is a useful notional or theoretical model. But the *representation optimization* of `add nsw` still seems quite useful.
> 
>>>> It therefore makes no sense to discuss the result of "+nsw" as
>>>> ever being either "undef" or "poison", and so the need for "poison"
>>>> is gone.
> 
> Note that there have been examples which show that the `undef` semantics alone as used to model reading from uninitialized memory are effectively equivalent to `poison`. So I don't believe that removing `nsw` would necessarily be sufficient here. I think it would also require other changes. And in your emails about `undef` you seem to agree that we would have to change `undef` itself.
> 
> I think it would be more productive to focus on that thread. If we decide to get rid of poison entirely (including changing `undef`), *then* we can have a discussion about how to model `nsw`.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170628/0b165728/attachment.html>