[PATCH] Teach IndVarSimplify to add nuw and nsw to operations that provably don't overflow.

Mon Dec 29 17:06:04 PST 2014

A quick comment that might be relevant:  the IR generated by a Java
frontend will usually only (roughly speaking) have nsw and nuw's that
LLVM inserted after proving that an operation does not overflow.  This
is because integers in Java have wrapping overflow semantics, so
most additions (say) will not be nsw or nuw to begin with.

On the other hand, a C frontend can and should be emitting all (most?)
signed additions (and other potentially overflowing operators) as nsw
since signed overflow is UB in C.  There is no way LLVM can rediscover
this information once the nuw / nsw is removed (except in very
specific cases, like in Java).
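
To make the distinction concrete, here is a rough sketch in Python
(purely illustrative; the helper names are made up and this is not how
LLVM implements the analysis). It models a Java-style wrapping 32-bit
add, plus a hypothetical range check of the kind an optimizer would
need to pass before it could mark such an add nsw:

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrapping_add_i32(a: int, b: int) -> int:
    """Two's-complement 32-bit add, i.e. how Java's int '+' behaves."""
    return (a + b + 2**31) % 2**32 - 2**31


def can_infer_nsw(a_range, b_range) -> bool:
    """Hypothetical check: True if a + b stays inside the int32 range
    for every a in a_range and b in b_range (inclusive bounds)."""
    a_lo, a_hi = a_range
    b_lo, b_hi = b_range
    return INT32_MIN <= a_lo + b_lo and a_hi + b_hi <= INT32_MAX


# An induction variable known to be in [0, INT32_MAX - 1], incremented by 1:
# the add can never wrap, so the flag could be attached.
print(can_infer_nsw((0, INT32_MAX - 1), (1, 1)))

# With nothing known about the operand, the same add might wrap.
print(can_infer_nsw((INT32_MIN, INT32_MAX), (1, 1)))
```

In the Java case the range facts come from the IR itself (loop bounds,
guards), so the flag can be derived again. In the C case the fact came
from the language rules, and nothing left in the IR records it once the
flag is stripped.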

Anyway, now back to being "on vacation" and not thinking about compilers.

On Mon, Dec 29, 2014 at 3:50 PM, Philip Reames
<listmail at philipreames.com> wrote:
>
> On 12/29/2014 03:33 PM, Chandler Carruth wrote:
>
>
> On Mon, Dec 29, 2014 at 3:27 PM, Philip Reames <listmail at philipreames.com>
> wrote:
>>
>> Well in essence that is what we are currently doing, usually by stripping
>> the flags in the DAG.
>>
>> I think I wasn't quite clear enough.  We currently run InstCombine in
>> multiple places in the pass ordering.  What if we had early runs of
>> InstCombine preserve flags as the first priority?  The runs closer to CGP would
>> instead prioritize any legal optimization, regardless of whether it
>> preserved flags or not.  Another way to state this would be that we simply
>> pull the point we (potentially) drop flags a bit earlier in the pipeline -
>> in particular, before the last point we run instcombine.
>
>
> We have the DAG Combiner which is essentially the last combine run. It's not
> InstCombine, but in theory this is where we would make such transformations
> today.
>
> What I'm trying to say is that we already have this in principle, we've just
> pegged the "when" to be very late. We could move it forward if we had some
> significant evidence doing so helped lots of things without making anything
> worse, I just don't think anyone really has that kind of information.
>
> This of course doesn't help the fact that there are two competing canonical
> forms. We end up picking one in the core of the pipeline and we will always
> see tradeoffs from whichever form we pick (for example in knock-on
> simplifications).
>
> We're in rough agreement here.
>>
>>
>> This doesn't really get us a better practical result because the point of
>> making the transform in instcombine is to allow the result to in turn
>> re-combine with something else interestingly.
>>
>> The "solution" I really know for this kind of thing is to explode the
>> space by tracking both forms. We actually do some of this today. If you have
>> an outer combine that needs an inner combine that strips flags, you just
>> write the outer combine to handle the inner expression in its more complex
>> flag-bearing form. The problem is that this grows the number of patterns
>> necessary for the outer combine in a combinatorial way, so there are very
>> severe limits on how much of this we can do.
>>
>> So essentially this comes down to a case by case profitability analysis
>> and there's no good answer.  Understood.  :)
>>
>> One interesting observation is that, so far, the cases I've seen where
>> missing flags inhibit optimization, those flags can be re-derived from the
>> IR in its current form (with sufficient effort).
>
>
> I don't follow. The transforms which would require dropping flags would fire
> in many more places than just the one that gave rise to this thread, and
> would drop flags that we have no way of recomputing.
>
> I'm in complete agreement that the general issue is hard.  But in
> this and several other cases I've seen that actually mattered to me - mostly
> in relation to bounds check removal - we weren't actually seeing the general
> case.  I'm not saying my sample is in any way meaningful - it's not - but I
> found the observation interesting.
>
> Note: I'm not stating we dropped flags in these cases; I don't know that.
> I'm only stating that there were flags we could infer which gave interesting
> results after instcombine had already run *w/o the flags being present*.
>
>
> I don't think we're preserving flags in any cases where there is another
> canonical form that is mathematically simpler and inherently preserves the
> semantics. The problem is that the "simpler" form (thinking in terms of
> arithmetically simpler) specifically makes the flags impossible to recover.
>
> (I'm into the territory of debating something I know little about which is
> always dangerous...)
>
> But do we actually lose interesting optimizations by dropping the flags?
> That's the interesting question which I have no real data on.  You may.  It
> sounds like you believe the answer is yes.  If you happen to have a compelling
> example, let me know.  I'd be curious to see it, but please don't spend much
> time.  We're well off into a non-actionable tangent at this point.
>
> Philip
>
>