[llvm-dev] Inferring nsw/nuw flags for increment/decrement based on relational comparisons

Tue Sep 20 12:05:17 PDT 2016

Hi everyone,

I posted some questions related to implementing inference of nsw/nuw
flags based on known icmp results to Bug 30428 (
https://llvm.org/bugs/show_bug.cgi?id=30428 ) and it was recommended
that I engage a wider audience by coming here. The minimal context is
the following, please see the bug report for more detail:

 > 1. If (X s< Y), then both X + 1 and Y - 1 are nsw.
 > 2. If (X u< Y), then both X + 1 and Y - 1 are nuw.

This is basically what I might be interested in implementing, or at
least mapping out an implementation approach for, for the sake of
eliminating some bounds checks in Rust that the Go compiler is
apparently able to eliminate:
https://github.com/rust-lang/rust/issues/35981

I'll quote my questions about the possible implementation below and try
to add some more context.

 > DominatorTree and AssumptionCache do seem to be sufficient and already
 > present. computeOverflowForSignedAdd and computeOverflowForUnsignedAdd
 > look like the kind of places to extend with this analysis based on
 > that info.

ValueTracking has access to DominatorTree and AssumptionCache
information, and provides the named two functions. InstCombine calls
them and sets nsw/nuw flags based on the result.

 > Two questions about that approach though:
 >
 > 1. Going back to the predsimplify problems: performance. Walking up
 > the CFG (even only up to the MaxDepth = 6 defined in
 > ValueTracking.cpp) and doing dominance queries on certain icmps seems
 > like a fairly expensive operation to perform on every add/sub
 > instruction. At the minimum I'd imagine the icmps that we know to be
 > true should be cached for the basic block. ...and this is more or less
 > equivalent to the suggestion about llvm.assume generation that was
 > shot down in the discussion you linked?

The link in question is the following:
http://lists.llvm.org/pipermail/llvm-dev/2015-January/080666.html

Perhaps these operations are cheaper than I think and such caching is
not needed? Alternatively they could be put behind -O3 i.e. the
ExpensiveCombines variable Sanjay pointed out.

 > 2. InstCombiner::visitAdd only calls into ValueTracking for the
 > unsigned case, i.e. computeOverflowForUnsignedAdd. There are no
 > computeOverflowFor*Sub functions that InstCombiner::visitSub even
 > could make use of. Instead, InstCombiner has its own
 > WillNotOverflow{S,UnS}igned{Add,Sub} functions. The relationship
 > between these and the computeOverflow* functions is unclear to me.
 > They look like they overlap to an extent.

Is some refactoring warranted here or does this mean that the
InstCombiner functions and not the ValueTracking ones are the ones to
edit? It seems a bit strange that InstCombine has extra logic here since
ValueTracking is used from other locations as well and seems like the
right place to provide the ultimate truth. InstCombiner::visitAdd has a
nearby TODO implying performance concerns, is that what it's about?

— Matti