[llvm-dev] Improving SCEV's behavior around IR level no-wrap flags

Sat Sep 24 08:10:31 PDT 2016

----- Original Message -----
> From: "Sanjoy Das" <sanjoy at playingwithpointers.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Andrew Trick" <atrick at apple.com>, "Dan Gohman" <dan433584 at gmail.com>,
> "Chandler Carruth" <chandlerc at gmail.com>, "David Majnemer" <david.majnemer at gmail.com>, "Sebastian Pop"
> <sebpop at gmail.com>
> Sent: Saturday, September 24, 2016 12:27:04 AM
> Subject: Re: Improving SCEV's behavior around IR level no-wrap flags
> 
> Hi Hal,
> 
> Hal Finkel wrote:
>  > My understanding is that there were two problems; this is one of
>  > them. The second problem is that the SCEV expressions were meant
>  > to be
>  > context insensitive so that the expressions could be compared, and
>  > expanded, without worrying about potential control dependencies on
>  > the
>  > wrapping flags.
> 
> These things are somewhat confusing to precisely reason about because
> poison only has informal semantics today, but I think these problems
> are related and the new proposal addresses the issue you raised.

Thanks! I think this makes sense. You do need to make sure that you only use the result of an expression (where use means in some instruction that can actually cause UB) under the same set of conditions that predicate all control dependencies of the original IR-level expressions. This seems likely to be true for semantics-preserving transformations (e.g. we might compute some tiling factors which contain poison values, but it does not matter because, in all cases where poison values might be generated, one of the loop trip counts was zero and so the tiling factors are irrelevant), but we should make sure to capture this requirement explicitly.

 -Hal

> 
>  > An attempt at an example...
>  >
>  > for (int i = 0; i<  n; ++i)
>  >   for (int j = 0; j<  m + q; ++j) {
>  >     array[j][i] = i + j;
>  >   }
> 
> Explicitly, your example is
> 
> for (int i = 0; i < n; ++i) {
>    int tmp = m +nsw q;
>    for (int j = 0; j < tmp; ++j) {
>      array[j][i] = i + j;
>    }
> }
> 
> (Just to make things 100% clear, in the new scheme getSCEV(tmp) will
> be nsw, but getAddExpr(getSCEV(m), getSCEV(q)) won't be.)
> 
> Firstly, I don't see any problems with materializing "add nsw m, q"
> outside the loop per-se, that's not any different from LICM'ing out
> the nsw addition from the loop.  However, we're in trouble if the
> materialized expression (outside the loop, via SCEVExpnader)
> generates
> poison in a way that causes UB when there wouldn't have been any
> before.
> 
> Assuming SCEVExpander is used in "reasonable" ways, I don't think we
> will introduce undefined behavior.  For instance, rewriting loop exit
> values:
> 
>    int last = 0;
>    for (int i = 0; i < n; ++i) {
>      int tmp = m +nsw q;
>      for (int j = 0; j < tmp; ++j) {
>        last = j;
>      }
>    }
>    print(last);
> 
> ==>
> 
>    int last = 0 < n ? (m +nsw q) : 0;
>    print(last);
> 
> is fine with a sign-overflowing (m +nsw q), since we're using poison
> in a side effect only if 0 < n, in which case the original program
> would have had undefined behavior also.
> 
> The same is true for loop-interchange type optimizations:
> 
>    for (int i = 0; i < n; ++i) {
>      int tmp = m +nsw q;
>      for (int j = 0; j < tmp; ++j) {
>        array[j][i] = i + j;
>      }
>    }
> 
> ==>
> 
>    int tmp = m +nsw q;
>    for (int j = 0; j < tmp; ++j) {
>      for (int i = 0; i < n; ++i) {
>        array[j][i] = i + j;
>      }
>    }
> 
> seems fine on a sign-overflowing m +nsw q since in both the original
> and final program, there is a side effect dependent on poison iff
> "0 < n".
> 
>  > The fact that (m+q) does not wrap may have a control dependency on
>  > n> 0. If we want, for example, to perform loop interchange, we
>  > need to
>  > drop the nsw on the (m+q) when we materialize the trip-count
>  > expression outside of the input's outer loop. Maybe we can just
>  > handle
>  > this issue in the expander, but I suspect we need to be careful in
>  > other contexts as well, for example, if we have:
>  >
>  > for (int i = 0; i<  n + q; ++i)
>  >   for (int j = 0; j<  m + q; ++j) {
>  >     array[j][i] = i + j;
>  >   }
>  >
>  > and I want to construct a comparison between (n+q)< (m+q); I need
>  > to
>  > be careful about just using the nsw on (n+q) and (m+q) to simplify
>  > this to be n< m because the nsw on (m+q) is really only dependable
>  > if
>  > n + q> 0.
> 
> Written explicitly, your example looks like (with a minor addition):
> 
>    int tmp0 = n +nsw q;
>    for (int i = 0; i < tmp0; ++i) {
>      int tmp1 = m +nsw q;
>      for (int j = 0; j < tmp1; ++j) {
>        array[j][i] = i + j;
>      }
>      array2[int(tmp0 < tmp1)] = 40;
>    }
> 
> Now simplifying this to
> 
>    int tmp0 = n +nsw q;
>    for (int i = 0; i < tmp0; ++i) {
>      int tmp1 = m +nsw q;
>      for (int j = 0; j < tmp1; ++j) {
>        array[j][i] = i + j;
>      }
>      array2[int(n < m)] = 40;
>    }
> 
> is fine, since replacing tmp0 < tmp1 with n < m changes observable
> behavior only if 0 < tmp0 and either tmp0 or tmp1 sign-overflowed;
> and
> in those cases the original program was undefined.
> 
> However, if you instead do:
> 
>    isKnownPredicate(ICMP_SLT,
>                     getAdd(getSCEV(n), getSCEV(q)),
>                     getAdd(getSCEV(m), getSCEV(q)))
> 
> then you'll get what you said -- (n + q) s< (m + q), not (n +nsw q)
> s<
> (m +nsw q).
> 
> -- Sanjoy
> 

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory