[llvm-dev] Improving SCEV's behavior around IR level no-wrap flags
Sanjoy Das via llvm-dev
llvm-dev at lists.llvm.org
Fri Sep 23 22:27:04 PDT 2016
Hi Hal,
Hal Finkel wrote:
> My understanding is that there were two problems; this is one of
> them. The second problem is that the SCEV expressions were meant to be
> context insensitive so that the expressions could be compared, and
> expanded, without worrying about potential control dependencies on the
> wrapping flags.
These things are somewhat confusing to precisely reason about because
poison only has informal semantics today, but I think these problems
are related and the new proposal addresses the issue you raised.
> An attempt at an example...
>
> for (int i = 0; i< n; ++i)
> for (int j = 0; j< m + q; ++j) {
> array[j][i] = i + j;
> }
Explicitly, your example is
for (int i = 0; i < n; ++i) {
int tmp = m +nsw q;
for (int j = 0; j < tmp; ++j) {
array[j][i] = i + j;
}
}
(Just to make things 100% clear, in the new scheme getSCEV(tmp) will
be nsw, but getAddExpr(getSCEV(m), getSCEV(q)) won't be.)
Firstly, I don't see any problems with materializing "add nsw m, q"
outside the loop per-se, that's not any different from LICM'ing out
the nsw addition from the loop. However, we're in trouble if the
materialized expression (outside the loop, via SCEVExpnader) generates
poison in a way that causes UB when there wouldn't have been any
before.
Assuming SCEVExpander is used in "reasonable" ways, I don't think we
will introduce undefined behavior. For instance, rewriting loop exit
values:
int last = 0;
for (int i = 0; i < n; ++i) {
int tmp = m +nsw q;
for (int j = 0; j < tmp; ++j) {
last = j;
}
}
print(last);
==>
int last = 0 < n ? (m +nsw q) : 0;
print(last);
is fine with a sign-overflowing (m +nsw q), since we're using poison
in a side effect only if 0 < n, in which case the original program
would have had undefined behavior also.
The same is true for loop-interchange type optimizations:
for (int i = 0; i < n; ++i) {
int tmp = m +nsw q;
for (int j = 0; j < tmp; ++j) {
array[j][i] = i + j;
}
}
==>
int tmp = m +nsw q;
for (int j = 0; j < tmp; ++j) {
for (int i = 0; i < n; ++i) {
array[j][i] = i + j;
}
}
seems fine on a sign-overflowing m +nsw q since in both the original
and final program, there is a side effect dependent on poison iff
"0 < n".
> The fact that (m+q) does not wrap may have a control dependency on
> n> 0. If we want, for example, to perform loop interchange, we need to
> drop the nsw on the (m+q) when we materialize the trip-count
> expression outside of the input's outer loop. Maybe we can just handle
> this issue in the expander, but I suspect we need to be careful in
> other contexts as well, for example, if we have:
>
> for (int i = 0; i< n + q; ++i)
> for (int j = 0; j< m + q; ++j) {
> array[j][i] = i + j;
> }
>
> and I want to construct a comparison between (n+q)< (m+q); I need to
> be careful about just using the nsw on (n+q) and (m+q) to simplify
> this to be n< m because the nsw on (m+q) is really only dependable if
> n + q> 0.
Written explicitly, your example looks like (with a minor addition):
int tmp0 = n +nsw q;
for (int i = 0; i < tmp0; ++i) {
int tmp1 = m +nsw q;
for (int j = 0; j < tmp1; ++j) {
array[j][i] = i + j;
}
array2[int(tmp0 < tmp1)] = 40;
}
Now simplifying this to
int tmp0 = n +nsw q;
for (int i = 0; i < tmp0; ++i) {
int tmp1 = m +nsw q;
for (int j = 0; j < tmp1; ++j) {
array[j][i] = i + j;
}
array2[int(n < m)] = 40;
}
is fine, since replacing tmp0 < tmp1 with n < m changes observable
behavior only if 0 < tmp0 and either tmp0 or tmp1 sign-overflowed; and
in those cases the original program was undefined.
However, if you instead do:
isKnownPredicate(ICMP_SLT,
getAdd(getSCEV(n), getSCEV(q)),
getAdd(getSCEV(m), getSCEV(q)))
then you'll get what you said -- (n + q) s< (m + q), not (n +nsw q) s<
(m +nsw q).
-- Sanjoy
More information about the llvm-dev
mailing list