[llvm-dev] ScalarEvolution invariants around wrapping flags

Tue Sep 17 06:10:03 PDT 2019

Hi,

I'm working on a bug somewhere between SCEV and IndVarSimplify, which
tacks an unwanted "nuw" onto an "add i32 %whatever, -1" (which
actually almost certainly will overflow), leading ultimately to an
infinite loop. A fuller description and test-case is at the end for
anyone interested.

The issue seems to be with ScalarEvolution's attempts to cache SCEV
objects, which don't include wrapping flags in the immutable state.
That means that once ScalarEvolution has created an expression as
"nuw" it will be that way forever, even if analysed from an entirely
separate part of the function where that's not necessarily the case.

Worse, callers can create an expression with specified flags through
the public interface, which would pollute all analysis after that
point with those unwanted flags. I don't know if any callers actually
do this but I could see it being useful for looking at hypothetical
cases.

To some degree mutation of wrapping flags is part of the design,
because SCEVAddRecExprs are explicitly const_casted to add flags in
multiple places so that they can be found again later. But that might
not be quite so harmful because they at least contain a Loop as a
contextual cue that prevents some leaking.

So, my real question is does anyone know what the contract with
ScalarEvolution is? I see a few possibilities:

1. Callers are expected to not engage in speculation. ScalarEvolution
itself must only create expressions it knows hold in all cases. This
sounds too restrictive to me.
2. Speculation not allowed outside ScalarEvolution, but
ScalarEvolution can speculate about a specific Loop. I think this
entails making non-AddRec expressions immutable (with Flags included
as part of the FoldingSetID) and ensuring that any modification of an
AddRec is provable within its Loop.
3. Speculation is allowed, and achieved by making all expressions
immutable. Sites currently const_casting AddRecExprs instead get a new
one with the flags they want. Sites trying to help out other codepaths
by cacheing this info are out of luck.
4. Speculation scenarios are allowed, and achieved by adding something
like a "HypothesisToken" to SCEV objects, to keep them separate from
each other and guaranteed properties. As in 2, maybe a Loop is enough
if public users can't speculate.

Any ideas gratefully received.

Cheers.

Tim.

Description of bug: the loop gets duplicated, with one copy guarded by
both "x == 0 && x != 0". This gets an "nuw" added to %dec (probably
legitimately since it's never executed). That "nuw" gets picked up in
the analysis of the other copy of the loop, which also gets "nuw". The
loop quickly gets turned into an infinite one because it's now riddled
with UB.

See test.c, compiled for a 32-bit platform (I've checked i686 and ARM,
mac & Linux) at -O3. Strangely I can't reproduce it in opt yet. I
suspect multiple runs of the same pass are needed.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.c
Type: application/octet-stream
Size: 2224 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190917/b03b7e14/attachment.obj>