[llvm] r273079 - [SCEV] Fix incorrect trip count computation

Sun Jun 19 15:21:01 PDT 2016

Hi Eli,

Eli Friedman wrote:
 > It seems like it should be possible to optimize nsw-tripcount.ll *somehow*... you're right that the NoWrap on the
 > induction variable is irrelevant, but it should be possible to use the NSW property of the RHS to come up with the right
 > conclusion.

You're right -- if we could model the nsw on the RHS inside SCEV then
we could compute a constant trip count for the loop (though, in that
case, we should not need any special casing at all, since
"smax(X,X nsw+ <Constant>)" should fold to "X" or "X nsw+ <Constant>"
depending on the sign of <Constant> anyway).
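
To make the fold concrete, here is a small standalone sketch (not SCEV
code).  All helper names are hypothetical and exist only for this
illustration; nsw is modeled by asserting that the 32-bit add does not
overflow, which is an assumption standing in for poison semantics.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: models "X nsw+ C" by computing in 64 bits and
// asserting that the result fits in i32 (overflow would be poison).
int32_t addNSW(int32_t X, int32_t C) {
  int64_t Wide = int64_t(X) + int64_t(C);
  assert(Wide >= INT32_MIN && Wide <= INT32_MAX && "nsw add overflowed");
  return int32_t(Wide);
}

// Hypothetical helper: models a plain "X + C" with two's complement
// wraparound (the narrowing cast is modular on mainstream compilers).
int32_t addWrapping(int32_t X, int32_t C) {
  return int32_t(uint32_t(X) + uint32_t(C));
}

// smax(X, X nsw+ C) computed the long way.
int32_t smaxUnfolded(int32_t X, int32_t C) {
  return std::max(X, addNSW(X, C));
}

// The folded form: only the sign of C matters once overflow is ruled out.
int32_t smaxFolded(int32_t X, int32_t C) {
  return C >= 0 ? addNSW(X, C) : X;
}

// Without nsw the fold is not valid: the add may wrap below X.
int32_t smaxWrapping(int32_t X, int32_t C) {
  return std::max(X, addWrapping(X, C));
}
```

With X = INT32_MAX and C = 1, smaxWrapping returns X rather than
X + C, which is why the fold needs the nsw on the add.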

Unfortunately, modeling the nsw on the RHS is not a simple bugfix in
SCEV, but will require some major infrastructure changes.  SCEV keys
expressions on their arithmetic operands, not on their no-wrap flags.
No-wrap flags are added to SCEV expressions by mutating them.  This
means both "A + B" and "A nsw+ B" will map to the same SCEV*, so
transferring the nsw from the latter llvm::Instruction to the
corresponding llvm::SCEV* is not safe, because then we'll think
getSCEV(A + B) is NSW when it isn't.  The test case I removed above is
a place where this is a drawback (we can't exploit nsw as hard as we
should be able to), but "in some cases" (I have not yet quantified
this) it helps by making pointer equality more effective (i.e. we're
more easily able to fold "(A + B) + (A nsw+ B)" into "2 * (A + B)").
If we map (A+B) and (A nsw+ B) to different SCEV* then we can still
fold "(A + B) + (A nsw+ B)" into "2 * (A + B)", but it will be more
compile-time intensive, since we'll turn a pointer equality check into
some kind of structural equality check.
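
The hazard described above can be sketched with a toy uniquing table.
This is only a model under stated assumptions: ToyAddExpr and
ToyUniquer are made-up names, operands are plain strings, and none of
this is the actual ScalarEvolution implementation.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for an add expression. The flag lives on the node and
// is NOT part of the uniquing key.
struct ToyAddExpr {
  std::string LHS, RHS;
  bool NSW;
  ToyAddExpr(std::string L, std::string R)
      : LHS(std::move(L)), RHS(std::move(R)), NSW(false) {}
};

// Toy uniquer keyed on operands only, so "A + B" and "A nsw+ B"
// resolve to the same node.
class ToyUniquer {
  std::map<std::pair<std::string, std::string>,
           std::unique_ptr<ToyAddExpr>> Table;

public:
  ToyAddExpr *getAdd(const std::string &LHS, const std::string &RHS) {
    auto &Slot = Table[std::make_pair(LHS, RHS)];
    if (!Slot)
      Slot = std::make_unique<ToyAddExpr>(LHS, RHS);
    return Slot.get();
  }
};

// Returns true if a query for the plain "A + B" observes NSW after the
// flag was copied over from an "A nsw+ B" instruction.
bool plainAddLooksNSWAfterTransfer() {
  ToyUniquer U;
  ToyAddExpr *FromNSWInst = U.getAdd("A", "B");
  FromNSWInst->NSW = true; // flags are added by mutating the node
  ToyAddExpr *FromPlainInst = U.getAdd("A", "B");
  assert(FromPlainInst == FromNSWInst && "same key, same node");
  return FromPlainInst->NSW;
}
```

The upside in this model is that equality of two adds is a single
pointer comparison; putting the flag into the key would avoid the
false NSW but would give "A + B" and "A nsw+ B" distinct nodes.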

In some cases (this is work done by Bjarke Roune), we can show that if
an operation overflows, the poison value it produces will definitely
lead to undefined behavior, so we can transfer the nsw to SCEV*
expressions that are guaranteed to be control equivalent to the nsw
arithmetic.  But that logic does not apply in the deleted test case,
since neither of the two preconditions holds there.

All said and done, this is not the only place I've had to pessimize
SCEV because our infrastructure does not allow us to aggressively
exploit nsw/nuw flags (e.g. see r271151), so I'm seriously thinking
about making some deeper infrastructural changes here.

And SCEV is still buggy around cases like:

```
define void @f() {
loop.entry:
   br label %loop

loop:
   %iv.nowrap = phi i32 [ 0, %loop.entry ],  [ %iv.nowrap.inc, %loop ]
   %iv.maywrap = phi i32 [ 0, %loop.entry ], [ %iv.maywrap.inc, %loop ]
   %iv.nowrap.inc  = add nsw i32 %iv.nowrap, 1
   %iv.maywrap.inc = add     i32 %iv.maywrap, 1
   br label %loop
}
```

where both `%iv.nowrap` and `%iv.maywrap` will be mapped to a
`<nuw><nsw>` SCEV.  This is problematic in theory (since `%iv.nowrap`
could be unused), but given that it is difficult to arrive at such a
situation from C/C++ code, I cannot justify doing the right thing here
and taking the (substantial) performance hit.

(This goes without saying, but if you have ideas here I'd love to hear
them!)

Thanks!
-- Sanjoy