[PATCH] D23075: [IndVarSimplify] Extend trip count instead of truncating IV in LFTR, when original IV does not overflow

Wed Aug 3 16:01:02 PDT 2016

sanjoy added inline comments.

================
Comment at: lib/Transforms/Scalar/IndVarSimplify.cpp:104
@@ +103,3 @@
+
+typedef ValueMap<Value *, std::unique_ptr<OriginalIVInfo>> WidenedIVMapType;
+
----------------
amehsan wrote:
> sanjoy wrote:
> > Why not have the LHS be an `AssertingVH`?
> > 
> > Actually, will keeping a raw pointer (or an `AssertingVH`) work if an IV is widened twice?
> My understanding is that `ValueMap` handles Value pointers properly if the value is deleted or RAUW'ed. I am not entirely sure that the default behavior when an IV is widened twice is what we actually want, but playing with a number of simple loops, I cannot generate an example where IV is widened twice. 
> 
> I think if we come across such a loop and we care about it, what we need to do is to change the default behavior of `ValueMap` for RAUW (and/or delete).  The main thing to worry about will be what is the `OriginalType` that we store for it. So if we actually have a case that this has to be handled, then we do not need to change this data structure. We may need to customize the `ValueMapConfig`.
Sorry, this is my fault.  I mistook the `ValueMap` for a `DenseMap`.  I think what you have here is fine for now, modulo my comments on not keeping any state at all.

================
Comment at: lib/Transforms/Scalar/IndVarSimplify.cpp:2023
@@ +2022,3 @@
+        }
+        if (Iter->second->HasNUW && !Iter->second->IsSigned) {
+          extended = true;
----------------
amehsan wrote:
> amehsan wrote:
> > sanjoy wrote:
> > > I think there is a way to make this stateless (though I don't know if SCEV is smart enough to make that work viably for all the cases you care about): you were going to emit a backedge condition of `trunc(IV) == ExitCnt`.  (Please double check this logic) since `sext` and `zext` are one-to-one, this is equivalent to both `zext(trunc(IV)) == zext(ExitCnt)` and `sext(trunc(IV)) == sext(ExitCnt)`.
> > > 
> > > `zext(trunc(IV)) == zext(ExitCnt)` is profitable if `zext(trunc(IV))` == `IV` (in which case the RHS is `zext(ExitCnt)`) and `sext(trunc(IV)) == sext(ExitCnt)` is profitable if `sext(trunc(IV))` == `IV` (in which case the RHS is `sext(ExitCnt)`).  You can check these equivalences (i.e. `exit(trunc(T))` == `T`) using SCEV.
> > > 
> > > I think this will be a better solution overall, if it works. :)
> > I agree this is better. Will try it to see if it works.
> IIUC we cannot make it entirely stateless. If I have an unsigned  i32 IV, sext(trunc(IV)) != IV. (think of when the IV goes above max signed i32).   So at least I need to record the value of IsSigned and then we need the ValueMap. It would have been nice if we could just ask SCEV to do the analysis for us here to decide whether we should trunc iv or extend trip count.
> 
> Another question that I have about the suggested approach is compile time. Is it possible that we have cases for which SCEV analsysis is expensive and so asking for another SCEV analysis is something that we prefer to avoid?
> IIUC we cannot make it entirely stateless. If I have an unsigned i32 IV, sext(trunc(IV)) != IV. (think of when the IV goes above max signed i32). So at least I need to record the value of IsSigned and then we need the ValueMap. It would have been nice if we could just ask SCEV to do the analysis for us here to decide whether we should trunc iv or extend trip count.

I didn't mean to say that we can unconditionally assume `sext(trunc(IV)) == IV` -- we'll have to ask SCEV to re-prove that, like:

```
A = getSCEV(IV);
B = getSext(getTrunc(A))
if (A == B) then signed;
C = getZext(getTrunc(A))
if (A == C) then unsigned;
```

The nice thing here is that the optimization becomes orthogonal to the "history" of what indvars did.  Otherwise we end up in odd situations where indvars can simplify a certain pattern if *it* generated it, but can't simplify it if the exact same IR pattern came directly from the user.  Does this make sense?

> Another question that I have about the suggested approach is compile time. Is it possible that we have cases for which SCEV analsysis is expensive and so asking for another SCEV analysis is something that we prefer to avoid?

Yes, there is a potential compile-time issue here.  I don't think it should be material though (and if it is, my first action would be try to make SCEV faster, not avoid doing the more right thing in indvars).

https://reviews.llvm.org/D23075