[PATCH] D23075: [IndVarSimplify] Extend trip count instead of truncating IV in LFTR, when original IV does not overflow

Ehsan Amiri via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 4 11:18:26 PDT 2016


amehsan added inline comments.

================
Comment at: lib/Transforms/Scalar/IndVarSimplify.cpp:2023
@@ +2022,3 @@
+        }
+        if (Iter->second->HasNUW && !Iter->second->IsSigned) {
+          extended = true;
----------------
amehsan wrote:
> sanjoy wrote:
> > amehsan wrote:
> > > amehsan wrote:
> > > > sanjoy wrote:
> > > > > I think there is a way to make this stateless (though I don't know if SCEV is smart enough to make that work viably for all the cases you care about): you were going to emit a backedge condition of `trunc(IV) == ExitCnt`.  (Please double check this logic) since `sext` and `zext` are one-to-one, this is equivalent to both `zext(trunc(IV)) == zext(ExitCnt)` and `sext(trunc(IV)) == sext(ExitCnt)`.
> > > > > 
> > > > > `zext(trunc(IV)) == zext(ExitCnt)` is profitable if `zext(trunc(IV))` == `IV` (in which case the RHS is `zext(ExitCnt)`) and `sext(trunc(IV)) == sext(ExitCnt)` is profitable if `sext(trunc(IV))` == `IV` (in which case the RHS is `sext(ExitCnt)`).  You can check these equivalences (i.e. `exit(trunc(T))` == `T`) using SCEV.
> > > > > 
> > > > > I think this will be a better solution overall, if it works. :)
> > > > I agree this is better. Will try it to see if it works.
> > > IIUC we cannot make it entirely stateless. If I have an unsigned  i32 IV, sext(trunc(IV)) != IV. (think of when the IV goes above max signed i32).   So at least I need to record the value of IsSigned and then we need the ValueMap. It would have been nice if we could just ask SCEV to do the analysis for us here to decide whether we should trunc iv or extend trip count.
> > > 
> > > Another question that I have about the suggested approach is compile time. Is it possible that we have cases for which SCEV analsysis is expensive and so asking for another SCEV analysis is something that we prefer to avoid?
> > > IIUC we cannot make it entirely stateless. If I have an unsigned i32 IV, sext(trunc(IV)) != IV. (think of when the IV goes above max signed i32). So at least I need to record the value of IsSigned and then we need the ValueMap. It would have been nice if we could just ask SCEV to do the analysis for us here to decide whether we should trunc iv or extend trip count.
> > 
> > I didn't mean to say that we can unconditionally assume `sext(trunc(IV)) == IV` -- we'll have to ask SCEV to re-prove that, like:
> > 
> > ```
> > A = getSCEV(IV);
> > B = getSext(getTrunc(A))
> > if (A == B) then signed;
> > C = getZext(getTrunc(A))
> > if (A == C) then unsigned;
> > ```
> > 
> > The nice thing here is that the optimization becomes orthogonal to the "history" of what indvars did.  Otherwise we end up in odd situations where indvars can simplify a certain pattern if *it* generated it, but can't simplify it if the exact same IR pattern came directly from the user.  Does this make sense?
> > 
> > > Another question that I have about the suggested approach is compile time. Is it possible that we have cases for which SCEV analsysis is expensive and so asking for another SCEV analysis is something that we prefer to avoid?
> > 
> > Yes, there is a potential compile-time issue here.  I don't think it should be material though (and if it is, my first action would be try to make SCEV faster, not avoid doing the more right thing in indvars).
> I noticed this before, but I misinterpreted something. The right solution might be something else. After `simplifyAndExtend` the loop looks exactly as we want.
> 
> ```
> for.body.preheader:                               ; preds = %entry
>   %0 = zext i32 %m to i64
> 
> for.body:                                         ; preds = %for.body.preheader, %for.body
>   %indvars.iv = phi i64 [ 500, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
>   %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
>   %inc = add nuw i32 %i.011, 1
>   %cmp = icmp ult i64 %indvars.iv.next, %0
> 
> ```
> 
> One simple problem is that in `IndVarSimplify::run`  the following call 
> 
> ```
> const SCEV *BackedgeTakenCount = SE->getBackedgeTakenCount(L);
> ```
> 
> is before `simplifyAndExtend`. This order can change, but even when I do that the BackEdgeCount returned by SCEV is 'i32 %m'. What is the reason for that? Maybe we need to make sure the BackEdgeCount is an i64 expression (here %0) and then the trunc will not be generated.
Never mind. That is due to caching.


https://reviews.llvm.org/D23075





More information about the llvm-commits mailing list