[PATCH] D38948: [LV] Support efficient vectorization of an induction with redundant casts

Wed Nov 22 14:03:16 PST 2017

dorit added a comment.

Hi Silviu,

> Right... I think the SCEV rewriter isn't replacing w_ix.011 with {0, +, step} (not taking into account the equals predicate).

A couple of updates:

1. I think I mixed things up a bit in what I wrote above; the situation is as follows:

> Ideally we would be making the following transformations:
>  (sext i32 (trunc i64 %w_ix.011 to i32) to i64) -> (equals predicate, replace %w_ix)
>  (sext i32 (trunc i64 {0, +, step} to i32) to i64) -> (fold trunc)
>  (sext i32 {0, +, trunc i64 step to i32} to i64) -> (no overflow predicate)
>  {0, +, sext i32 (trunc i64 step to i32)}

Up to here everything works as expected, and the expression returned for PSE.getSCEV(V2) is {0, +, sext i32 (trunc i64 step to i32)}.
But when we then check whether this SCEV is equal to {0, +, step}, the check fails, because my equality check was not looking at the predicates. Once the equality predicates are taken into account, all is well.

BTW, I didn't find a PSCEV utility that checks SCEV expressions for equality while taking equality predicates into account. Did I miss anything? Here we have two AddRecs to compare, so there is a Start1/Start2 pair and a Step1/Step2 pair to compare. In what I wrote, I look for any EqualPredicate whose LHS=Step1 and RHS=Step2, or LHS=Step2 and RHS=Step1, and do the same for the Start expressions. Does that make sense? (It just feels like a lot of work…)
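To make the check described above concrete, here is a toy sketch in Python. It is not LLVM code: expressions are modelled as plain strings, and the names AddRec, equal_under and addrecs_equal are hypothetical, made up for illustration only. It just shows the shape of the lookup (identical, or related by an equality predicate in either direction).

```python
from typing import NamedTuple, Set, Tuple


class AddRec(NamedTuple):
    """Toy stand-in for an add recurrence {start, +, step} (not an LLVM type)."""
    start: str  # symbolic start expression, kept as text
    step: str   # symbolic step expression, kept as text


def equal_under(a: str, b: str, equal_preds: Set[Tuple[str, str]]) -> bool:
    """True if a and b are identical, or an equality predicate relates
    them with either operand order (LHS=a,RHS=b or LHS=b,RHS=a)."""
    return a == b or (a, b) in equal_preds or (b, a) in equal_preds


def addrecs_equal(x: AddRec, y: AddRec, equal_preds: Set[Tuple[str, str]]) -> bool:
    """Compare Start1/Start2 and Step1/Step2 modulo the equality predicates."""
    return (equal_under(x.start, y.start, equal_preds)
            and equal_under(x.step, y.step, equal_preds))


# Hypothetical predicate set: step == sext i32 (trunc i64 step to i32)
preds = {("step", "sext i32 (trunc i64 step to i32)")}
v0 = AddRec("0", "step")
v2 = AddRec("0", "sext i32 (trunc i64 step to i32)")
```

With the predicate set, addrecs_equal(v0, v2, preds) holds; with an empty set it does not, which is the failure I was seeing before the predicates were consulted.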

2. There was another bit of a hurdle in the unsigned version of this test (doit3 in the testcase):

The IR pattern is the following:

V0: %p.09 = phi (0, %add)
V1: %conv = and i32 %p.09, 255
V2: %add = add nsw i32 %conv, %step

And we have:
PSE.getSCEV(V0) = {0,+,%step}
PSE.getSCEV(V2) = {0,+,(sext i8 (trunc i32 %step to i8) to i32)}

And we are not able to deduce equality of the above because the Equal Predicate that we add in createAddRecFromPHIWithCasts() is:
%step == (zext i8 (trunc i32 %step to i8) to i32)

I guess that's a bug in the Predicate creation (?) in createAddRecFromPHIWithCastsImpl():
"AccumExtended = GetExtendedExpr(Accum)" creates the extended expression with zext, whereas the step part should probably be extended using sext, because the overflow check that we add is IncrementNUSW. Does that make sense?

(When I fix that, the entire test passes.)
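To illustrate why the choice of extension matters for the Equal Predicate on the step, here is a small arithmetic sketch in Python. It is not LLVM code; the helper names are hypothetical, and step is modelled as a signed integer value. It only shows that "step == zext(trunc(step))" and "step == sext(trunc(step))" hold for different ranges of step, so the two expressions are not interchangeable.

```python
def trunc_to_i8(x: int) -> int:
    """Keep the low 8 bits, as a raw bit pattern in 0..255."""
    return x & 0xFF


def zext_i8(bits: int) -> int:
    """Zero-extend an 8-bit pattern: the value stays in 0..255."""
    return bits


def sext_i8(bits: int) -> int:
    """Sign-extend an 8-bit pattern: the value lands in -128..127."""
    return bits - 256 if bits >= 128 else bits


def holds_zext(step: int) -> bool:
    """Does step == zext i8 (trunc step to i8) hold?"""
    return step == zext_i8(trunc_to_i8(step))


def holds_sext(step: int) -> bool:
    """Does step == sext i8 (trunc step to i8) hold?"""
    return step == sext_i8(trunc_to_i8(step))


# Sample steps: small positive, negative, and positive above 127.
results = {s: (holds_zext(s), holds_sext(s)) for s in (5, -1, 200)}
```

For a small positive step such as 5 both forms hold; for -1 only the sext form holds; for 200 only the zext form holds.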

Thanks,
Dorit

https://reviews.llvm.org/D38948