[PATCH] D38948: [LV] Support efficient vectorization of an induction with redundant casts
Dorit Nuzman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 22 14:03:16 PST 2017
dorit added a comment.
Hi Silviu,
> Right... I think the SCEV rewriter isn't replacing w_ix.011 with {0, +, step} (not taking into account the equals predicate).
A couple of updates:
1. I think I mixed things up a bit in what I wrote above; the situation is as follows:
> Ideally we would be making the following transformations:
> (sext i32 (trunc i64 %w_ix.011 to i32) to i64) -> (equals predicate, replace %w_ix)
> (sext i32 (trunc i64 {0, +, step} to i32) to i64) -> (fold trunc)
> (sext i32 {0, +, trunc i64 step to i32} to i64) -> (no overflow predicate)
> {0, +, sext i32 (trunc i64 step to i32)}
Up to here everything works as expected, and the expression returned for PSE.getSCEV(V2) is {0, +, sext i32 (trunc i64 step to i32)}.
But now, when we check whether this SCEV is equal to {0, +, step}, the check fails because my equality check was not looking at the predicates. Once the equality predicates are taken into account, all is well.
BTW, I didn't find a PSCEV utility that checks for equality of SCEV expressions taking equality predicates into account. Did I miss anything? Here we have two AddRecs to compare, so there is a Start1,Start2 pair and a Step1,Step2 pair to compare. In what I wrote, I look for any EqualPredicates whose LHS=Step1 and RHS=Step2, or LHS=Step2 and RHS=Step1, and the same for the Start exprs. Does that make sense? (It just feels like a lot of work...)
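To make the comparison described above concrete, here is a minimal self-contained sketch. It does not use the LLVM API: expressions are modeled as plain strings, and the names Expr, EqualPredicate, AddRec, equalUnderPredicates and addRecsEqualUnderPredicates are all hypothetical, introduced only for illustration. It shows the "LHS/RHS in either order" lookup for both the Start and the Step parts.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a SCEV expression (assumption: structural identity
// is modeled as string equality; this is not how LLVM represents SCEVs).
using Expr = std::string;

// Hypothetical model of an equality predicate "first == second".
using EqualPredicate = std::pair<Expr, Expr>;

// Hypothetical model of an affine AddRec {Start, +, Step}.
struct AddRec {
    Expr Start;
    Expr Step;
};

// True if A and B are identical, or if some predicate equates them
// in either orientation (LHS=A,RHS=B or LHS=B,RHS=A).
bool equalUnderPredicates(const Expr &A, const Expr &B,
                          const std::vector<EqualPredicate> &Preds) {
    if (A == B)
        return true;
    for (const EqualPredicate &P : Preds) {
        if ((P.first == A && P.second == B) ||
            (P.first == B && P.second == A))
            return true;
    }
    return false;
}

// Two AddRecs are considered equal if both their Start and Step parts
// are equal under the given predicates.
bool addRecsEqualUnderPredicates(const AddRec &X, const AddRec &Y,
                                 const std::vector<EqualPredicate> &Preds) {
    return equalUnderPredicates(X.Start, Y.Start, Preds) &&
           equalUnderPredicates(X.Step, Y.Step, Preds);
}

// Small usage example mirroring the two AddRecs discussed above.
inline void exampleUsage() {
    std::vector<EqualPredicate> Preds;
    Preds.push_back(EqualPredicate("step", "sext(trunc(step))"));
    AddRec Plain{"0", "step"};
    AddRec Casted{"0", "sext(trunc(step))"};
    assert(addRecsEqualUnderPredicates(Plain, Casted, Preds));
}
```

The sketch only matches a predicate against the whole Start or Step expression; it does not look inside subexpressions or chain several predicates, which a real utility would probably need to consider.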
2. There was another bit of a hurdle in the unsigned version of this test (doit3 in the testcase):
The IR pattern is the following:
V0: %p.09 = phi (0, %add)
V1: %conv = and i32 %p.09, 255
V2: %add = add nsw i32 %conv, %step
And we have:
PSCEV.getScev(V0) = {0,+,%step}
PSCEV.getScev(V2) = {0,+,(sext i8 (trunc i32 %step to i8) to i32)}
And we are not able to deduce equality of the above because the Equal Predicate that we add in createAddRecFromPHIWithCasts() is:
%step == (zext i8 (trunc i32 %step to i8) to i32)
I guess that's a bug in the Predicate creation (?) in createAddRecFromPHIWithCastsImpl():
"AccumExtended = GetExtendedExpr(Accum)" creates the extended expression with zext, whereas probably the step part should be extended using sext because the overflow check that we add is IncrementNUSW…? Does that make sense?
(when I fix that, then the entire test passes).
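As a side illustration of why a predicate written with zext cannot be matched against a step written with sext, the following sketch models the two expressions numerically on 32-bit values. The helper names zextTrunc8 and sextTrunc8 are hypothetical and are not part of LLVM; the sketch only shows where the two forms agree and disagree, and does not by itself settle which extension the predicate should use.

```cpp
#include <cassert>
#include <cstdint>

// Model of (zext i8 (trunc i32 X to i8) to i32): keep the low 8 bits
// and treat them as an unsigned value.
int32_t zextTrunc8(int32_t X) {
    return static_cast<int32_t>(static_cast<uint8_t>(X));
}

// Model of (sext i8 (trunc i32 X to i8) to i32): keep the low 8 bits
// and treat them as a signed value. The uint8_t -> int8_t conversion is
// assumed to wrap (two's complement), which C++20 guarantees and
// mainstream compilers implement for earlier standards as well.
int32_t sextTrunc8(int32_t X) {
    return static_cast<int32_t>(static_cast<int8_t>(static_cast<uint8_t>(X)));
}

// Small usage example: the two forms agree for a step in [0, 127] and
// differ once bit 7 of the truncated value is set.
inline void exampleExtensions() {
    assert(zextTrunc8(5) == sextTrunc8(5));
    assert(zextTrunc8(200) != sextTrunc8(200));
}
```

In this model, "X == zextTrunc8(X)" holds exactly for X in [0, 255], while "X == sextTrunc8(X)" holds exactly for X in [-128, 127], so the two predicates are different conditions on %step and are not structurally interchangeable.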
Thanks,
Dorit
https://reviews.llvm.org/D38948