[PATCH] D38948: [LV] Support efficient vectorization of an induction with redundant casts

Tue Nov 28 01:55:06 PST 2017

dorit updated this revision to Diff 124530.
dorit added a comment.

Hi Silviu,

The new version I uploaded has two main changes:

1. It fixes (what I think may be) a bug in createAddRecFromPHIWithCastsImpl(), where we add the equality predicate for the unsigned case: The IncrementNUSW overflow predicate that we add in this function allows the rewriter to rewrite this:

(zext i8 {0, +, (trunc i32 %step to i8)} to i32)
into
{0, +, (sext i8 (trunc i32 %step to i8) to i32)}
But the Equal predicate that we add is:
%step == (zext i8 (trunc i32 %step to i8) to i32).
So the fix changes the Equal predicate to:
%step == (sext i8 (trunc i32 %step to i8) to i32)
(even for the unsigned case).
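
To illustrate the point above on concrete values, here is a minimal sketch that models the casts with plain integers. It is not LLVM code: the helper names (truncToI8, zextI8ToI32, sextI8ToI32, narrowIVThenZext, wideIV) are hypothetical and exist only for this example, and i8/i32 are modelled with uint8_t/uint32_t. It assumes IncrementNUSW means the narrow induction does not wrap when its increment is treated as signed.

```cpp
#include <cstdint>

// Hypothetical helpers (not LLVM APIs) modelling the IR casts on values.
inline uint8_t truncToI8(uint32_t V) { return static_cast<uint8_t>(V); }
inline uint32_t zextI8ToI32(uint8_t V) { return V; }
inline uint32_t sextI8ToI32(uint8_t V) {
  // Replicate the sign bit (bit 7) into the upper 24 bits.
  return (V & 0x80u) ? (0xFFFFFF00u | V) : V;
}

// Value of (zext i8 {Start, +, (trunc i32 Step to i8)} to i32) at iteration N:
// the induction is computed in 8 bits and only then widened.
inline uint32_t narrowIVThenZext(uint8_t Start, uint32_t Step, unsigned N) {
  uint8_t IV = Start;
  for (unsigned I = 0; I < N; ++I)
    IV = static_cast<uint8_t>(IV + truncToI8(Step));
  return zextI8ToI32(IV);
}

// Value of {zext(Start), +, WideStep} at iteration N, computed in 32 bits.
inline uint32_t wideIV(uint8_t Start, uint32_t WideStep, unsigned N) {
  return zextI8ToI32(Start) + WideStep * N;
}
```

With %step = -1 (0xFFFFFFFF) and a start of 10, the narrow induction counts 10, 9, 8, ... without wrapping in the NUSW sense. The wide recurrence only matches it when the step is sign-extended, and only the sext form of the Equal predicate holds for this %step; the zext form would compare 255 against -1.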

2. It changes the search for the IR cast-sequence in the spirit of what you proposed: It is moved to LoopUtils.cpp, and it relies on the SCEV of an instruction being equal (***) to the SCEV of the phi.

I think it may not be entirely as general as you may have envisioned, but generalizing the implementation even further comes with some cost (complexity), which I am not sure the current limited support justifies. Even the other SCEV patterns that I've seen (which I listed under the TODO of createAddRecFromPHIWithCastsImpl()) would be covered by the current implementation, so I wouldn't like to over-generalize at this point. In any case, this implementation is much more easily extendable than the previous pattern-based approach.
I hope this is close enough to what you had in mind…

(***) For the SCEV equality, I added the "areAddRecsEqualWithPreds()" utility, which considers the Equal predicates, because the rewriter did not rewrite this:
{0, +, (sext i8 (trunc i32 %step to i8) to i32)}
into
{0, +, %step}.
We could instead extend the rewriter: in the "Sext/Zext" case it would have to check if the SCEV expr at hand conforms to the pattern (ext ix (trunc iy %step to ix) to iy),
and if so, look for any equality predicates of the form:
(ext ix (trunc iy %step to ix) to iy) == %step
(basically to call "areAddRecsEqualWithPreds()" there, instead of in the LoopUtils utility).
Do you think we should do that?
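
The idea of comparing two AddRecs modulo the Equal predicates can be sketched with a toy model. This is not the patch's code and not the real signature of areAddRecsEqualWithPreds(): ToyAddRec, ToyEqualPred, toyExprsEqual and toyAddRecsEqualWithPreds are hypothetical names, and SCEV expressions are represented as plain strings purely for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins (assumptions, not LLVM types): an AddRec {Start, +, Step}
// and an Equal predicate "LHS == RHS", both over printed expressions.
struct ToyAddRec {
  std::string Start;
  std::string Step;
};
using ToyEqualPred = std::pair<std::string, std::string>;

// Two expressions match if they are identical, or if some Equal predicate
// states that they are equal (checked in either direction).
inline bool toyExprsEqual(const std::string &A, const std::string &B,
                          const std::vector<ToyEqualPred> &Preds) {
  if (A == B)
    return true;
  for (const ToyEqualPred &P : Preds)
    if ((P.first == A && P.second == B) || (P.first == B && P.second == A))
      return true;
  return false;
}

// Two AddRecs match if both their starts and their steps match.
inline bool toyAddRecsEqualWithPreds(const ToyAddRec &X, const ToyAddRec &Y,
                                     const std::vector<ToyEqualPred> &Preds) {
  return toyExprsEqual(X.Start, Y.Start, Preds) &&
         toyExprsEqual(X.Step, Y.Step, Preds);
}
```

In this model, {0, +, (sext i8 (trunc i32 %step to i8) to i32)} and {0, +, %step} compare equal only once the predicate %step == (sext i8 (trunc i32 %step to i8) to i32) is available, which is the situation described above.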

Many thanks,
Dorit

https://reviews.llvm.org/D38948

Files:
  include/llvm/Analysis/ScalarEvolution.h
  include/llvm/Transforms/Utils/LoopUtils.h
  lib/Analysis/ScalarEvolution.cpp
  lib/Transforms/Utils/LoopUtils.cpp
  lib/Transforms/Vectorize/LoopVectorize.cpp
  test/Transforms/LoopVectorize/vect-phiscev-sext-trunc.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D38948.124530.patch
Type: text/x-patch
Size: 32924 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171128/627e38ae/attachment.bin>