[PATCH] D89566: [LV] Epilogue Vectorization with Optimal Control Flow

Bardia Mahjour via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 3 13:59:09 PST 2020


bmahjour added a comment.

In D89566#2371597 <https://reviews.llvm.org/D89566#2371597>, @dmgreen wrote:

> I have not looked at any of the details here, but a very high level comment is that this isn't very VPlany. If we do want to push things in that direction, then it is at least worth thinking about how this and vplan will co-exist, even if vplan isn't ready for it yet. It would be great to get to a point where all this information is in the vplan and we can compare epilog remainders vs scalar remainders vs whatever else, and come up with a good total cost based on the estimated trip count. Just something to think about.

As it stands right now, VPlan can only model control flow inside a loop. Since epilogue vectorization is concerned with control flow around (and outside) the loop, there isn't much that can be done today to make the transformation "VPlany". I understand the ultimate goal of VPlan is to also model the context around the subject loops (e.g. the entire loop nest or other code surrounding the loops). I wondered whether it would be worth delaying this work until that becomes available, but most of the feedback I received was in the direction of "let's get it done now, and then redo it in VPlan once VPlan is capable of representing the surrounding context".
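
To make "control flow around the loop" concrete, here is a rough source-level sketch of the shape being discussed: a main vector loop, followed by a narrower vector epilogue, followed by a scalar remainder. This is only an illustration and not the output of the patch. The widths (8 and 4), the function name, and the `IterationSplit` struct are all made up for this example, and the real transformation works on IR and also emits minimum-iteration and bypass checks that are omitted here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical vectorization factors, chosen only for illustration.
constexpr std::size_t kMainVF = 8;
constexpr std::size_t kEpilogueVF = 4;

// Records how many times each of the three loops executed.
struct IterationSplit {
  std::size_t mainIters = 0;
  std::size_t epilogueIters = 0;
  std::size_t scalarIters = 0;
};

// Adds 1 to every element, structured the way an epilogue-vectorized loop
// would be: the inner fixed-width loops stand in for vector instructions.
IterationSplit addOneWithVectorEpilogue(std::vector<int> &a) {
  const std::size_t n = a.size();
  std::size_t i = 0;
  IterationSplit split;

  // Main vector loop: processes kMainVF elements per iteration.
  for (; i + kMainVF <= n; i += kMainVF) {
    for (std::size_t lane = 0; lane < kMainVF; ++lane)
      a[i + lane] += 1;
    ++split.mainIters;
  }

  // Vector epilogue: picks up leftovers with a narrower width.
  for (; i + kEpilogueVF <= n; i += kEpilogueVF) {
    for (std::size_t lane = 0; lane < kEpilogueVF; ++lane)
      a[i + lane] += 1;
    ++split.epilogueIters;
  }

  // Scalar remainder: at most kEpilogueVF - 1 elements remain.
  for (; i < n; ++i) {
    a[i] += 1;
    ++split.scalarIters;
  }
  return split;
}
```

The three loops and the branches between them all live outside any single loop body, which is the part VPlan cannot describe today.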

> Also on a completely different note, I presume this could be expanded to handle predicated remainders too? So that an unpredicated loop was given a single predicated remainder iteration, as might be useful to SVE.

Absolutely, that would be a nice follow-on to this work. I think any target that supports predicated vector instructions could benefit, especially if the predicated vector instructions perform better than scalar instructions but not as well as non-predicated vector instructions. If predicated and non-predicated vector instructions have similar throughput and latency, then perhaps tail-folding the main loop would be a better fit.
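
As a rough sketch of what such a predicated remainder could look like at the source level (again purely illustrative and not part of this patch): the main loop runs unpredicated, and whatever is left over is handled by a single masked iteration. The width of 8, the function name, and the per-lane `active` flag standing in for a hardware predicate are assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical vector width, chosen only for illustration.
constexpr std::size_t kVF = 8;

// Multiplies every element by k. Returns the number of lanes that were
// active in the single predicated remainder iteration (0 if none was needed).
std::size_t scaleWithPredicatedRemainder(std::vector<float> &a, float k) {
  const std::size_t n = a.size();
  std::size_t i = 0;

  // Unpredicated main vector loop: every lane is known to be in bounds.
  for (; i + kVF <= n; i += kVF)
    for (std::size_t lane = 0; lane < kVF; ++lane)
      a[i + lane] *= k;

  // Single predicated remainder iteration: a lane does work only if its
  // element index is still in bounds (this models the predicate mask).
  std::size_t activeLanes = 0;
  if (i < n) {
    for (std::size_t lane = 0; lane < kVF; ++lane) {
      const bool active = (i + lane) < n;
      if (active) {
        a[i + lane] *= k;
        ++activeLanes;
      }
    }
  }
  return activeLanes;
}
```

Tail-folding would instead apply the mask inside the main loop on every iteration, which is why it only looks attractive when predicated instructions cost about the same as unpredicated ones.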


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89566/new/

https://reviews.llvm.org/D89566


