[PATCH] D89566: [LV] Epilogue Vectorization with Optimal Control Flow
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 25 13:34:06 PST 2020
fhahn added inline comments.
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1024
+ /// *epilogue loop* strategy (ie the second pass of vplan execution).
+ BasicBlock *createEpilogueVectorizedLoopSkeleton() final override;
> fhahn wrote:
> > IIUC we specialize LV so we can override just `createEpilogueVectorizedLoopSkeleton` and `emitMinimumVectorEpilogueIterCountCheck` (which is only used by `createEpilogueVectorizedLoopSkeleton`.
> > Could we do a more targeted specialization, e.g. by extracting the skeleton creation code into something like `LoopSekletonCreator` and have a regular and epilogue version of that? Or combine it with `EpilogueLoopVectorizationInfo`?
> Currently only the skeleton code is specialized, but future enhancements will likely necessitate more specialization (eg to support widened induction vars and live-out phis). We could create a `LoopSkeletonCreator`, and modify the `LoopVectorizationPlanner` interfaces to work with it, but I think extending from the `InnerLoopVectorizer` fits better with the current design. Note that `InnerLoopUnroller` also takes a similar approach and extends `InnerLoopVectorizer` while only overriding a couple of functions.
> Currently only the skeleton code is specialized, but future enhancements will likely necessitate more specialization (eg to support widened induction vars and live-out phis).
I thought a bit on how to support those cases. IIUC the problem here is mostly setting up the right incoming values for the PHIs in the epilogue loop. Given that this is related to codegen, I *think* we should be able to represent all the required info in the VPlan, i.e. we should be able to adjust the VPlan of the epilogue vector loop to use the correct start values naturally.
> We could create a LoopSkeletonCreator, and modify the LoopVectorizationPlanner interfaces to work with it, but I think extending from the InnerLoopVectorizer fits better with the current design. Note that InnerLoopUnroller also takes a similar approach and extends InnerLoopVectorizer while only overriding a couple of functions.
My concern here is that this approach invites more modifications in the sub-classes, instead of modeling the required information directly in VPlan. I think this will lead to more work down the road, as we move more pieces into VPlan. I don't think `InnerLoopUnroller` is an ideal example here, because IIRC it was created before VPlan. I think working on modeling the incoming values in VPlan would be better aligned with the long term goals and lead to a more modular solution overall.
To illustrate this approach, I went ahead and tried to sketch support for widened induction using VPlan: D92132. It is a bit rough around the edges, but should give an idea of the required pieces. I'm working on a similar patch for reductions as well and we should be able to handle first-order recurrences in a similar fashion.
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
More information about the llvm-commits