[PATCH] D28975: [LV] Introducing VPlan to model the vectorized code and drive its transformation
Michael Kuperstein via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 26 16:50:10 PST 2017
mkuper added a subscriber: chandlerc.
mkuper added a comment.
Thanks, Ayal, Gil!
Two meta-comments about the review:
1. I suggest we use this review for higher-level points and issues with the design. If there are no significant objections, we can then review the code piece-by-piece, split into the chunks the review comment suggests. This review can then be used as a reference to how those hunks fit together, if the purpose of some of them is unclear, as a stand-alone patch. Does this make sense to everyone?
2. I think this is major enough to give a heads-up to llvm-dev, and give more people opportunity to take a look at the design.
Having said that, I think this is generally the right way forward. The way legal/cost/transformation interact in the current LV wasn't as much designed, as evolved, and not necessarily in the right direction...
Some preliminary comments from a first pass inline. I still need to understand how the recipes are going to work.
================
Comment at: docs/VectorizationPlan.rst:10
+
+- be a *lightweight* NFC patch;
+- show key aspects of VPlan's Hierarchical CFG concept;
----------------
As @chandlerc pointed out to me, this isn't actually NFC. It's expected not to change the output of the vectorizer, but it's a huge change in how the vectorizer *functions*.
================
Comment at: docs/VectorizationPlan.rst:27
+
+ * aligning Cost step with Transformation step,
+ * representing entire code being transformed,
----------------
By this do you mean that making a cost decision will "record" the transformation we expect to make?
================
Comment at: docs/VectorizationPlan.rst:36
+
+Out of scope for initial patch:
+
----------------
This sounds reasonable to me.
I assume this is in sync with what Elena is doing in, say, D27919, right?
================
Comment at: docs/VectorizationPlan.rst:60
+
+2. A LoopVectorizationPlanner can construct, optimize and discard one or more
+ VPlans, providing different ways to vectorize the loop or the loop nest.
----------------
How is "can construct, optimize and discard one or more VPlans" different from just "generates VPlans"?
(I'm trying to understand if this adds information I'm missing. :-) )
================
Comment at: docs/VectorizationPlan.rst:111
+
+1. Legal Step: check if loop can be legally vectorized; encode contraints and
+ artifacts if so. Initiate Vectorization Plan showing how the loop can be
----------------
contraints -> constraints.
Also, I understand what constraints are in this context (e.g. dependence distance), but what do you mean by artifacts?
================
Comment at: docs/VectorizationPlan.rst:131
+
+One can compare with LLVM's existing SLP vectorizer, where TSLP [3]_ adds
+Step 2.b.
----------------
Could you give a more concrete example for what optimizing an LV VPlan would mean?
I haven't read the TSLP paper (although I should), but skimming it, it looks like it's basically constructing a lot of subtrees (each of which could be considered, in our context, its own VPlan) and choose the best among those. So this isn't exactly "optimizing" a plan, it's just recursively generating plans and choosing the cheapest one.
Or should I just go and read the paper properly?
================
Comment at: docs/VectorizationPlan.rst:140
+OpenMP [6]_. This design to extend the Loop Vectorizer to outer loops supports
+and raises the importance of explicit vectorization beyond the current
+capabilities of Clang and LLVM. Namely, from currently forcing the
----------------
I don't believe anything in this design raises the importance of explicit outer loop vectorization.
I think what you're trying to say is that if we want to support explicit outer loop vectorization (which I believe we do), the best way to do it would be using this design. Am I right?
================
Comment at: docs/VectorizationPlan.rst:251
+ vectorized loop which may include multiple basic-blocks and nested loops.
+ b. SLP vectorization.
+ c. Combinations of the above, including nested vectorization: vectorizing
----------------
If this happens, it would be fantastic.
How would this design help support this?
================
Comment at: docs/VectorizationPlan.rst:271
+:VPlan:
+ A recipe for generating a vectorized version from a given IR code.
+ Takes a "scenario-based approach" to vectorization planning.
----------------
So, given the text below, by "a given IR code" you mean "a given IR SESE region", right?
================
Comment at: docs/VectorizationPlan.rst:281
+ basic-blocks as in Sharir [9]_, promoting the Tile encapsulation. We use the
+ terms Region and Block rather than Tile [8]_ to avoid confusion with loop
+ tiling.
----------------
Each Region is still required to be SESE, right?
================
Comment at: docs/VectorizationPlan.rst:287
+ instructions that will appear consecutively in a basic block of the vectorized
+ version. The instructions of such a basic block originate from one or more
+ VPBasicBlocks.
----------------
How can they originate from more than one basic block? Can you be more explicit about the relationship between IR basic blocks and VP basic blocks?
================
Comment at: docs/VectorizationPlan.rst:289
+ VPBasicBlocks.
+ The VPBasicBlock takes care of the control-flow
+ relations with other VPBasicBlock's and Regions.
----------------
Something here is garbled.
================
Comment at: docs/VectorizationPlan.rst:295
+ may happen, e.g., to support disjoint Regions and to ensure Regions have a
+ single exit, possibly an empty one.
+
----------------
What do you mean by supporting disjoint Regions?
================
Comment at: docs/VectorizationPlan.rst:387
+:VPTransformState:
+ Stores information used for code generation, passed from the Planner to its
+ selected VPlan for execution, and used to pass additional information down
----------------
What kind of information?
================
Comment at: docs/VectorizationPlan.rst:485
+ // Select the interleave count.
+ unsigned IC = CM.selectInterleaveCount(OptForSize, VF.Width, VF.Cost);
+
----------------
The long-term plan is to have LVP produce a VPlan that contains both UF and VF, and this is just a transitional state, right?
https://reviews.llvm.org/D28975
More information about the llvm-commits
mailing list