[PATCH] D42447: [LV][VPlan] Detect outer loops for explicit vectorization.

Thu Apr 5 10:57:15 PDT 2018

dcaballe added a comment.

Thanks for your comments, Florian and Renato!
More comments inline.

Diego.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:8642
+
+     // TODO: It will be set to true when we introduce codegen support.
+     return false;
----------------
fhahn wrote:
> rengolin wrote:
> > hsaito wrote:
> > > dcaballe wrote:
> > > > rengolin wrote:
> > > > > I'm really uncomfortable with all these temporary code blocks that don't do anything...
> > > > > 
> > > > > They're really just hijacking the existing infrastructure without implementing as a VPlan.
> > > > > 
> > > > > I really thought the whole point of VPlans was that you wouldn't need to hack-it-up like we used to in the old vectoriser...
> > > > This is the entrance to the VPlan-native vectorization path. It's not doing anything yet because we are trying to follow an incremental approach by releasing relatively small patches that are easy to digest. This code will be functional (generating vector code) soon.
> > > > 
> > > > The code block is temporary as long as both vectorization paths co-exist but the final goal is to converge into a single one. This approach will allow us to incrementally and easily extend all the current inner loop vectorization functionality to support outer loops and, most importantly, doing so without destabilizing inner loop vectorization. We are really concerned about the latter and we think that this approach is a reasonable trade-off between safety and temporary code blocks.
> > > > 
> > > > If you want to discuss this further, I would recommend to move the discussion to the RFC thread so that everybody is aware of it: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html
> > > > 
> > > I'm working on the "converge into a singe one" side. At this point, I'm taking care of the ground work of moving the right things to the right places such that I don't have to include those "almost NFC" things as part of "expand VPlan's participation into innermost loop vectorization". Thank you for helping me do that with your reviews. We need to be able to build VPlan for the innermost loop vectorization right after Legal, for example, before we can remove the diverged code path at the beginning. In the meantime, the outer loop vectorization patch series will help people realize how much common things are there between innermost loop vectorization and outer loop vectorization, and more importantly, help people think how to write code that can work in both ways. 
> > > That's as much as I want to write about the approach we are taking, within this patch review. The rest of the discussions should happen on the above mentioned RFC. Thanks.
> > Ok, as above, just remove the comments and add a two-line comment summarising it.
> I am also slightly worried that people will come along and see this code and think that cost modelling and planning already works for outer loops, as it is used in the VPlan native path. But I think the comment makes it clear now.
> 
> I am not sure if it would be clearer/nicer to have clearer separation by having the code in separate functions rather than adding even more code to those already huge functions.
> I am not sure if it would be clearer/nicer to have clearer separation by having the code in separate functions rather than adding even more code to those already huge functions.

Agreed, I could move these code to a separate function.

>rather than adding even more code to those already huge functions.

Are you talking about only this function or also some other ones?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1661
+  bool canVectorizeLoopNestCFG(Loop *Lp);
+  /// Helper function to canVectorizeLoopNestCFG that returns true if the
+  /// pre-header, exiting and latch blocks of \p Lp (non-recursive) are
----------------
fhahn wrote:
> Maybe add a newline to separate the 2 functions. Not sure if calling it out as helper function is necessary. In a way, most functions here are helper functions :)
Sounds good! Thanks!

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:5119
+// related to the loop latch because they don't affect the loop uniformity.
+static bool isUniformLoop(Loop *Lp, Loop *OuterLp) {
+  assert(Lp->getLoopLatch() && "Expected loop with a single latch.");
----------------
fhahn wrote:
> Work done here is potentially done multiple times for each loop, right? E.g. for deep loop nests, this will be called multiple times for the same Lp, but with different outer loops. 
> 
> Only a few checks here depend on the outer loop and I think ideally we would not check the same things again and again. For now those redundant checks are quite simple, but I think we should keep that issue in mind once we introduce more complex checks.
Good point. `OuterLp` will be fixed, at least in the short term while we only support explicit vectorization. Given that we are introducing support for divergent inner loops in the patch series #4, it's more likely that we don't need this function (or at least this function as is) before we introduce the engine to evaluate different outer loops.

In any case, the proper inner loop uniformity check will depend on the outer loop we are vectorizing. Some of these "extra" checks are very specific for the patch series #1, where the supported loops are very limited. They will be progressively removed, leaving only the `OuterLp` dependent checks.

For these reasons, IMO, it makes sense to keep all these checks and the documentation together. I think it's easier to understand which inner loops are currently supported. I could add a comment explaining you concerns. However, I could try to split them if you think this is not enough.

Please, let me know what you think.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:5128
+  // 1.
+  PHINode *IV = Lp->getCanonicalInductionVariable();
+  if (!IV) {
----------------
fhahn wrote:
> I think the use of getCanonicalInductionVariable is discouraged. I think it would be better to detect induction variables using SCEV, as done LoopVectorizeLegality.
Could you please elaborate a bit more? Why is it discouraged? I can't find any comments in the source code.

We are trying to introduce some restrictive but simple checks. If the answer is that this interface is discouraged because it may not detect some IVs that that are canonical, that would be perfectly fine. I'm also looking at the `LoopVectorizationLegality::addInductionPhi`. Isn't this function doing something similar to `getCanonicalInductionVariable` to detect the primary induction but using `InductionDescriptor`?

https://reviews.llvm.org/D42447