[PATCH] D49281: [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.

Mon Aug 20 10:16:21 PDT 2018

Meinersbur added inline comments.

================
Comment at: docs/LangRef.rst:5265-5266
 disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
 unroller, plus ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
 too.)

----------------
dmgreen wrote:
> Meinersbur wrote:
> > @dmgreen  This directly contradicts the `nounroll_plus_unroll_and_jam` test case in `Transforms/LoopUnrollAndJam/pragma.ll`
> The way this should be working at the moment is:
>   If there is any unroll_and_jam metadata
>     do that thing             (the user explicitly asked for a thing -> do it)
>   if there is any unroll metadata
>     disable unrollandjam      (leave the loop to the unroller)
>   normal heiristics
> 
> I think with "but no ``llvm.loop.unroll_and_jam`` metadata", that is what this is saying. Correct me if I'm wrong and it's not working like this. Or feel free to update it if it's unclear. Or if you think this should work another way...? ;)
> 
> I originally invisioned unrollandjam as an extension to the unroll pass, so I sometimes see the two things as interrelated. If a user specifies loop.unroll.disable, they almost certainly wanted to disable all unrolling, not just that in the unroll pass.
```
If there is any unroll_and_jam metadata
  do that thing             (the user explicitly asked for a thing -> do it)
```
There is a comment in LoopUnrollAndJamPass.cpp:
```
  // We have already checked that the loop has no unroll.* pragmas.
```
According to this, this is not true (since only checked afterwards) and `computeUnrollCount` will consider e.g. `llvm.loop.unroll.count` even when used for unroll-and-jam.

I was concerned about the last phrase `plus llvm.loop.unroll.disable metadata will disable unroll and jam too.`, but it might be a misinterpretation in that it will disable unroll-and-jam, but only if unroll-and-jam is not explicitly enabled.

================
Comment at: docs/LangRef.rst:5311

+'``llvm.loop.unroll_and_jam.followup_outer``' Metadata
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
----------------
dmgreen wrote:
> Meinersbur wrote:
> > hsaito wrote:
> > > Is there an assumption of unroll_and_jam operating only on a double loop and/or a perfect loop? Technically speaking, we can unroll_and_jam a loop if we can legally outerloop-vectorize. So, there can be multiple inner loops. 
> > There's still an outermost (unrolled) loop and an innermost (jammed) loop. We could also adds followups for middle loops.
> > 
> > If it is the naming that concerns you: Would you prefer `followup_unrolled` and `followup_jammed`?
> I believe he meant this being unrolled and jammed:
>   for i {
>     for j
>       A(i,j)
>     for k
>       B(i,k)
>   }
> This is not something we currently support as I didn't think it would ever be likely to be profitable. Users specifying metadata might change that. The pass could be able to be expanded to work on this (I think), but it's not something that it currently does.
The j- and k-loops are both inner loops, so `followup_inner` should apply to both of them.
distinguishing them might be possible when introducing a mechanism like for naming the output loops of loop distribution.

================
Comment at: lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp:194

+  TransformationMode EnableMode = hasUnrollAndJamTransformation(L);
+  if (EnableMode & TM_Disable) {
----------------
dmgreen wrote:
> Meinersbur wrote:
> > Meinersbur wrote:
> > > dmgreen wrote:
> > > > This code will need rebasing. There is a check earlier that looks for disable metadata that could be replaced by this. Look for HasUnrollDisablePragma/HasUnrollAndJamDisablePragma. If the same was done for unrolling, I think that would remove the need for the IgnoreUser (although your comment about it is probably still true).
> > > I'd push towards refectoring-out the common parts of `computeUnrollCount` used by LoopUnroll and LoopUnrollAndJam. Currently `computeUnrollCount` uses lots of settings meant for LoopUnroll (`llvm.loop.unroll.` metadata which should not exist anymore, OptimizationRemarkMissed specific to LoopUnroll, `-unroll-count`, `PartialThreshold`, handling of full unroll, loop peeling that UnrollAndJam does not support, being used in a single call by UnrollAndJam for two different things: determining `ExplicitUnroll` (i.e. is normal unroll is forced) and the unroll-and-jam count). It's hard to understand the subtleties between those codes.
> > > 
> > > I gave up at some point and added the `IgnoreUser` flag to make test cases pass.
> > `IgnoreUser` exists for the `nounroll_plus_unroll_and_jam` in `LoopUnrollAndJam\pragma.ll`.
> > 
> > `llvm.loop.unroll.disable causes`hasUnrollTransformation` in `computeUnrollCount` to return `TM_Disable`. Unrolling inside `computeUnrollCount` is disabled setting the unroll factor to 0. UnrollAndJam then tries to use the that unroll factor.
> > 
> > 
> > 
> > 
> Have you considered moving the "disable" check out of computeUnrollCount and into tryUnrollLoop, where the existing "HasUnrollDisablePragma" check is? Hopefully that could be replaced with the new method, much like HasUnrollAndJamDisablePragma has been, and would mean this computeUnrollCount function would just work as it used to.
There are multiple mechanisms in `computeUnrollCount` that disable unrolling (such as `UnrollCountPragmaValue` returning zero).

If I was to fix this method, I'd do it cleanly by refactoring-out the mechanism that computes the unroll factor when pragmas/options are absent (and not emit any LoopUnroll-specific diagnostics).

Repository:
  rL LLVM

https://reviews.llvm.org/D49281