[PATCH] D49281: [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.

Sun Aug 19 02:35:33 PDT 2018

dmgreen added inline comments.

================
Comment at: docs/LangRef.rst:5265-5266
 disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
 unroller, plus ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
 too.)

----------------
Meinersbur wrote:
> @dmgreen  This directly contradicts the `nounroll_plus_unroll_and_jam` test case in `Transforms/LoopUnrollAndJam/pragma.ll`
The way this should be working at the moment is:
  If there is any unroll_and_jam metadata
    do that thing             (the user explicitly asked for a thing -> do it)
  if there is any unroll metadata
    disable unrollandjam      (leave the loop to the unroller)
  normal heiristics

I think with "but no ``llvm.loop.unroll_and_jam`` metadata", that is what this is saying. Correct me if I'm wrong and it's not working like this. Or feel free to update it if it's unclear. Or if you think this should work another way...? ;)

I originally invisioned unrollandjam as an extension to the unroll pass, so I sometimes see the two things as interrelated. If a user specifies loop.unroll.disable, they almost certainly wanted to disable all unrolling, not just that in the unroll pass.

================
Comment at: docs/LangRef.rst:5311

+'``llvm.loop.unroll_and_jam.followup_outer``' Metadata
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
----------------
Meinersbur wrote:
> hsaito wrote:
> > Is there an assumption of unroll_and_jam operating only on a double loop and/or a perfect loop? Technically speaking, we can unroll_and_jam a loop if we can legally outerloop-vectorize. So, there can be multiple inner loops. 
> There's still an outermost (unrolled) loop and an innermost (jammed) loop. We could also adds followups for middle loops.
> 
> If it is the naming that concerns you: Would you prefer `followup_unrolled` and `followup_jammed`?
I believe he meant this being unrolled and jammed:
  for i {
    for j
      A(i,j)
    for k
      B(i,k)
  }
This is not something we currently support as I didn't think it would ever be likely to be profitable. Users specifying metadata might change that. The pass could be able to be expanded to work on this (I think), but it's not something that it currently does.

================
Comment at: lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp:194

+  TransformationMode EnableMode = hasUnrollAndJamTransformation(L);
+  if (EnableMode & TM_Disable) {
----------------
Meinersbur wrote:
> Meinersbur wrote:
> > dmgreen wrote:
> > > This code will need rebasing. There is a check earlier that looks for disable metadata that could be replaced by this. Look for HasUnrollDisablePragma/HasUnrollAndJamDisablePragma. If the same was done for unrolling, I think that would remove the need for the IgnoreUser (although your comment about it is probably still true).
> > I'd push towards refectoring-out the common parts of `computeUnrollCount` used by LoopUnroll and LoopUnrollAndJam. Currently `computeUnrollCount` uses lots of settings meant for LoopUnroll (`llvm.loop.unroll.` metadata which should not exist anymore, OptimizationRemarkMissed specific to LoopUnroll, `-unroll-count`, `PartialThreshold`, handling of full unroll, loop peeling that UnrollAndJam does not support, being used in a single call by UnrollAndJam for two different things: determining `ExplicitUnroll` (i.e. is normal unroll is forced) and the unroll-and-jam count). It's hard to understand the subtleties between those codes.
> > 
> > I gave up at some point and added the `IgnoreUser` flag to make test cases pass.
> `IgnoreUser` exists for the `nounroll_plus_unroll_and_jam` in `LoopUnrollAndJam\pragma.ll`.
> 
> `llvm.loop.unroll.disable causes`hasUnrollTransformation` in `computeUnrollCount` to return `TM_Disable`. Unrolling inside `computeUnrollCount` is disabled setting the unroll factor to 0. UnrollAndJam then tries to use the that unroll factor.
> 
> 
> 
> 
Have you considered moving the "disable" check out of computeUnrollCount and into tryUnrollLoop, where the existing "HasUnrollDisablePragma" check is? Hopefully that could be replaced with the new method, much like HasUnrollAndJamDisablePragma has been, and would mean this computeUnrollCount function would just work as it used to.

Repository:
  rL LLVM

https://reviews.llvm.org/D49281