[PATCH] D41953: [LoopUnroll] Unroll and Jam

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 20 04:11:31 PDT 2018


dmgreen marked 5 inline comments as done.
dmgreen added a comment.

Thanks for the comments! I'll try to split a few things out and update this accordingly



================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:427
+    bool UnrollAndJam;
+    /// Threshold for unroll and jam, for inner loop size
+    unsigned UnrollAndJamInnerLoopThreshold;
----------------
SjoerdMeijer wrote:
> SjoerdMeijer wrote:
> > Can you be a bit more explicit here what the threshold exactly is?
> typo: trashold :)
trashold?


================
Comment at: lib/Analysis/DependenceAnalysis.cpp:110
 static cl::opt<bool>
-Delinearize("da-delinearize", cl::init(false), cl::Hidden, cl::ZeroOrMore,
+Delinearize("da-delinearize", cl::init(true), cl::Hidden, cl::ZeroOrMore,
             cl::desc("Try to delinearize array references."));
----------------
SjoerdMeijer wrote:
> Will other people get unhappy when we flip this switch?
Oh yep, this should probably be a separate change.

I don't think people will be unhappy, but it shouldn't be lumped in here.


================
Comment at: lib/Target/ARM/ARMTargetTransformInfo.cpp:626
+  UP.UnrollAndJam = true;
+  UP.UnrollAndJamInnerLoopThreshold = 60;
 
----------------
SjoerdMeijer wrote:
> Some comments perhaps why it is set to 60?
The limit here is set because the inner loop requires a smaller threshold than the outer, as unroll and jam can increase register pressure in the inner loop significantly. This was chosen semi-randomly to be a something around 20% of the normal threshold, to something that seemed to work in a few cases I tried. More performance testing may be needed to get something better.


================
Comment at: lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp:305
+  }
+  if (NumInlineCandidates != 0) {
+    DEBUG(dbgs() << "  Not unrolling loop with inlinable calls.\n");
----------------
SjoerdMeijer wrote:
> nit: don't need the "!= 0"
This is copied from loop unroll. I think it's clearer this way what it's testing. But I can change it if you are highly opinionated :)


================
Comment at: lib/Transforms/Utils/LoopUnrollAndJam.cpp:121
+
+/*
+  This method performs Unroll and Jam. For a simple loop like:
----------------
SjoerdMeijer wrote:
> Some more nitpicking here, about names.
> 
> Perhaps "Fore" can be confusing: is it another loop, a block? So something like this:
> 
>   for (i)
>     block[i, 0]
>     for (j)
>       block[j, 0]
>     end
>     block[i, 1]
>   end
>       
> becomes:
> 
>   for (i, i+=2)
>     block[i, 0]
>     block[i+1, 0]
>     for (j, j+=4)
>       block[j+0, 0]
>       block[j+1, 0]
>       block[j+2, 0]
>       block[j+3, 0]
>     end
>     block[i, 1]
>     block[i+1, 1]
>   end
> 
> Mentioning that the outer loop has been unrolled with a factor of 2, and the inner loop with an unroll factor of 4.
> 
> Not sure if the classic loopunroll and jam literature uses standard terminology here, but perhaps inner/outer loop rather than "Fore, Subloop and Aft".
> 
> Also, perhaps you want to mention some restrictions somewhere, if there are any. E.g., can this doubly nested loop occur in more deeply nested loop structure (and loopunroll and jam still trigger)?
I'm not sure about your loop here. It looks like it has inner loop unrolling, which UnJ doesn't do (that is left to the unroller after the outer loop has been unroll and jammed.

By "Fore" I was referring to blocks in the outer loop, but before the inner loop. "Aft" means blocks in the outer loop but after the inner loop. I made up the names, so am open to suggestions on better ones ;)

> Also, perhaps you want to mention some restrictions somewhere, if there are any. E.g., can this doubly nested loop occur in more deeply nested loop structure (and loopunroll and jam still trigger)?

That should work I believe. If it doesn't it's a bug :) But I doubt it will ever be a profitable thing to do. It used to be disabled in the profitability check, but we may have lost that when I was moving things over the the new pass. I'll have a look.

There are restrictions mentioned in the safety checks, in isSafeToUnrollAndJam.


================
Comment at: lib/Transforms/Utils/LoopUnrollAndJam.cpp:139
+
+  So the blocks are essentially outer unrolled and then fused ("jammed")
+  together into a single inner loop. This can increase speed when there are
----------------
SjoerdMeijer wrote:
> Blocks = outer loops?
> 
> So perhaps something along the lines of:
> 
> "So the outer loops are unrolled and the inner loops fused ("jammed")."
I have tried to rewrite this a little.


================
Comment at: lib/Transforms/Utils/LoopUnrollAndJam.cpp:196
+  if (CompletelyUnroll) {
+    DEBUG(dbgs() << "COMPLETELY UNROLL AND JAMMING loop %" << Header->getName()
+                 << " with trip count " << TripCount << "!\n");
----------------
SjoerdMeijer wrote:
> nit: don't need to capitalize this message?
This is copied from loop unroll


================
Comment at: lib/Transforms/Utils/LoopUnrollAndJam.cpp:210
+
+    DEBUG(dbgs() << "UNROLL AND JAMMING loop %" << Header->getName() << " by "
+                 << Count);
----------------
SjoerdMeijer wrote:
> same here?
Same


https://reviews.llvm.org/D41953





More information about the llvm-commits mailing list