[PATCH] Use loop unrolling pragma metadata in the loop unroller

Mon Jun 16 01:24:34 PDT 2014

Thanks for your suggestions.  I think I've addressed all of them.

>>! In D4090#5, @pekka.jaaskelainen wrote:
> Great, this feature will be really useful. How about naming these metadatas using the llvm.loop. as a prefix? That is, llvm.loop.unroll.enable etc? Maybe it's worth adding a test case with a negative unroll count to ensure nothing bad happens. Otherwise LGTM.

I like the llvm.loop.unroll. prefix idea.  As mentioned in the clang patch thread, it makes sense to change the loop vectorizer metadata names as well.  I will do that in a followup patch.

I added an assert for the non-positive case because zero and negative unroll counts should never be emitted by clang (it emits an error).  As mentioned in my inline comments nonpositive unroll factors are possible by using a test-only command line opt and via a pass constructor arg, but that doesn't seem to be a significant problem.  Let me know if that's reasonable.

>>! In D4090#8, @reames wrote:
> Just to check, the semantics of the pragmas are best effort right?  They're not *required* to be respected for semantic correctness?  If they were, metadata would be the wrong implementation mechanism here.  

They are just optimization hints and are not required for correctness.

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:181
@@ +180,3 @@
+             "Unroll hint metadata should have two operands.");
+      return dyn_cast<ConstantInt>(MD->getOperand(1));
+    }
----------------
Philip Reames wrote:
> Should this be a cast?  Is it ever expected to fail?
It should always be a ConstantInt.  Changed to cast.

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:190
@@ +189,3 @@
+// or to false if pragma is the disable form.
+static bool HasUnrollEnablePragma(const Loop *L, bool &Enable) {
+  const ConstantInt *EnableValue =
----------------
Philip Reames wrote:
> I find the semantics of this function slightly confusing.  It might make sense to separate two helper functions: one for the enable case and one for the disable case.  They can share implementation if desired.
Looking at it again. It is confusing, indeed :-)  Split into two functions.

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:267
@@ +266,3 @@
+  // constructor parameter) has highest precedence.
+  unsigned Count = UserCount ? CurrentCount : 0;
+
----------------
Philip Reames wrote:
> Can a user specified count be zero or negative?  This seems like one way to disable loop unrolling.  
> 
> This is not a new issue in the code you added, I just happened to notice it here.  
Yeah, the code currently doesn't deal with user-specified values <= 0 well.  The value of zero will essentially be ignored and the usual heuristics will apply.  If the value is negative it gets converted to unsigned which means a very large unrolling factor.  The cl opt is unsigned, so a negative value can only come from the constructor args.  Overall probably not optimal, but this nonpositive value badness is only exercisable via a test-only cl opt and bad args to the pass constructor so probably not a big issue.  Happy to fix it if you'd like though.

Regarding making values <=0 disable unrolling, there are other ways to disable unrolling (cl opts, programmatically, and with pragmas in the code) so it seems unnecessary to me.

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:274
@@ +273,3 @@
+    bool HasEnablePragma = HasUnrollEnablePragma(L, Enable);
+    int PragmaCount;
+    bool HasCountPragma = HasUnrollCountPragma(L, PragmaCount);
----------------
Eli Bendersky wrote:
> Move this closer to use
Done.

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:288
@@ +287,3 @@
+    } else if (HasEnablePragma) {
+      // Loop has unroll(enable) pragma without a unroll_count pragma,
+      // so unroll loop fully if possible.
----------------
Philip Reames wrote:
> Does it make sense to clamp this to something reasonable?  Unrolling a million iteration loop seems like a spectacularly bad idea and probably not actually the users intent.  
Makes sense.  Added a limit of 1024.  I only chose to only enforce this limit for the unroll(enable) case because this would be the most surprising case to the user.  If the user specified unroll_count(1000000), well, then they've knowingly shot themselves in the foot.  Lemme know if you think that's reasonable.  Also added a test for this case.

================
Comment at: test/Transforms/LoopUnroll/unroll-pragmas.ll:48
@@ +47,3 @@
+  %exitcond = icmp eq i64 %indvars.iv.next, 4
+  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !1
+
----------------
Eli Bendersky wrote:
> IMHO it would be more readable if you placed !1 right after this function. I don't think the mdnodes have to be necessarily collected in the bottom. Ditto for the rest
Done.

http://reviews.llvm.org/D4090