[PATCH] D43594: [AMDGPU] Respect pragma unroll when loop contains convergent instructions

Thu Feb 22 11:45:12 PST 2018

efriedma added a reviewer: jlebar.
efriedma added inline comments.

================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:426
+    /// Allow unrolling convergent loop with remainder.
+    bool AllowRemainderForConvergentLoop;
   };
----------------
yaxunl wrote:
> efriedma wrote:
> > I don't like sticking this here.
> > 
> > From your description, it sounds like it's a *correctness* property of the target, whether or not certain transforms which duplicate convergent operations are allowed.  In that case, it's not really about unrolling at all; it could apply to other transforms which clone code.  So at the very least, this should be a separate hook, with a clear explanation of exactly which transforms this allows.
> For this specific transform (adding remainder for unrolling loop) we know that it will not cause extra divergence on amdgcn target. However this is something not easily applied to other cases. So far I do not see how it can be applied to other transformations.
The allowed transform can't be specifically "remainder loops produced by lib/Transforms/Utils/LoopUnrollRuntime.cpp in LLVM r324285"; there must be some set of similar transforms which are allowed (whether or not they're currently implemented in the current LLVM codebase).

================
Comment at: test/Transforms/LoopUnroll/convergent.ll:100
+  %exitcond = icmp eq i32 %inc, 4
+  br i1 %exitcond, label %exit, label %l3, !llvm.loop !1
+
----------------
I'm not sure this testcase really demonstrates what you want it to demonstrate... a trip count of 4 is divisible by an unroll count of 2, so you don't need a remainder loop anyway.

================
Comment at: test/Transforms/LoopUnroll/convergent.ll:108
 !0 = !{!0, !{!"llvm.loop.unroll.count", i32 16}}
+!1 = !{!0, !{!"llvm.loop.unroll.count", i32 2}}
----------------
Are you sure this metadata is correct?  It would be nice to have a separate test without the "convergent" marking to show it's making a difference.

https://reviews.llvm.org/D43594