[PATCH] D43594: [AMDGPU] Respect pragma unroll when loop contains convergent instructions

Thu Feb 22 07:33:04 PST 2018

yaxunl added inline comments.

================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:426
+    /// Allow unrolling convergent loop with remainder.
+    bool AllowRemainderForConvergentLoop;
   };
----------------
efriedma wrote:
> I don't like sticking this here.
> 
> From your description, it sounds like it's a *correctness* property of the target, whether or not certain transforms which duplicate convergent operations are allowed.  In that case, it's not really about unrolling at all; it could apply to other transforms which clone code.  So at the very least, this should be a separate hook, with a clear explanation of exactly which transforms this allows.
For this specific transform (adding a remainder loop when unrolling a loop), we know that it does not cause extra divergence on the amdgcn target. However, that reasoning does not carry over easily to other cases, and so far I do not see how it could be applied to other transformations.
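
For illustration, the shape of the transform under discussion can be sketched in plain C++. This is a minimal sketch, not the LLVM unroller: `convergent_op`, `run_rolled`, and `run_unrolled_with_remainder` are hypothetical names, the unroll factor of 4 is arbitrary, and the sketch runs on a single thread, so it does not model lane divergence or demonstrate the amdgcn property. It only shows that unrolling with a remainder duplicates the call site of the convergent operation while keeping the sequence of executed iterations unchanged.

```cpp
#include <vector>

// Hypothetical stand-in for a convergent operation (for example a barrier
// or a cross-lane operation). Here it only records the iteration index.
static void convergent_op(std::vector<int> &trace, int i) {
  trace.push_back(i);
}

// The original (rolled) loop: one static call site.
std::vector<int> run_rolled(int n) {
  std::vector<int> trace;
  for (int i = 0; i < n; ++i)
    convergent_op(trace, i);
  return trace;
}

// The same loop unrolled by an assumed factor of 4, with a remainder loop
// that handles the trailing n % 4 iterations. The convergent call now
// appears at five static call sites instead of one.
std::vector<int> run_unrolled_with_remainder(int n) {
  std::vector<int> trace;
  int i = 0;
  for (; i + 4 <= n; i += 4) {
    convergent_op(trace, i);
    convergent_op(trace, i + 1);
    convergent_op(trace, i + 2);
    convergent_op(trace, i + 3);
  }
  // Remainder loop: runs fewer than 4 iterations.
  for (; i < n; ++i)
    convergent_op(trace, i);
  return trace;
}
```

Calling both functions with the same `n` yields the same trace, which is the sequential behaviour the unroller preserves. Whether duplicating the convergent call is acceptable on a given target is the separate correctness question raised in the quoted comment.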

https://reviews.llvm.org/D43594