[llvm-dev] question about unrolling loops with convergent instructions

Thu Jan 11 08:26:37 PST 2018

I have a loop with convergent instructions with a loop count of 1024. I use pragma to specify unroll count to be 32. However, the loop was unrolled by 512, which results in very long compilation time.

In tryToUnrollLoop, there is

  // If the loop contains a convergent operation, the prelude we'd add
  // to do the first few instructions before we hit the unrolled loop
  // is unsafe -- it adds a control-flow dependency to the convergent
  // operation.  Therefore restrict remainder loop (try unrollig without).
  //
  // TODO: This is quite conservative.  In practice, convergent_op()
  // is likely to be called unconditionally in the loop.  In this
  // case, the program would be ill-formed (on most architectures)
  // unless n were the same on all threads in a thread group.
  // Assuming n is the same on all threads, any kind of unrolling is
  // safe.  But currently llvm's notion of convergence isn't powerful
  // enough to express this.
  if (Convergent)
    UP.AllowRemainder = false;

Later in computeUnrollCount, there is

  // 2nd priority is unroll count set by pragma.
  unsigned PragmaCount = UnrollCountPragmaValue(L);
  if (PragmaCount > 0) {
    UP.Count = PragmaCount;
    UP.Runtime = true;
    UP.AllowExpensiveTripCount = true;
    UP.Force = true;
    if (UP.AllowRemainder &&
        getUnrolledLoopSize(LoopSize, UP) < PragmaUnrollThreshold)
      return true;
  }

Because UP.AllowRemainder is false, the unroll count specified by pragma is ignored. Later on, computeUnrollCount calculates an unroll count of 512.

Is this a bug? Essentially, this disables unroll count specified by pragma for any loops containing convergent operations, even though the unroll count divides the trip count.

Thanks.

Sam

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180111/5b0acb3f/attachment.html>