[llvm-dev] question about unrolling loops with convergent instructions
Liu, Yaxun (Sam) via llvm-dev
llvm-dev at lists.llvm.org
Thu Jan 11 08:26:37 PST 2018
I have a loop with convergent instructions with a loop count of 1024. I use pragma to specify unroll count to be 32. However, the loop was unrolled by 512, which results in very long compilation time.
In tryToUnrollLoop, there is
// If the loop contains a convergent operation, the prelude we'd add
// to do the first few instructions before we hit the unrolled loop
// is unsafe -- it adds a control-flow dependency to the convergent
// operation. Therefore restrict remainder loop (try unrollig without).
//
// TODO: This is quite conservative. In practice, convergent_op()
// is likely to be called unconditionally in the loop. In this
// case, the program would be ill-formed (on most architectures)
// unless n were the same on all threads in a thread group.
// Assuming n is the same on all threads, any kind of unrolling is
// safe. But currently llvm's notion of convergence isn't powerful
// enough to express this.
if (Convergent)
UP.AllowRemainder = false;
Later in computeUnrollCount, there is
// 2nd priority is unroll count set by pragma.
unsigned PragmaCount = UnrollCountPragmaValue(L);
if (PragmaCount > 0) {
UP.Count = PragmaCount;
UP.Runtime = true;
UP.AllowExpensiveTripCount = true;
UP.Force = true;
if (UP.AllowRemainder &&
getUnrolledLoopSize(LoopSize, UP) < PragmaUnrollThreshold)
return true;
}
Because UP.AllowRemainder is false, the unroll count specified by pragma is ignored. Later on, computeUnrollCount calculates an unroll count of 512.
Is this a bug? Essentially, this disables unroll count specified by pragma for any loops containing convergent operations, even though the unroll count divides the trip count.
Thanks.
Sam
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180111/5b0acb3f/attachment.html>
More information about the llvm-dev
mailing list