[PATCH] D18202: Enable non-power-of-2 pragma unroll counts

Wed Mar 16 12:57:21 PDT 2016

evstupac added inline comments.

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:736-739
@@ -740,2 +735,6 @@
 
+  // If loop has an unroll count pragma mark loop as unrolled to prevent
+  // unrolling beyond that requested by the pragma.
+  if (HasPragma && PragmaCount != 0)
+    SetLoopAlreadyUnrolled(L);
   return true;
----------------
mzolotukhin wrote:
> Why do we need to move this?
This is better place for the check. Previously "nounroll" attribute was even if we failed to unroll loop. There are plenty of checks inside UnrollLoop that can stop unrolling.
Basically, now we set "nounroll" only when we have unrolled a loop.

================
Comment at: lib/Transforms/Utils/LoopUnrollRuntime.cpp:354-355
@@ +353,4 @@
+    //
+    // If (2) is true, we know that TripCount really is (1 << BEWidth) and so
+    // the number of iterations that remain to be run in the original loop is a
+    // multiple Count == (1 << Log2(Count)) because Log2(Count) <= BEWidth (we
----------------
mzolotukhin wrote:
> This change is unnecessary.
As now the comment is inside "if" it is shifted on 2 spaces. That causes longest string in it to become 81 symbols. So I just make it shorter than 81.

================
Comment at: lib/Transforms/Utils/LoopUnrollRuntime.cpp:365-367
@@ +364,5 @@
+    // At that point ModValAdd could not overflow as ModValTmp < Count
+    ModVal = B.CreateURem(ModValAdd,
+                          ConstantInt::get(BECount->getType(), Count),
+                          "xtraiter");
+    // And finaly we get correct and overflow safe remainder counter
----------------
mzolotukhin wrote:
> What is the final expression for `ModVal`? From the code it looks like
> ```
>   ModVal = ((BECount % Count) + 1) % Count
> ```
> Do we really need double urem here? If so, could you please add a comment explaining why we need exactly this?
Potentially BECount + 1 can unsigned overflow, while (BECount % Count) +1 can not. That is why double URem is used. I can try to simplify this to something like this
```
ModVal = ((Count - ModValAdd) >> (ModVal->getSclarType()->getPrimitiveSizeInBits() - 1) & ModValAdd
``` or to select here or somewhere in further combiners. However non-power-of-2 is not frequent case as for runtime unrolling it comes only from user specified pragma.


Repository:
  rL LLVM

http://reviews.llvm.org/D18202