[llvm-dev] Optimizing UREM for (2^n)-1 numbers

Tue Oct 12 13:04:14 PDT 2021

Hi all,

https://stackoverflow.com/a/68084577 provides a simple transformation to optimize the calculation.
I don't know about its the correctness but at least alive tv says that the transformation is correct: https://alive2.llvm.org/ce/z/gDn5xP

Godbolt link comparing assembly for Aarch64 and X86 to show the codegen improvements: https://godbolt.org/z/qqb5enezr

I made a change in instcombine (copied below) to see the impact and looked at some cases. For Aarch64 it resulted in either fewer or same number of instructions and in many cases the new sequence of instructions was cheaper.

This looks like a simple transformation with a good impact, any thoughts?
--

--- a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
@@ -1508,6 +1508,18 @@ Instruction *InstCombinerImpl::visitURem(BinaryOperator &I) {
     return BinaryOperator::CreateAnd(Op0, Add);
   }

+  // X urem Y --> (X + X/Y) urem Y+1 where Y+1 is a power of 2.
+  if (isa<ConstantInt>(Op1)) {
+    Value *PlusOne = Builder.CreateAdd(Op1, ConstantInt::get(Ty, 1));
+    if (isKnownToBeAPowerOfTwo(PlusOne, /*OrZero*/ false, 0, &I)) {
+      auto *Div = Builder.CreateUDiv(Op0, Op1);
+      auto *Add = Builder.CreateAdd(Op0, Div);
+      return BinaryOperator::CreateWithCopiedFlags(I.getOpcode(), Add, PlusOne,
+                                                   &I);
+    }
+  }
+

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211012/7a56362b/attachment.html>