[llvm-dev] Optimizing UREM for (2^n)-1 numbers
Usman Nadeem via llvm-dev
llvm-dev at lists.llvm.org
Tue Oct 12 13:04:14 PDT 2021
Hi all,
https://stackoverflow.com/a/68084577 provides a simple transformation to optimize the calculation.
I don't know about its the correctness but at least alive tv says that the transformation is correct: https://alive2.llvm.org/ce/z/gDn5xP
Godbolt link comparing assembly for Aarch64 and X86 to show the codegen improvements: https://godbolt.org/z/qqb5enezr
I made a change in instcombine (copied below) to see the impact and looked at some cases. For Aarch64 it resulted in either fewer or same number of instructions and in many cases the new sequence of instructions was cheaper.
This looks like a simple transformation with a good impact, any thoughts?
--
--- a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
@@ -1508,6 +1508,18 @@ Instruction *InstCombinerImpl::visitURem(BinaryOperator &I) {
return BinaryOperator::CreateAnd(Op0, Add);
}
+ // X urem Y --> (X + X/Y) urem Y+1 where Y+1 is a power of 2.
+ if (isa<ConstantInt>(Op1)) {
+ Value *PlusOne = Builder.CreateAdd(Op1, ConstantInt::get(Ty, 1));
+ if (isKnownToBeAPowerOfTwo(PlusOne, /*OrZero*/ false, 0, &I)) {
+ auto *Div = Builder.CreateUDiv(Op0, Op1);
+ auto *Add = Builder.CreateAdd(Op0, Div);
+ return BinaryOperator::CreateWithCopiedFlags(I.getOpcode(), Add, PlusOne,
+ &I);
+ }
+ }
+
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211012/7a56362b/attachment.html>
More information about the llvm-dev
mailing list