[llvm-bugs] [Bug 50334] New: Losing optimization for X % C == 0

via llvm-bugs llvm-bugs at lists.llvm.org
Thu May 13 14:52:42 PDT 2021


https://bugs.llvm.org/show_bug.cgi?id=50334

            Bug ID: 50334
           Summary: Losing optimization for X % C == 0
           Product: libraries
           Version: 12.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Common Code Generator Code
          Assignee: unassignedbugs at nondot.org
          Reporter: cassio.neri at gmail.com
                CC: llvm-bugs at lists.llvm.org

Since version 9, the code generated for, say, y % 400 == 0 uses the modular
inverse optimisation, which performs just one multiplication (as opposed to
the two it used to perform in (y/400)*400 == y).
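
As a rough illustration of the modular inverse test (a sketch, not LLVM's
actual implementation): write the divisor as 2^k * m with m odd, multiply by
the inverse of m modulo 2^32, rotate right by k bits and compare against
floor((2^32 - 1) / divisor). The helper names below are hypothetical; the
constant 0xC28F5C29 is the inverse of 25 that also appears in the assembly
further down.

```cpp
#include <cstdint>

// Rotate a 32-bit value right by k bits (0 < k < 32).
constexpr std::uint32_t rotr32(std::uint32_t x, unsigned k) {
  return (x >> k) | (x << (32 - k));
}

// 0xC28F5C29 is the multiplicative inverse of 25 modulo 2^32.
constexpr std::uint32_t inv25 = 0xC28F5C29u;
static_assert(static_cast<std::uint32_t>(25u * inv25) == 1u, "inverse of 25");

// y % 100 == 0, with 100 = 2^2 * 25: one multiply, one rotate, one compare.
constexpr bool divisible_by_100(std::uint32_t y) {
  return rotr32(y * inv25, 2) <= UINT32_MAX / 100;
}

// y % 400 == 0, with 400 = 2^4 * 25: same inverse, rotate by 4 instead.
constexpr bool divisible_by_400(std::uint32_t y) {
  return rotr32(y * inv25, 4) <= UINT32_MAX / 400;
}
```

If y is not a multiple of 2^k, the rotation moves a non-zero low bit into the
high bits, so the comparison fails without a separate test.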

However, in a reasonable implementation of is_leap_year, the generated code
loses the optimisation and falls back to the old implementation. Here are the
C++ source and the corresponding assembly. (Notice that the optimisation is
still applied to y % 100.)

auto is_leap_year(unsigned y) {
  return y % 100 != 0 ? y % 4 == 0 : y % 400 == 0;
}

is_leap_year(unsigned int): # @is_leap_year(unsigned int)
  imull $-1030792151, %edi, %eax # imm = 0xC28F5C29
  rorl $2, %eax
  cmpl $42949673, %eax # imm = 0x28F5C29
  jb .LBB2_2
  andl $3, %edi
  testl %edi, %edi
  sete %al
  retq
.LBB2_2:
  movl %edi, %eax
  imulq $1374389535, %rax, %rax # imm = 0x51EB851F
  shrq $39, %rax
  imull $400, %eax, %eax # imm = 0x190 # <--- Multiplication by 400
  subl %eax, %edi
  testl %edi, %edi
  sete %al
  retq
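
For comparison, the fallback path at .LBB2_2 corresponds roughly to the
following C++ (a hand-written reading of the assembly above, with a
hypothetical function name): the quotient y / 400 is obtained by a multiply
and a shift, and a second multiplication is needed to recover the remainder.

```cpp
#include <cstdint>

// Mirrors the .LBB2_2 path: remainder via quotient, two multiplications.
constexpr bool divisible_by_400_via_quotient(std::uint32_t y) {
  // imulq $0x51EB851F, shrq $39: q == y / 400 for every 32-bit y.
  const std::uint32_t q =
      static_cast<std::uint32_t>((std::uint64_t{y} * 0x51EB851Fu) >> 39);
  // imull $400, subl: r == y - q * 400 == y % 400.
  const std::uint32_t r = y - q * 400u;
  return r == 0;
}
```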

Worse, when y has type int it doesn't even apply the old Granlund and
Montgomery optimisation and resorts to an idiv instruction. It seems to prefer
avoiding branching (using cmov). I doubt this is a good idea.

auto is_leap_year(int y) {
  return y % 100 != 0 ? y % 4 == 0 : y % 400 == 0;
}

is_leap_year(int): # @is_leap_year(int)
  movl %edi, %eax
  imull $-1030792151, %edi, %ecx # imm = 0xC28F5C29
  addl $85899344, %ecx # imm = 0x51EB850
  rorl $2, %ecx
  cmpl $42949673, %ecx # imm = 0x28F5C29
  movl $400, %ecx # imm = 0x190
  movl $4, %esi
  cmovbl %ecx, %esi
  cltd
  idivl %esi # <-- Division by %esi which contains either 4 or 400
  testl %edx, %edx
  sete %al
  retq
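
For the signed case, the y % 100 test above adds an offset (0x51EB850) before
rotating. Below is a sketch of how the same pattern might extend to y % 400,
so that no idiv is needed. The helper names are hypothetical, and the offset
and limit for 400 are worked out by analogy with the constants in the listing,
not taken from compiler output.

```cpp
#include <cstdint>

// Rotate a 32-bit value right by k bits (0 < k < 32).
constexpr std::uint32_t rotate_right(std::uint32_t x, unsigned k) {
  return (x >> k) | (x << (32 - k));
}

// Signed y % 400 == 0, with 400 = 2^4 * 25.
constexpr bool signed_divisible_by_400(std::int32_t y) {
  constexpr std::uint32_t inv25 = 0xC28F5C29u;  // inverse of 25 mod 2^32
  // Assumed offset: floor(INT32_MAX / 25) with the low 4 bits cleared.
  constexpr std::uint32_t offset = (INT32_MAX / 25) & ~UINT32_C(15);
  // After the rotation, multiples of 400 land in [0, 2 * offset / 16].
  constexpr std::uint32_t limit = 2 * offset / 16;
  return rotate_right(static_cast<std::uint32_t>(y) * inv25 + offset, 4) <=
         limit;
}
```

With these values the offset equals the 0x51EB850 used for y % 100 in the
listing; only the rotation count and the limit differ.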

See the above and other relevant code here:
https://godbolt.org/z/zdGPTT83q

You might also be interested in these (non-scientific) benchmarks:
https://quick-bench.com/q/zQp1vXKKpWFvH0Et7nxG6MHnNfI
