<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Loosing optimization for X % C == 0"
href="https://bugs.llvm.org/show_bug.cgi?id=50334">50334</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Loosing optimization for X % C == 0
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>12.0
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Common Code Generator Code
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>cassio.neri@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Since version 9, the code generated for say, y % 400 == 0, uses the modular
inverse optimisation which performs just one multiplication (as opposed to two
as it used to do in (y/400)*400 == y).
However, in a reasonable implementation of is_leap_year the code generated
looses the optimisation and falls back to the old implementation. Here are C++
and corresponding assembly. (Notice that for y % 100 it has no issue in
applying the optimisation.)
auto is_leap_year(unsigned y) {
return y % 100 != 0 ? y % 4 == 0 : y % 400 == 0;
}
is_leap_year(unsigned int): # @is_leap_year(unsigned int)
imull $-1030792151, %edi, %eax # imm = 0xC28F5C29
rorl $2, %eax
cmpl $42949673, %eax # imm = 0x28F5C29
jb .LBB2_2
andl $3, %edi
testl %edi, %edi
sete %al
retq
.LBB2_2:
movl %edi, %eax
imulq $1374389535, %rax, %rax # imm = 0x51EB851F
shrq $39, %rax
imull $400, %eax, %eax # imm = 0x190 # <--- Multiplication by 400
subl %eax, %edi
testl %edi, %edi
sete %al
retq
Worse, when y has type type it doesn't even apply the old Granlund and
Montgomery optimisation and resorts to an idiv instruction. It seems it prefers
avoiding branching (using cmov). I doubt this is a good idea.
auto is_leap_year(int y) {
return y % 100 != 0 ? y % 4 == 0 : y % 400 == 0;
}
is_leap_year(int): # @is_leap_year(int)
movl %edi, %eax
imull $-1030792151, %edi, %ecx # imm = 0xC28F5C29
addl $85899344, %ecx # imm = 0x51EB850
rorl $2, %ecx
cmpl $42949673, %ecx # imm = 0x28F5C29
movl $400, %ecx # imm = 0x190
movl $4, %esi
cmovbl %ecx, %esi
cltd
idivl %esi # <-- Division by %esi which contains either 4 or 400
testl %edx, %edx
sete %al
retq
See the above and other relevant code here:
<a href="https://godbolt.org/z/zdGPTT83q">https://godbolt.org/z/zdGPTT83q</a>
You might also be interested in these (non scientific) benchmarks:
<a href="https://quick-bench.com/q/zQp1vXKKpWFvH0Et7nxG6MHnNfI">https://quick-bench.com/q/zQp1vXKKpWFvH0Et7nxG6MHnNfI</a></pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>