<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Missed optimization: removal of 'inc' and 'add' instructions"
href="https://bugs.llvm.org/show_bug.cgi?id=44303">44303</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Missed optimization: removal of 'inc' and 'add' instructions
</td>
</tr>
<tr>
<th>Product</th>
<td>clang
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>MacOS X
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>C++
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedclangbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>jhg023@bucknell.edu
</td>
</tr>
<tr>
<th>CC</th>
<td>blitzrakete@gmail.com, dgregor@apple.com, erik.pilkington@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
</td>
</tr></table>
<p>
<div>
<pre>uint64_t test( uint64_t a, uint64_t b, uint64_t n, int norm ) { // Line 1
    uint64_t prod = a * b; // Line 2
    uint64_t r = ( norm - ( prod + 1 ) * n ) + n; // Line 3
    return ( r < n ) ? r : ( r - n ); // Line 4
}
According to Godbolt, Clang compiles the function above to the following
assembly (<a href="https://godbolt.org/z/6AKLMu">https://godbolt.org/z/6AKLMu</a>):
test(unsigned long, unsigned long, unsigned long, int):
# @test(unsigned long, unsigned long, unsigned long, int)
imul rdi, rsi
movsxd rax, ecx
inc rdi
imul rdi, rdx
sub rax, rdi
add rdx, rax
cmovb rax, rdx
ret
Because unsigned arithmetic wraps modulo 2^64, ( prod + 1 ) * n equals
prod * n + n, so the trailing '+ n' on line 3 cancels and the line is
equivalent to the following snippet:
uint64_t r = ( norm - prod * n );
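The equivalence can be spot-checked at compile time. This is only a sketch:
the helper names line3_original and line3_simplified are hypothetical, and
u64 is an assumed stand-in for uint64_t so that no headers are needed.

```cpp
// Assumption: unsigned long long is 64 bits wide, like uint64_t.
using u64 = unsigned long long;
static_assert(sizeof(u64) == 8, "u64 is assumed to be 64 bits wide");

// Line 3 as written in the original function (hypothetical helper).
constexpr u64 line3_original(u64 a, u64 b, u64 n, int norm) {
    u64 prod = a * b;
    return (norm - (prod + 1) * n) + n;
}

// Line 3 after cancelling the two n terms (hypothetical helper).
constexpr u64 line3_simplified(u64 a, u64 b, u64 n, int norm) {
    u64 prod = a * b;
    return norm - prod * n;
}

// Unsigned wraparound is well defined, so both forms agree even when
// the intermediate products overflow 64 bits.
static_assert(line3_original(3, 5, 7, 2) == line3_simplified(3, 5, 7, 2),
              "small inputs");
static_assert(line3_original(~0ULL, ~0ULL, ~0ULL, -1) ==
                  line3_simplified(~0ULL, ~0ULL, ~0ULL, -1),
              "wrapping inputs");
```

A handful of inputs is not a proof, but the identity follows from the
distributive law in arithmetic modulo 2^64.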
With this change, I expected the 'inc' and 'add' instructions to simply
disappear from the generated assembly. Instead, Clang emits a longer
sequence, eight instructions before 'ret' rather than seven
(<a href="https://godbolt.org/z/LkV4_D">https://godbolt.org/z/LkV4_D</a>):
test(unsigned long, unsigned long, unsigned long, int):
# @test(unsigned long, unsigned long, unsigned long, int)
imul rdi, rsi
movsxd rax, ecx
imul rdi, rdx
sub rax, rdi
xor ecx, ecx
cmp rax, rdx
cmovae rcx, rdx
sub rax, rcx
ret
Is the longer output expected, or is this a missed optimization?</pre>
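<pre>One possible reading of the first listing, offered as a sketch rather than
a statement about how the optimizer reasons: after the 'sub', rax already
holds r - n, and the 'add' produces r together with a carry flag that is
set exactly when r < n. The helper names below are hypothetical, and u64
is an assumed stand-in for uint64_t.

```cpp
// Assumption: unsigned long long is 64 bits wide, like uint64_t.
using u64 = unsigned long long;
static_assert(sizeof(u64) == 8, "u64 is assumed to be 64 bits wide");

// Source-level semantics of the function, using the simplified line 3.
constexpr u64 test_reference(u64 a, u64 b, u64 n, int norm) {
    u64 r = norm - (a * b) * n;
    return (r < n) ? r : (r - n);
}

// Model of the first listing. t = norm - (prod + 1) * n equals r - n.
// Adding n back yields r, and the addition wraps (carry set) exactly
// when r < n, so the flag from 'add' replaces a separate 'cmp'.
constexpr u64 model_first_listing(u64 a, u64 b, u64 n, int norm) {
    u64 t = norm - (a * b + 1) * n;  // imul, movsxd, inc, imul, sub
    u64 r = n + t;                   // add rdx, rax
    bool carry = r < t;              // carry flag produced by the add
    return carry ? r : t;            // cmovb rax, rdx
}

static_assert(model_first_listing(3, 5, 7, 2) == test_reference(3, 5, 7, 2),
              "case where r is not below n");
static_assert(model_first_listing(0, 9, 7, 2) == test_reference(0, 9, 7, 2),
              "case where r is below n");
```

Under this reading, the simplified source no longer computes r - n as a
side effect, which may be why the second listing needs an explicit 'cmp'
and a second 'sub'.
</pre>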
</div>
</p>
</body>
</html>