<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [x86] Silly code generation for _addcarry_u32/_addcarry_u64"
href="https://llvm.org/bugs/show_bug.cgi?id=24545">24545</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>[x86] Silly code generation for _addcarry_u32/_addcarry_u64
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>3.7
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>myriachan@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>The x86 intrinsics _addcarry_u32 and _addcarry_u64 generate silly code. For
example, take the following function, which does a 128-bit addition on two
64-bit halves (the XOR is only there to make the output clearer):
u64 testcarry(u64 a, u64 b, u64 c, u64 d)
{
    u64 result0, result1;
    _addcarry_u64(_addcarry_u64(0, a, c, &result0), b, d, &result1);
    return result0 ^ result1;
}
This is the code generated with -O1, -O2 and -O3:
    xor eax, eax
    add al, -1
    adc rdi, rdx
    mov qword ptr [rsp - 8], rdi
    setb al
    add al, -1
    adc rsi, rcx
    mov qword ptr [rsp - 16], rsi
    xor rsi, qword ptr [rsp - 8]
    mov rax, rsi
    ret
The first silliness is that _addcarry_u64 does not optimize the case where the
carry-in parameter is a compile-time constant 0. Instead of setting up a carry
with "add al, -1" and then using "adc", it should just use "add".
The second silliness is the use of al to hold the carry flag ("setb al"),
followed by "add al, -1" to put it back into the carry flag. The "mov" between
the two "adc" instructions does not touch the flags, so the round trip is
unnecessary.
The third silliness is that _addcarry_u64 is taking its pointer parameter too
literally; the results should stay in registers and never be stored to memory.
Instead, the code should be something like this:
    add rdx, rdi
    mov rax, rdx
    adc rcx, rsi
    xor rax, rcx
    ret
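For reference, here is a rough portable C model of what that add/adc pair is
supposed to compute. It is only a sketch: add_with_carry and testcarry_model
are names made up for illustration (they are not part of any intrinsic header),
and u64 is assumed to be unsigned long long.

```c
typedef unsigned long long u64;

/* Hypothetical stand-in for _addcarry_u64: writes a + b + carry_in
   (mod 2^64) through out and returns the carry-out (0 or 1). */
static unsigned char add_with_carry(unsigned char carry_in, u64 a, u64 b, u64 *out)
{
    u64 partial = a + b;                  /* may wrap */
    u64 sum = partial + (carry_in != 0);  /* may wrap, but never if a + b did */
    *out = sum;
    /* A carry occurred if either addition wrapped around. */
    return (unsigned char)((partial < a) | (sum < partial));
}

u64 testcarry_model(u64 a, u64 b, u64 c, u64 d)
{
    u64 result0, result1;
    /* Low halves: the carry-in is a constant 0, so this is a plain "add". */
    unsigned char carry = add_with_carry(0, a, c, &result0);
    /* High halves: consumes the carry from the low halves, like "adc". */
    add_with_carry(carry, b, d, &result1);
    return result0 ^ result1;
}
```

Nothing in this model needs the intermediate results to live in memory; the
pointers only exist because of how the intrinsic's signature is written.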
Naturally, for something this simple, I'd use unsigned __int128, but this came
up in large number math.
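The unsigned __int128 version I have in mind would look roughly like this. It
assumes a compiler that provides unsigned __int128 as an extension (GCC and
Clang do on x86-64); testcarry128 is just an illustrative name, and u64 is
again assumed to be unsigned long long.

```c
typedef unsigned long long u64;

/* Same computation as testcarry, written with unsigned __int128
   instead of a chain of _addcarry_u64 calls. */
u64 testcarry128(u64 a, u64 b, u64 c, u64 d)
{
    /* a and c are the low halves; b and d are the high halves. */
    unsigned __int128 x = ((unsigned __int128)b << 64) | a;
    unsigned __int128 y = ((unsigned __int128)d << 64) | c;
    unsigned __int128 sum = x + y;      /* wraps modulo 2^128 */
    u64 result0 = (u64)sum;             /* low 64 bits */
    u64 result1 = (u64)(sum >> 64);     /* high 64 bits */
    return result0 ^ result1;
}
```

This only covers the two-limb case, which is why it does not help once the
numbers get wider than 128 bits.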
Cross-filed with GCC (with different generated code since it's a different
compiler =^-^=):
<a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67317">https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67317</a></pre>
</div>
</p>
</body>
</html>