<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Thumb code generation incorrectly uses smmls"
href="https://llvm.org/bugs/show_bug.cgi?id=28701">28701</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Thumb code generation incorrectly uses smmls
</td>
</tr>
<tr>
<th>Product</th>
<td>new-bugs
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>new bugs
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>martin@martin.st
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=16804" name="attach_16804" title="Sample code illustrating how the smmls instruction causes differing results">attachment 16804</a></span>
Sample code illustrating how the smmls instruction causes differing results
The optimizer can choose to use the smmul instruction for functions like this,
which is correct:
int MULH(int a, int b)
{
    return ((int64_t)a * (int64_t)b) >> 32;
}
With optimization, this ends up compiled as this:
00000000 &lt;MULH&gt;:
   0:   fb51 f000   smmul   r0, r1, r0
   4:   4770        bx      lr
(And similarly in ARM mode.)
When combined with a subtraction, this can end up as the smmls instruction, in
cases like this:
int test_inline(int a, int b, int c)
{
    return c - MULH_inline(a, b);
}
In ARM mode, this ends up compiled like this:
00000008 &lt;test_inline&gt;:
   8:   e750f011    smmul   r0, r1, r0
   c:   e0420000    sub     r0, r2, r0
  10:   e12fff1e    bx      lr
This is correct, but in Thumb mode it gets optimized further to use the
smmls instruction:
00000006 &lt;test_inline&gt;:
   6:   fb61 2000   smmls   r0, r1, r0, r2
   a:   4770        bx      lr
This is wrong, since smmls doesn't truncate the result of the multiplication
before accumulating. The description of smmls is:
"Signed Most Significant Word Multiply Subtract multiplies two signed 32-bit
values, subtracts the result from a 32-bit accumulate value that is shifted
left by 32 bits, and extracts the most significant 32 bits of the result of
that subtraction."
Thus, in practice, whenever the lower 32 bits of the product are nonzero, the
64-bit subtraction borrows from the upper word, and the result ends up one
less than the correct value.
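As a rough illustration (not taken from the attachment), the two computations
can be modelled in plain C. The names sub_mulh_expected and smmls_model are
hypothetical. The sketch assumes that right-shifting a negative value is an
arithmetic shift, as the MULH snippet already does, and it does not model
64-bit wraparound for extreme inputs:

```c
/* What the C source asks for: truncate the product to its high word
 * first, then subtract that word from c. */
int sub_mulh_expected(int a, int b, int c)
{
    long long product = (long long)a * (long long)b;
    return c - (int)(product >> 32);
}

/* Hypothetical model of "smmls rd, a, b, c" following the quoted
 * description: subtract the full 64-bit product from c shifted left
 * by 32 bits, then take the high word of the difference. */
int smmls_model(int a, int b, int c)
{
    long long product = (long long)a * (long long)b;
    long long acc = (long long)c * 4294967296LL; /* c * 2^32, avoids shifting a negative value */
    return (int)((acc - product) >> 32);
}
```

For a = 3, b = 5, c = 10 the product is 15, so its high word is 0 and its low
word is nonzero: sub_mulh_expected returns 10, while smmls_model returns 9.
For a = b = 65536 the low word of the product is zero and both return 9.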
The attached code sample, built with optimization in Thumb mode, illustrates
the issue. To reproduce:
$ clang -target armv7-linux-gnueabihf test-smmls.c -o test-smmls -O -mthumb
$ ./test-smmls
differing: 6324 vs 6323</pre>
</div>
</p>
</body>
</html>