<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><span class="vcard"><a class="email" href="mailto:james.molloy@arm.com" title="James Molloy <james.molloy@arm.com>"> <span class="fn">James Molloy</span></a>
</span> changed
<a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED INVALID - ARM code runs 2x slower compared to gcc"
href="https://llvm.org/bugs/show_bug.cgi?id=26450">bug 26450</a>
<br>
<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>What</th>
<th>Removed</th>
<th>Added</th>
</tr>
<tr>
<td style="text-align:right;">Status</td>
<td>NEW
</td>
<td>RESOLVED
</td>
</tr>
<tr>
<td style="text-align:right;">Resolution</td>
<td>---
</td>
<td>INVALID
</td>
</tr></table>
<p>
<div>
<b><a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED INVALID - ARM code runs 2x slower compared to gcc"
href="https://llvm.org/bugs/show_bug.cgi?id=26450#c5">Comment # 5</a>
on <a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED INVALID - ARM code runs 2x slower compared to gcc"
href="https://llvm.org/bugs/show_bug.cgi?id=26450">bug 26450</a>
from <span class="vcard"><a class="email" href="mailto:james.molloy@arm.com" title="James Molloy <james.molloy@arm.com>"> <span class="fn">James Molloy</span></a>
</span></b>
<pre>Hi,
OK, there are two things here:
Firstly, it seems __umodsi3 and friends are significantly slower than
__aeabi_idivmod. GCC is generating __aeabi_idivmod - perhaps we should? We
select __modsi3 unless the target is EABI or Android - I suspect that should be
EABI, Android or GNUEABI.
GCC 4.9: 1.24s
Clang 3.7: 3.48s
Clang 3.7 (using __aeabi_idivmod): 1.15s
Secondly, you're not specifying a CPU. That's why your division is going out to
the library. Unless you're on a Cortex-A9, you'll have hardware division. Use
-mcpu to enable it.
GCC 4.9 with -mcpu=cortex-a15: 276ms
Clang 3.7 with -mcpu=cortex-a15: 258ms
(I had to switch to perf stat's task-clock metric because elapsed time was
getting too noisy.)
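To make the second point concrete, here is a minimal sketch of the kind of loop
that is dominated by integer remainder. The function name mod_sum and the loop
itself are hypothetical, not the code from the bug report. Built for ARM
without -mcpu, the % is expected to become a call into the runtime library
(__umodsi3 or __aeabi_uidivmod for this unsigned case); built with
-mcpu=cortex-a15 it should turn into a hardware divide plus a
multiply-subtract.

```c
/* Hypothetical kernel for illustration; not the code from the bug report. */
unsigned mod_sum(unsigned n, unsigned d)
{
    unsigned acc = 0;
    unsigned i;

    /* Requires n of at least 1 and a non-zero d. Each iteration performs
       one 32-bit unsigned remainder, the operation under discussion. */
    for (i = 1; i != n; ++i)
        acc += i % d;
    return acc;
}
```

Compiling this with -S under each set of flags and reading the assembly is a
quick way to check whether the remainder is a library call or an inline divide.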
By the way: "I've just discovered how immature LLVM/Clang was on ARM." (from
<a href="https://users.rust-lang.org/t/executable-size-and-performance-vs-c/4496/34">https://users.rust-lang.org/t/executable-size-and-performance-vs-c/4496/34</a>)
That's a little over the top - the ARM backend is around 10 years old now, it's
fairly mature.
James</pre>
</div>
</p>
</body>
</html>