<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - AArch64 vs ARMv7 code quality"
href="https://llvm.org/bugs/show_bug.cgi?id=28345">28345</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>AArch64 vs ARMv7 code quality
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>3.8
</td>
</tr>
<tr>
<th>Hardware</th>
<td>Other
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: AArch64
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>tulipawn@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>I've rerun the benchmark from issue #27103 on a 2GHz Cortex-A53 (64-bit Linux).
Even though it achieves the expected 1.5x improvement (not counting 64-bit
effects), it still loses to a 1.7GHz Cortex-A5 running ARMv7 code in naive
implementation performance.
Provided that the last part actually tests the quality of the backend, the
results look rather interesting (table created with the cargo-benchcmp script;
a negative difference means the Cortex-A53 is faster):
name                           cortex-a5 ns/iter  cortex-a53 ns/iter  diff ns/iter  diff %
mat_mul_f32::m004              1,763              1,098               -665          -37.72%
mat_mul_f32::m005              2,548              1,587               -961          -37.72%
mat_mul_f32::m006              2,889              1,718               -1,171        -40.53%
mat_mul_f32::m007              3,154              1,923               -1,231        -39.03%
mat_mul_f32::m008              3,627              2,138               -1,489        -41.05%
mat_mul_f32::m009              8,142              4,260               -3,882        -47.68%
mat_mul_f32::m012              10,370             5,484               -4,886        -47.12%
mat_mul_f32::m016              17,117             8,621               -8,496        -49.63%
mat_mul_f32::m032              110,929            48,623              -62,306       -56.17%
mat_mul_f32::m064              830,408            328,603             -501,805      -60.43%
mat_mul_f32::m127              6,416,219          2,387,223           -4,028,996    -62.79%
mat_mul_f32::m256              52,750,069         20,490,803          -32,259,266   -61.15%
mat_mul_f32::m512              421,350,950        164,162,031         -257,188,919  -61.04%
mat_mul_f32::mix128x10000x128  531,878,944        216,476,447         -315,402,497  -59.30%
mat_mul_f32::mix16x4           27,059             17,923              -9,136        -33.76%
mat_mul_f32::mix32x2           21,873             15,748              -6,125        -28.00%
mat_mul_f32::mix97             3,815,641          1,449,014           -2,366,627    -62.02%
mat_mul_f64::m004              2,002              1,202               -800          -39.96%
mat_mul_f64::m007              4,593              2,102               -2,491        -54.23%
mat_mul_f64::m008              4,590              2,547               -2,043        -44.51%
mat_mul_f64::m012              14,212             7,781               -6,431        -45.25%
mat_mul_f64::m016              23,181             12,981              -10,200       -44.00%
mat_mul_f64::m032              160,551            88,629              -71,922       -44.80%
mat_mul_f64::m064              1,273,413          665,406             -608,007      -47.75%
mat_mul_f64::m127              10,648,815         5,531,354           -5,117,461    -48.06%
mat_mul_f64::m256              88,419,854         45,800,153          -42,619,701   -48.20%
mat_mul_f64::m512              702,121,682        365,977,216         -336,144,466  -47.88%
mat_mul_f64::mix128x10000x128  876,955,471        493,044,547         -383,910,924  -43.78%
mat_mul_f64::mix16x4           38,284             20,585              -17,699       -46.23%
mat_mul_f64::mix32x2           33,038             13,516              -19,522       -59.09%
mat_mul_f64::mix97             6,344,368          3,202,931           -3,141,437    -49.52%
ref_mat_mul_f32::m004          473                530                 57            12.05%
ref_mat_mul_f32::m005          784                941                 157           20.03%
ref_mat_mul_f32::m006          1,219              1,537               318           26.09%
ref_mat_mul_f32::m007          1,803              2,689               886           49.14%
ref_mat_mul_f32::m008          2,553              3,783               1,230         48.18%
ref_mat_mul_f32::m009          3,830              5,307               1,477         38.56%
ref_mat_mul_f32::m012          7,829              11,755              3,926         50.15%
ref_mat_mul_f32::m016          17,466             26,824              9,358         53.58%
ref_mat_mul_f32::m032          128,387            202,902             74,515        58.04%
ref_mat_mul_f32::m064          1,018,211          1,584,415           566,204       55.61%
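
For reference, the two diff columns in the table can be reproduced from the
ns/iter columns as sketched below. This is only an illustration of how the
numbers relate, assuming the Cortex-A5 run is the baseline; the function names
are hypothetical and not taken from cargo-benchcmp itself:

```rust
// Hypothetical helpers, not cargo-benchcmp's actual code.

// Absolute difference in ns/iter: negative means the second run is faster.
fn diff_ns(old: i64, new: i64) -> i64 {
    new - old
}

// Relative difference as a percentage of the first (baseline) run.
fn diff_percent(old: i64, new: i64) -> f64 {
    (new - old) as f64 / old as f64 * 100.0
}

fn main() {
    // mat_mul_f32::m004: cortex-a5 1,763 ns/iter, cortex-a53 1,098 ns/iter
    println!("{} {:.2}%", diff_ns(1763, 1098), diff_percent(1763, 1098));
    // ref_mat_mul_f32::m004: cortex-a5 473 ns/iter, cortex-a53 530 ns/iter
    println!("{} {:.2}%", diff_ns(473, 530), diff_percent(473, 530));
}
```

With these definitions, a positive percentage (as in the ref_mat_mul rows)
means the Cortex-A53 took more time per iteration than the Cortex-A5.
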
The ref_mat_mul results probably show the optimizer could get smarter :)</pre>
</div>
</p>
</body>
</html>