<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
{mso-style-priority:99;
mso-style-link:"Plain Text Char";
margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
span.PlainTextChar
{mso-style-name:"Plain Text Char";
mso-style-priority:99;
mso-style-link:"Plain Text";
font-family:"Calibri","sans-serif";}
.MsoChpDefault
{mso-style-type:export-only;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoPlainText>Hi folks,<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>As Tim pointed out, we recently had the opportunity to collect 64-bit benchmark performance data for the GCC 4.9, AArch64 and ARM64 compilers on real hardware, a Cortex-A53 device. For proprietary reasons we cannot share the full hardware configuration.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>The preliminary results were shared at the hackers lab at EuroLLVM yesterday. For those who could not make it, the summarized performance data is below.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>A positive number means the ARM64 run is better by that percentage. A negative number means the baseline (GCC 4.9 or AArch64) is better by that percentage.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>Tuning of the AArch64 backend for this processor is not yet complete (some initial work has started on modeling Cortex-A53). However, we quickly investigated the poorly vectorized code in some of the tests (Linpack, for example) and identified straightforward fixes that improved AArch64 performance (similar patches are present in ARM64, e.g. the default loop unroll limit, unaligned memory accesses, etc.). 
These patches are going to the AArch64 commits list for review.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>This experiment indicates that, from the point of view of correctness and performance, either ARM64 or AArch64 could be the base compiler of choice, provided the known correctness issues (in ARM64) and the lack of performance tuning (in AArch64) are addressed.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>However, much more work has to be done to catch up with the GCC 4.9 middle-end and backend optimizations.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><table class=MsoNormalTable border=0 cellspacing=0 cellpadding=0 width=915 style='width:548.7pt;border-collapse:collapse'><tr><td width=264 valign=top style='width:158.55pt;border:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal align=center style='text-align:center'><b><span style='font-size:10.0pt'>Benchmark<o:p></o:p></span></b></p></td><td width=185 valign=top style='width:111.15pt;border:solid windowtext 1.0pt;border-left:none;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal align=center style='text-align:center'><b><span style='font-size:10.0pt'>ARM64 vs GCC 4.9 %</span></b><b><span style='font-size:10.0pt'><o:p></o:p></span></b></p></td><td width=203 valign=top style='width:121.5pt;border:solid windowtext 1.0pt;border-left:none;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal align=center style='text-align:center'><b><span style='font-size:10.0pt'>ARM64 vs AArch64 %</span></b><b><span style='font-size:10.0pt'><o:p></o:p></span></b></p></td><td width=263 valign=top style='width:157.5pt;border:solid windowtext 1.0pt;border-left:none;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal align=center style='text-align:center'><b><span style='font-size:10.0pt'>ARM64 vs AArch64 patched %</span></b><b><span style='font-size:10.0pt'><o:p></o:p></span></b></p></td></tr><tr><td width=264 valign=top style='width:158.55pt;border:solid windowtext 
1.0pt;border-top:none;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>EEMBC (no consumer) geomean</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=185 valign=top style='width:111.15pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>-17</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=203 valign=top style='width:121.5pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>1</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=263 valign=top style='width:157.5pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>-2</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td></tr><tr><td width=264 valign=top style='width:158.55pt;border:solid windowtext 1.0pt;border-top:none;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>EEMBC (consumer only) geomean</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=185 valign=top style='width:111.15pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>-21</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=203 valign=top style='width:121.5pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>-2</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=263 valign=top 
style='width:157.5pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>-5</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td></tr><tr><td width=264 valign=top style='width:158.55pt;border:solid windowtext 1.0pt;border-top:none;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>Linpack Double</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=185 valign=top style='width:111.15pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>-29</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=203 valign=top style='width:121.5pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>45</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=263 valign=top style='width:157.5pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>-1</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td></tr><tr><td width=264 valign=top style='width:158.55pt;border:solid windowtext 1.0pt;border-top:none;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>Linpack Single</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=185 valign=top style='width:111.15pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>-51</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td 
width=203 valign=top style='width:121.5pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>40</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=263 valign=top style='width:157.5pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>1</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td></tr><tr><td width=264 valign=top style='width:158.55pt;border:solid windowtext 1.0pt;border-top:none;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>SPEC2000 geomean</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=185 valign=top style='width:111.15pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>-6</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=203 valign=top style='width:121.5pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>0</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td><td width=263 valign=top style='width:157.5pt;border-top:none;border-left:none;border-bottom:solid windowtext 1.0pt;border-right:solid windowtext 1.0pt;padding:0in 5.4pt 0in 5.4pt'><p class=MsoNormal><span style='font-size:10.0pt'>1</span><span style='font-size:10.0pt'><o:p></o:p></span></p></td></tr></table><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>Thanks,<o:p></o:p></p><p class=MsoPlainText>Ana.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText><o:p> </o:p></p><p 
class=MsoPlainText>-----Original Message-----<br>From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu] On Behalf Of Tim Northover<br>Sent: Tuesday, April 08, 2014 12:04 AM<br>To: LLVM Developers Mailing List<br>Subject: Re: [LLVMdev] Proposal: AArch64/ARM64 merge from EuroLLVM</p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>Hi again,<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>In my original message I was attempting to summarise the key arguments as I saw them. Other points came up in the discussion, which Ana kindly recorded and I'll summarise here:<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>First, extra arguments brought up in favour of each backend (I'll mention duplicates too so that the list is as complete as possible):<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>+ Register class usage in ARM64 is cleaner.<o:p></o:p></p><p class=MsoPlainText>+ FastISel is on ARM64, but not AArch64. Some TableGen work will be needed to enable it because of how patterns are written there.<o:p></o:p></p><p class=MsoPlainText>+ There is no macro support in AArch64.<o:p></o:p></p><p class=MsoPlainText>+ Both NEON syntax variants (general and iOS) are supported by ARM64 now.<o:p></o:p></p><p class=MsoPlainText>+ ARM64 assumes NEON is enabled by default, and indeed has no notion that a CPU might not have NEON. 
Instructions will need to be predicated to check that NEON is present, and some corresponding .cpp changes will probably be needed where it was also assumed.<o:p></o:p></p><p class=MsoPlainText>+ Inline asm is possibly better in ARM64.<o:p></o:p></p><p class=MsoPlainText>+ Anecdotal evidence suggests it's easier to debug MC layer issues on ARM64 than on AArch64.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>Other important points that we discussed:<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>+ We need to set up a buildbot for performance using some real hardware (volunteers with hardware?) so patches can be validated on the supported targets, and also one for correctness using QEMU.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>+ Google is working on a framework to build and run benchmarks (to be available soon?), which should enable the buildbot setup from the item above.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>+ We need to sort out the differences between the Cortex-A53 and Cyclone model descriptions (both use the new approach for the MI scheduler, but one requires annotating instructions and the other does not). 
We should pin down Andy and get him to describe the perfect machine model.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>Cheers.<o:p></o:p></p><p class=MsoPlainText><o:p> </o:p></p><p class=MsoPlainText>Tim<o:p></o:p></p></div></body></html>