<div dir="ltr"><div><div><div>ARM processors are only partially supported by llvm-mca.<br>At the moment, the tool is unable to resolve variant scheduling classes, and ARM scheduling models often use variant scheduling classes to model the latency profile of instructions.<br><br></div>Strictly speaking, what Matt wrote is true: llvm-mca knows how to analyze code for out-of-order processors that have a scheduling model in LLVM.<br>However, the user experience may be poor for ARM processors at the moment. It will get better in the future (there is a plan to add support for variant scheduling classes; I will send an RFC on the mailing list soon).<br><br></div><div>-Andrea<br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 9, 2018 at 7:16 AM, via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div link="blue" vlink="purple" lang="EN-US">

<div class="m_-8829580913763818245WordSection1">

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Hi Yin,<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">MCA does support the -mcpu and -mtriple options. We have one ARM test in llvm/test/tools/llvm-mca/ARM for a Cortex-A9, which is an out-of-order chip.<u></u><u></u></span></p>
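
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">As a rough sketch of how that could look for your case: the script below cross-compiles to assembly and then hands the result to llvm-mca. The file names test.c and test.s come from your command line; the choice of cortex-a9, the extra -mcpu on the clang side, and the shell variable names are assumptions made only for illustration, so substitute the core you actually target.<u></u><u></u></span></p>

<pre>
```shell
#!/bin/sh
# Sketch only: cross-compile to ARM assembly, then analyze it with llvm-mca.
# Assumptions: cortex-a9 is the CPU of interest and test.c is the input file.
TRIPLE=arm-linux-gnueabihf
CPU=cortex-a9
SRC=test.c
ASM=test.s

# Run the tools only when both are installed and the input exists.
if command -v clang >/dev/null 2>&1 &&
   command -v llvm-mca >/dev/null 2>&1 &&
   [ -f "$SRC" ]; then
    # Emit assembly tuned for the same CPU that llvm-mca will model.
    clang "$SRC" -O2 -target "$TRIPLE" -mcpu="$CPU" -S -o "$ASM"
    # Report throughput and latency estimates for the generated assembly.
    llvm-mca -mtriple="$TRIPLE" -mcpu="$CPU" "$ASM"
else
    # Otherwise just show the command that would have been run.
    echo "would run: llvm-mca -mtriple=$TRIPLE -mcpu=$CPU $ASM"
fi
```
</pre>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">If either tool or test.c is missing, the script only prints the llvm-mca command line instead of running it.<u></u><u></u></span></p>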

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Hope that helps!<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">-Matt<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>

<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">

<div>

<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in">

<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Yin Liu <<a href="mailto:yinliu.tiger@gmail.com" target="_blank">yinliu.tiger@gmail.com</a>>

<br>

<b>Sent:</b> Tuesday, May 8, 2018 2:49 PM<br>

<b>To:</b> Davis, Matthew <<a href="mailto:Matthew.Davis@sony.com" target="_blank">Matthew.Davis@sony.com</a>><br>

<b>Cc:</b> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

<b>Subject:</b> Re: [llvm-dev] Is there any relationship between IR instruction and execution time<u></u><u></u></span></p>

</div>

</div><div><div class="h5">

<p class="MsoNormal"><u></u> <u></u></p>

<div>

<p class="MsoNormal">Hi Matt,<u></u><u></u></p>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

<div>

<p class="MsoNormal">Thank you so much for the reply!<u></u><u></u></p>

</div>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

<div>

<p class="MsoNormal">I've tried llvm-mca, and it is helpful.<u></u><u></u></p>

</div>

<div>

<p class="MsoNormal">I was wondering whether llvm-mca supports assembly code for ARM?<u></u><u></u></p>

</div>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

<div>

<p class="MsoNormal">I cross-compile the test file for ARM like this: clang test.c -O2 -target arm-linux-gnueabihf -static -S -o test.s<u></u><u></u></p>

</div>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

<div>

<p class="MsoNormal">If I want to check the performance using llvm-mca, is there a "-mcpu" option for ARM?<u></u><u></u></p>

</div>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

<div>

<p class="MsoNormal">Thanks,<u></u><u></u></p>

</div>

<div>

<p class="MsoNormal">Yin<u></u><u></u></p>

</div>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

</div>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

<div>

<p class="MsoNormal">2018-05-07 21:52 GMT-04:00 <<a href="mailto:Matthew.Davis@sony.com" target="_blank">Matthew.Davis@sony.com</a>>:<u></u><u></u></p>

<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">

<p class="MsoNormal">Hi Yin,<br>

<br>

From: llvm-dev <<a href="mailto:llvm-dev-bounces@lists.llvm.org" target="_blank">llvm-dev-bounces@lists.llvm.<wbr>org</a>> On Behalf Of Yin Liu via llvm-dev<br>

> Hello,<br>

> <br>

> As is known to all, there is a relationship between a program's instructions and its execution time. In other words, we can estimate the execution time based on the number of program instructions.<br>

> <br>

> I'm curious about the relationship between IR instructions and execution time. I know the number of program instructions and the execution time depend heavily on the<br>

> platform and architecture, while IR instructions are target-independent and intermediate. But, intuitively, there may be some relationship between IR instructions and execution time.<br>

> <br>

> Would it be possible to give me some advice about it?<br>

<br>

Which instructions the compiler finally emits is highly dependent on the specified target. As you pointed out, IR is relatively abstract, and can at best only give a "rough" estimate of timing. Maybe that loss of fidelity is acceptable in your

 case. Be aware that there are also target-specific optimizations that operate after the IR is lowered to a target-friendly representation. Any early approximation of IR performance will be less accurate after target-specific optimization passes are run.

 For more accurate results, you will need to wait until the IR is lowered to the target architecture and emitted as assembly or object code. But it seems that might be too late for what you are looking for. In any case, if you do want to analyze the assembly

 code, then look no further than LLVM's Machine Code Analyzer (MCA). This tool takes assembly code as input and generates throughput and latency information. For more details see:

<a href="https://llvm.org/docs/CommandGuide/llvm-mca.html" target="_blank">https://llvm.org/docs/<wbr>CommandGuide/llvm-mca.html</a><br>

<br>

-Matt<u></u><u></u></p>

</blockquote>

</div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

</div></div></div>

</div>

</div>


<br>______________________________<wbr>_________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

<br></blockquote></div><br></div>