<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sun, May 8, 2016 at 2:14 PM, Jie Chen <span dir="ltr"><<a href="mailto:Jie.Chen@mathworks.com" target="_blank">Jie.Chen@mathworks.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">




<div dir="ltr" style="font-size:12pt;color:#000000;background-color:#ffffff;font-family:Calibri,Arial,Helvetica,sans-serif">
<p>Hi David,<br>
</p>
<p><br>
</p>
<p>Thanks for your great explanations covering not only LLVM but also GCC! To understand the code layout optimization better, I slightly changed my code: it now calls the hot() function in the first if-branch instead of in the last else-branch (see my
 modified code below). This reduces the number of branch instructions executed, and possibly improves branch predictor performance. On my Mac, I got a ~6% performance improvement (clang++ -O2) with this code change. Looking at the default.profraw data,
 I can see it has the information that the optimizer could use to make an optimization similar to my manual one. <span style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:16px;background-color:rgb(255,255,255)">I was hoping LLVM PGO
 could do the same thing</span><span style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:16px;background-color:rgb(255,255,255)">. </span></p></div></blockquote><div><br></div><div>Yes, this is a missing profile-guided control-flow optimization: reducing the hot path's control-dependence height by reordering branches, which is possible when the branch conditions are mutually exclusive.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" style="font-size:12pt;color:#000000;background-color:#ffffff;font-family:Calibri,Arial,Helvetica,sans-serif">
<p>I am excited to hear from you that more infrastructure changes are under way that will improve PGO support. For now, which PGO optimizations could I write some code for and see an immediate improvement from LLVM? It would be great
 to know such details. :-)<br>
</p>
<p><br></p></div></blockquote><div>What I can tell you is that many optimizations that could benefit from profile data are still missing, such as profile-aware LICM (patch pending), speculative PRE, loop unrolling, loop peeling, auto-vectorization, inlining, function splitting, function layout, function outlining, profile-driven size optimization, induction variable optimization/strength reduction, stringOp specialization/optimization/inlining, switch peeling/lowering, etc. The biggest profile users today include register allocation, BB layout, ifcvt, shrink-wrapping, etc., but there should be room for improvement there too.</div><div><br></div><div>thanks,</div><div><br></div><div>David</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" style="font-size:12pt;color:#000000;background-color:#ffffff;font-family:Calibri,Arial,Helvetica,sans-serif"><p>
</p>
<p>Best,<br>
</p>
<p><br>
</p>
<p>Jie<br>
</p>
<p><br>
</p>
<p>//main2.cpp: manual reordering of branches <br>
</p><div><div class="h5">
<div>#include <iostream> </div>
<div>#include <stdlib.h></div>
<div><br>
</div>
<div>using namespace std;</div>
<div><br>
</div>
<div>long long hot() {</div>
<div><span style="white-space:pre-wrap"></span>long long x = 0;</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>for (int i = 0; i < 1000; i++) {</div>
<div><span style="white-space:pre-wrap"></span>x += i^2;</div>
<div><span style="white-space:pre-wrap"></span>}</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>return x;</div>
<div>}</div>
<div><br>
</div>
<div>long long cold() {</div>
<div><span style="white-space:pre-wrap"></span>long long y = 0;</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>for (int i = 0; i < 1000; i++) {</div>
<div><span style="white-space:pre-wrap"></span>y += i^2;</div>
<div><span style="white-space:pre-wrap"></span>}</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>return y;</div>
<div><br>
</div>
<div>}</div>
<div><br>
</div>
<div>long long foo() {</div>
<div><span style="white-space:pre-wrap"></span>long long y = 0;</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>for (int i = 0; i < 1000; i++) {</div>
<div><span style="white-space:pre-wrap"></span>y *= i^2;</div>
<div><span style="white-space:pre-wrap"></span>}</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>return y*2;</div>
<div><br>
</div>
<div>}</div>
<div><br>
</div>
<div>long long bar() {</div>
<div><span style="white-space:pre-wrap"></span>long long y = 0;</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>for (int i = 0; i < 1000; i++) {</div>
<div><span style="white-space:pre-wrap"></span>y *= i^2;</div>
<div><span style="white-space:pre-wrap"></span>}</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>return y*3;</div>
<div><br>
</div>
<div>}</div>
<div><br>
</div>
<div>#define SIZE 10000000</div>
<div><br>
</div>
<div>int main() {</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>int* a = (int *)calloc(SIZE, sizeof(int));</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>a[100] = 1;</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>long long sum = 0;</div>
<div><br>
</div>
<div><span style="white-space:pre-wrap"></span>for (int i = 0; i < SIZE; i++) {</div>
</div></div><div><span style="white-space:pre-wrap"></span>if (a[i] < 1) {</div>
<div><span style="white-space:pre-wrap"></span>sum += hot();</div>
<div><span style="white-space:pre-wrap"></span>} else if (a[i] == 1) {</div><span class="">
<div><span style="white-space:pre-wrap"></span>sum += cold();</div>
<div><span style="white-space:pre-wrap"></span>} else if (a[i] > 1) {</div>
<div><span style="white-space:pre-wrap"></span>sum += bar();</div>
<div><span style="white-space:pre-wrap"></span>sum += foo();</div>
<div><span style="white-space:pre-wrap"></span>}</div>
<div><span style="white-space:pre-wrap"></span>}</div>
<div><span style="white-space:pre-wrap"></span></div>
</span><span class=""><div><span style="white-space:pre-wrap"></span>cout << sum << endl;</div>
<div><span style="white-space:pre-wrap"></span></div>
<div><span style="white-space:pre-wrap"></span>return 0;<span style="white-space:pre-wrap">
</span></div>
<div>}</div>
<div><br>
<br>
</div>
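<div>The effect of this reordering can be sketched by counting how many branch conditions each ordering evaluates on the same mostly-zero input. This is a minimal illustration under stated assumptions, not a measurement: the names Stats, counted, original_order, hot_first, make_input and check_same_branches are hypothetical, and the calls to hot(), cold(), foo() and bar() are replaced by simple hit counters.</div>
<pre>
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping: how often each branch fired and how many
// conditions were evaluated in total.
struct Stats {
    long long lt = 0;     // (v < 1) branch taken, the hot case
    long long eq = 0;     // (v == 1) branch taken
    long long gt = 0;     // (v > 1) branch taken
    long long tests = 0;  // conditions evaluated
};

// Records one condition evaluation and passes the result through.
static bool counted(Stats& s, bool cond) {
    ++s.tests;
    return cond;
}

// Branch order of main.cpp: the hot case is tested last.
Stats original_order(const std::vector<int>& a) {
    Stats s;
    for (int v : a) {
        if (counted(s, v == 1)) {
            ++s.eq;
        } else if (counted(s, v > 1)) {
            ++s.gt;
        } else if (counted(s, v < 1)) {
            ++s.lt;
        }
    }
    return s;
}

// Branch order of main2.cpp: the hot case is tested first.
Stats hot_first(const std::vector<int>& a) {
    Stats s;
    for (int v : a) {
        if (counted(s, v < 1)) {
            ++s.lt;
        } else if (counted(s, v == 1)) {
            ++s.eq;
        } else if (counted(s, v > 1)) {
            ++s.gt;
        }
    }
    return s;
}

// Same data shape as the test program: all zeros except a[100] = 1.
std::vector<int> make_input(std::size_t n) {
    std::vector<int> a(n, 0);
    if (n > 100) a[100] = 1;
    return a;
}

// The conditions are mutually exclusive, so both orders must take
// exactly the same branches; only the number of tests differs.
void check_same_branches(const Stats& x, const Stats& y) {
    assert(x.lt == y.lt && x.eq == y.eq && x.gt == y.gt);
}
```
</pre>
<div>For make_input(1000), original_order evaluates 2998 conditions (three per zero element, one for the single 1), while hot_first evaluates 1001 (one per zero element, two for the single 1). Both take the same branches, which is why the reordering is safe here.</div>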
</span><div style="color:rgb(33,33,33)">
<hr style="display:inline-block;width:98%">
<div dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Xinliang David Li <<a href="mailto:davidxl@google.com" target="_blank">davidxl@google.com</a>><br>
<b>Sent:</b> Friday, May 6, 2016 8:06 PM<br>
<b>To:</b> Jie Chen<br>
<b>Cc:</b> llvm-dev<br>
<b>Subject:</b> Re: About Clang llvm PGO</font>
<div> </div>
</div><div><div class="h5">
<div>
<div dir="ltr">Thanks for testing out LLVM PGO and evaluating its performance.
<div><br>
</div>
<div>We are currently still more focused on infrastructure improvement, which is the foundation for performance improvement.  We are making great progress in this direction, but there are still some key missing pieces, such as profile data in the inliner. We
 are working on that. Once those are done, more focus will be on making more passes profile-aware and making existing profile-aware passes better (e.g., code layout).</div>
<div><br>
</div>
<div>I looked at this particular example. GCC PGO can reduce the runtime by half, while LLVM's PGO makes no performance difference as you noticed.<br>
</div>
<div><br>
</div>
<div>In the GCC case, PGO itself contributes about a 15% performance boost. The majority of the performance improvement comes from loop vectorization. Note that trunk GCC does not turn on vectorization at O2, only at O3 or at O2 with PGO.</div>
<div><br>
</div>
<div>LLVM also vectorizes the key loops. However, compared with GCC's vectorizer, LLVM's auto-vectorizer produces worse code (e.g., a long sequence of instructions to do sign extension): ~6.5 instr/iter for GCC vs. ~9 instr/iter for LLVM.  GCC also unrolls the loop after vectorization,
 which helps a little more.   LLVM's vectorization actually hurts performance a little.</div>
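<div>As an illustration of where a sign extension arises in the hot loop, here is a minimal sketch. It assumes the conversion in question is the implicit int to long long widening in the accumulation, which the thread does not confirm. The names hot_wide and hot_narrow are hypothetical; hot_narrow is valid only because the total fits comfortably in a 32-bit int for this loop bound.</div>
<pre>
```cpp
#include <cassert>

// The loop from hot(), with the implicit conversion written out.
// Each (i ^ 2) is a 32-bit int that must be widened to 64 bits
// before it can be added to the long long accumulator.
long long hot_wide() {
    long long x = 0;
    for (int i = 0; i < 1000; i++) {
        x += static_cast<long long>(i ^ 2);  // int -> long long: sign extension
    }
    return x;
}

// Hypothetical variant: accumulate in 32 bits and widen once at the end.
// Every term is below 1024, so the sum stays below 1,024,000 and cannot
// overflow an int here.
long long hot_narrow() {
    int x = 0;
    for (int i = 0; i < 1000; i++) {
        x += i ^ 2;  // no widening inside the loop
    }
    return x;  // single int -> long long conversion
}
```
</pre>
<div>Both functions return the same value; the difference is only whether the widening happens once per element or once per call.</div>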
<div><br>
</div>
<div>We will look into this issue.</div>
<div><br>
</div>
<div>thanks,</div>
<div><br>
</div>
<div>David</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, May 6, 2016 at 2:04 PM, Jie Chen <span dir="ltr">
<<a href="mailto:Jie.Chen@mathworks.com" target="_blank">Jie.Chen@mathworks.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-family:Calibri,sans-serif">
<div>Hi David,</div>
<div><br>
</div>
<div>I am a performance engineer from MathWorks. I am currently exploring building our products with PGO on the Mac platform. While searching for LLVM PGO solutions, I came across your name many times, so I thought you were probably the guy behind LLVM’s PGO
 implementation! :-) Here is what confused me regarding LLVM’s PGO capability. I started with a small program (see my code at the end of this email) for which I saw more than a 10% performance improvement with PGO on Linux GCC (g++ -O2, -fprofile-generate, -fprofile-use).
 I wrote this code on the assumption that LLVM would rearrange the hot/cold branches based on the profile run. But when I tried it with Apple Clang and with Clang on Ubuntu, I did not see any performance improvement. Since I do not know the implementation details of
 LLVM PGO, I am confused by not seeing the performance improvement I saw with GCC (probably with Visual Studio PGO as well). Could you please offer me some insight into the issue? Or, as a further question, what kind of code would benefit from LLVM PGO optimization? </div>
<div><br>
</div>
<div>Best,</div>
<div><br>
</div>
<div>Jie Chen</div>
<div>MathWorks</div>
<div><br>
</div>
<div><br>
</div>
<div>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(195,55,32)">
<span style="color:#d53bd3">#include </span><iostream></p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(195,55,32)">
<span style="color:#d53bd3">#include </span><stdlib.h></p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(52,189,38)">
<span style="color:#ce7924">using</span><span style="color:#000000"> </span>namespace<span style="color:#000000"> std;</span></p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bd26">long</span> <span style="color:#34bd26">long</span> hot() {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#34bd26">long</span> <span style="color:#34bd26">long</span> x =
<span style="color:#c33720">0</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#ce7924">for</span> (<span style="color:#34bd26">int</span> i =
<span style="color:#c33720">0</span>; i < <span style="color:#c33720">1000</span>; i++) {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
        x += i^<span style="color:#c33720">2</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    }</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#ce7924">return</span> x;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
}</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bd26">long</span> <span style="color:#34bd26">long</span> cold() {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#34bd26">long</span> <span style="color:#34bd26">long</span> y =
<span style="color:#c33720">0</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#ce7924">for</span> (<span style="color:#34bd26">int</span> i =
<span style="color:#c33720">0</span>; i < <span style="color:#c33720">1000</span>; i++) {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
        y += i^<span style="color:#c33720">2</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    }</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#ce7924">return</span> y;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
}</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bd26">long</span> <span style="color:#34bd26">long</span> foo() {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#34bd26">long</span> <span style="color:#34bd26">long</span> y =
<span style="color:#c33720">0</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#ce7924">for</span> (<span style="color:#34bd26">int</span> i =
<span style="color:#c33720">0</span>; i < <span style="color:#c33720">1000</span>; i++) {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
        y *= i^<span style="color:#c33720">2</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    }</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#ce7924">return</span> y*<span style="color:#c33720">2</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
}</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bd26">long</span> <span style="color:#34bd26">long</span> bar() {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#34bd26">long</span> <span style="color:#34bd26">long</span> y =
<span style="color:#c33720">0</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#ce7924">for</span> (<span style="color:#34bd26">int</span> i =
<span style="color:#c33720">0</span>; i < <span style="color:#c33720">1000</span>; i++) {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
        y *= i^<span style="color:#c33720">2</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    }</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#ce7924">return</span> y*<span style="color:#c33720">3</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
}</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(213,59,211)">
#define SIZE <span style="color:#c33720">10000000</span></p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bd26">int</span> main() {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#34bd26">int</span>* a = (<span style="color:#34bd26">int</span> *)calloc(SIZE,
<span style="color:#ce7924">sizeof</span>(<span style="color:#34bd26">int</span>));</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    a[<span style="color:#c33720">100</span>] = <span style="color:#c33720">1</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#34bd26">long</span> <span style="color:#34bd26">long</span> sum =
<span style="color:#c33720">0</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#ce7924">for</span> (<span style="color:#34bd26">int</span> i =
<span style="color:#c33720">0</span>; i < SIZE; i++) {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
        <span style="color:#ce7924">if</span> (a[i] == <span style="color:#c33720">
1</span>) {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
            sum += cold();</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
        } <span style="color:#ce7924">else</span> <span style="color:#ce7924">if</span> (a[i] >
<span style="color:#c33720">1</span>) {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
            sum += bar();</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
            sum += foo();</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
        } <span style="color:#ce7924">else</span> <span style="color:#ce7924">if</span> (a[i] <
<span style="color:#c33720">1</span>) {</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
            sum += hot();</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
        }</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    }</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    cout << sum << endl;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
    <span style="color:#ce7924">return</span> <span style="color:#c33720">0</span>;</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
}</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
Makefile to compile the above code on Mac:</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#ce7924">.PHONY:</span> clean</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bbc7">regular:</span> main.cpp</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(195,55,32)">
    clang++ -O2  main.cpp -o main.regular</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bbc7">hand:</span> main2.cpp</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(195,55,32)">
    clang++ -O2  main2.cpp -o main.regular2</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bbc7">instr:</span> main.cpp</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(195,55,32)">
    clang++ -O2 -fprofile-instr-generate main.cpp -o main.instr</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bbc7">profile:</span> main.instr</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(195,55,32)">
    ./main.instr</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bbc7">merge:</span> default.profraw</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(195,55,32)">
    xcrun llvm-profdata merge -output default.profdata default.profraw</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
<span style="color:#34bbc7">optimize:</span> default.profdata</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(195,55,32)">
    clang++ -O2 -fprofile-instr-use=default.profdata main.cpp -o main.optimized</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';min-height:13px">
<br>
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(52,187,199)">
clean:</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console'">
</p>
<p style="margin:0px;font-size:13px;line-height:normal;font-family:'Lucida Console';color:rgb(195,55,32)">
    <span style="color:#34bbc7">$(RM)</span> default.* main.instr main.optimized main.regular main.regular2</p>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div></div></div>
</div>

</blockquote></div><br></div></div>