<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <p><font size="-1">Yes, this is another indication that there is some
        processing or bridging step in the clang -O3 compile that does
        not show up when compiling with clang to its IR before the
        optimization passes.</font></p>

    <p><font size="-1">This may be explained on a documentation page we
        have not yet found. Or it may be a point that those who know the
        code well have overlooked for the moment.</font></p>

    <p><font size="-1">An issue noted here but not yet well addressed is
        that the stated design of LLVM, with its front ends and back
        ends, has each front end compile to an unoptimized IR, which
        LLVM then optimizes and prepares for the various back ends. With
        clang -O3, given this evidence, it is not easy to see where the
        division between the clang front end and LLVM lies, though the
        assumed design suggests it should be.</font></p>

    <p><font size="-1">We should be able to compile with clang to the IR

        before optimization and then apply the LLVM optimization

        separately to obtain the same final IR as a clang -O3 compile

        doing both of those. But we are not seeing that.</font></p>

    <p><font size="-1">This also bears on the e2e thread. The assumed
        division posits that separate clang and LLVM debug sequences can
        provide high reliability, since the IR passed between the two is
        not expected to be error prone. Errors are expected to lie
        primarily either in clang, in producing a correct IR, or in opt
        (LLVM), in optimizing that IR for the back end. But since we
        cannot identify the IR passed between the two under clang -O3,
        it is an open question which debug sequence would cover what we
        could not identify.</font></p>

    <p><font size="-1">Neil Nelson<br>

      </font></p>

    <div class="moz-cite-prefix"><font size="-1">On 10/24/19 5:04 AM,

        hameeza ahmed wrote:</font><br>

    </div>

    <blockquote type="cite"

cite="mid:CAFMPKeYmRvNpJsmhq9v83PSsMJ8Eg_bd7BC0RatNj6G=xkJZCw@mail.gmail.com">


      <div dir="ltr">

        <div><font size="-1">I ran matrix multiplication code with both
            approaches, -O3 at clang and -O3 at opt. The clang -O3 build
            is about 2.97x faster than the opt -O3 build.</font></div>


      </div>

      <font size="-1"><br>

      </font>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr"><font size="-1">On Mon, Oct

            21, 2019 at 8:24 AM Neil Nelson <<a

              href="mailto:nnelson@infowest.com" moz-do-not-send="true">nnelson@infowest.com</a>>

            wrote:</font><br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

          <div bgcolor="#FFFFFF">

            <pre><code>// is_sorted.cpp
bool is_sorted(int *a, int n) {
  for (int i = 0; i < n - 1; i++)
    if (a[i] > a[i + 1])
      return false;
  return true;
}</code></pre>

            <pre><a href="https://blog.regehr.org/archives/1605" target="_blank" moz-do-not-send="true">https://blog.regehr.org/archives/1605</a> How Clang Compiles a Function

<a href="https://blog.regehr.org/archives/1603" target="_blank" moz-do-not-send="true">https://blog.regehr.org/archives/1603</a> How LLVM Optimizes a Function

clang version 10.0.0, Xubuntu 19.04

clang is_sorted.cpp -S -emit-llvm -o is_sorted_.ll

clang is_sorted.cpp -O0 -S -emit-llvm -o is_sorted_O0.ll

clang is_sorted.cpp -O0 -Xclang -disable-llvm-passes -S -emit-llvm -o is_sorted_disable.ll

No difference in the prior three ll files.

clang is_sorted.cpp -O1 -S -emit-llvm -o is_sorted_O1.ll

Many differences between is_sorted_O1.ll and is_sorted_.ll.

opt -O3 -S is_sorted_.ll -o is_sorted_optO3.ll

clang is_sorted.cpp -mllvm -debug-pass=Arguments -O3 -S -emit-llvm -o is_sorted_O3arg.ll

opt &lt;optimization sequence obtained in prior step&gt; -S is_sorted_.ll -o is_sorted_opt_parms.ll

No difference between is_sorted_optO3.ll and is_sorted_opt_parms.ll, the last two opt runs.

Many differences between is_sorted_O3arg.ll and is_sorted_opt_parms.ll, that is, between

the clang -O3 run and the last opt run.
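
A minimal sketch of how the "no difference" / "many differences" checks above could be
made repeatable, assuming Python 3 is available. The helper names normalize_ir and
ir_diff are made up for illustration and are not part of any LLVM tool. The sketch drops
full-line comments (the "; ModuleID" comment names the input file, so it differs between
a clang run and an opt run even when the code is identical) and the source_filename
record before diffing.

```python
import difflib
import sys


def normalize_ir(text):
    """Return the IR lines that matter for comparison.

    Skips blank lines, full-line comments (which start with ';'), and the
    source_filename record, since those can differ between runs without
    any change to the generated code.
    """
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(";"):
            continue
        if stripped.startswith("source_filename"):
            continue
        kept.append(line.rstrip())
    return kept


def ir_diff(a_text, b_text):
    """Return unified-diff lines between two IR listings after normalizing."""
    return list(
        difflib.unified_diff(
            normalize_ir(a_text),
            normalize_ir(b_text),
            fromfile="a.ll",
            tofile="b.ll",
            lineterm="",
        )
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (script name is hypothetical): python3 ir_diff.py first.ll second.ll
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        for out_line in ir_diff(fa.read(), fb.read()):
            print(out_line)
```

An empty result means the two listings agree once the run-specific lines are ignored.
Trailing comments on code lines are left in place, so this is a rough filter only.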

Conclusions:

Given my current understanding, the ll files from the first three clang runs

are before any optimizations. Those ll files are from the front-end phase (CFE).

But this is a simple program, and for a more complex program the ll files

could be different.

Whether we run opt with -O3 or with the pass arguments that clang reports for

-O3, we obtain the same result.

The difference in question is why an opt run using the CFE ll before optimization

obtains a different ll than a CFE run that includes optimization. That is, for this case,

it is not the expansion of the -O3 parameters that is the difference.

Initially, it would be interesting to have an ll listing before optimization from the

clang run that includes optimization to compare with the ll from the clang run without

optimization.

Neil Nelson

On 10/19/19 11:48 AM, Mehdi AMINI via llvm-dev wrote:

</pre>

            <blockquote type="cite">

              <div dir="ltr">

                <div dir="ltr">

                  <div dir="ltr"><br>

                  </div>

                  <br>

                  <div class="gmail_quote">

                    <div dir="ltr" class="gmail_attr">On Thu, Oct 17,

                      2019 at 11:22 AM David Greene via llvm-dev <<a

                        href="mailto:llvm-dev@lists.llvm.org"

                        target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>

                      wrote:<br>

                    </div>

                    <blockquote class="gmail_quote" style="margin:0px

                      0px 0px 0.8ex;border-left:1px solid

                      rgb(204,204,204);padding-left:1ex">hameeza ahmed

                      via llvm-dev <<a

                        href="mailto:llvm-dev@lists.llvm.org"

                        target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>

                      writes:<br>

                      <br>

                      > Hello,<br>

                      > I want to study the individual O3
                      optimizations. For this I am using the<br>
                      > following commands, but I am unable to replicate
                      the O3 behavior.<br>

                      ><br>

                      > 1.

                      Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang

                      -O1<br>

                      > -Xclang -disable-llvm-passes -emit-llvm -S

                      vecsum.c -o vecsum-noopt.ll<br>

                      ><br>

                      > 2.

                      Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang

                      -O3<br>

                      > -mllvm -debug-pass=Arguments -emit-llvm -S

                      vecsum.c<br>

                      ><br>

                      > 3.

                      Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/opt<br>

                      > &lt;optimization sequence obtained in step
                      2&gt; -S vecsum-noopt.ll -S -o<br>

                      > o3-chk.ll<br>

                      ><br>

                      > Why is the IR obtained by the above steps, i.e.
                      the individual O3 sequence, not the same<br>
                      > as when O3 is passed?<br>
                      ><br>
                      > Where am I making a mistake?<br>

                    </blockquote>

                    <div><br>

                    </div>

                    <div>If you could provide the full reproducer, it

                      could help to debug this.</div>

                    <div> <br>

                    </div>

                    <blockquote class="gmail_quote" style="margin:0px

                      0px 0px 0.8ex;border-left:1px solid

                      rgb(204,204,204);padding-left:1ex"><br>

                      I think you need to turn off LLVM optimizations

                      when doing the<br>

                      -emit-llvm dump.  Something like this:<br>

                      <br>

Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang -O3 \<br>

                        -mllvm -debug-pass=Arguments -Xclang

                      -disable-llvm-optzns -emit-llvm \<br>

                        -S vecsum.c<br>

                      <br>

                      Otherwise you are effectively running the O3

                      pipeline twice, as clang<br>

                      will emit LLVM IR after optimization, not before

                      (this confused me too<br>

                      when I first tried it).<br>

                    </blockquote>

                    <div><br>

                    </div>

                    <div>This is the common pitfall indeed!</div>

                    <div>I think they are doing it correctly in step 1

                      though by including: `-Xclang

                      -disable-llvm-passes`.</div>

                    <div><br>

                    </div>

                    <div><br>

                    </div>

                    <blockquote class="gmail_quote" style="margin:0px

                      0px 0px 0.8ex;border-left:1px solid

                      rgb(204,204,204);padding-left:1ex">That said, I'm

                      not sure you will get the same IR out of opt as

                      with<br>

                      clang -O3 even with the above.  For example, clang

                      sets<br>

                      TargetTransformInfo for the pass pipeline and the

                      detailed information<br>

                      it uses may or may not be transmitted via the IR

                      it dumps out.  I have<br>

                      not personally tried to do this kind of thing in a

                      while.</blockquote>

                    <div><br>

                    </div>

                    <div>I struggled as well to set up TTI and TLI the
                      same way clang does :(</div>

                    <div>It'd be nice to revisit our PassManagerBuilder

                      setup and the opt integration to provide

                      reproducibility (maybe could be a starter project

                      for someone?).</div>

                    <div><br>

                    </div>

                    <div>-- </div>

                    <div>Mehdi</div>

                    <div><br>

                    </div>

                  </div>

                </div>

              </div>

              <br>


              <pre>_______________________________________________

LLVM Developers mailing list

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>

<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>

</pre>

            </blockquote>

          </div>

        </blockquote>

      </div>

    </blockquote>

  </body>

</html>