[llvm-dev] Replicate Individual O3 optimizations

Thu Oct 24 04:04:11 PDT 2019

I ran matrix multiplication code with both approaches, -O3 in clang and
-O3 in opt. The clang -O3 result is about 2.97x faster than the opt -O3 result.
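
For reference, the two builds being compared can be sketched roughly as follows. This is a minimal dry-run sketch, not the exact commands behind the measurement above: the file name `matmul.c` and the helper names `run`, `build_clang_o3`, and `build_opt_o3` are hypothetical, and the `llc -O3` step is an assumption about how the opt output is turned into a binary. With `DRY_RUN=1` (the default) the commands are only printed.

```shell
#!/bin/sh
# Print each command instead of running it unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Path A: let clang run the whole -O3 pipeline itself.
build_clang_o3() {
  src=$1
  run clang -O3 "$src" -o "${src%.c}_clang_o3"
}

# Path B: emit unoptimized IR, optimize it with opt, then compile and link.
build_opt_o3() {
  src=$1
  base=${src%.c}
  run clang -O3 -Xclang -disable-llvm-passes -emit-llvm -S "$src" -o "$base-noopt.ll"
  run opt -O3 -S "$base-noopt.ll" -o "$base-opt.ll"
  run llc -O3 -filetype=obj "$base-opt.ll" -o "$base-opt.o"
  run clang "$base-opt.o" -o "${base}_opt_o3"
}
```

Calling `build_clang_o3 matmul.c` and `build_opt_o3 matmul.c` prints the two command sequences; setting `DRY_RUN=0` would execute them, after which both binaries can be timed on the same input.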

On Mon, Oct 21, 2019 at 8:24 AM Neil Nelson <nnelson at infowest.com> wrote:

> is_sorted.cpp
> bool is_sorted(int *a, int n) {
>
>   for (int i = 0; i < n - 1; i++)
>
>     if (a[i] > a[i + 1])
>       return false;
>   return true;
> }
>
> https://blog.regehr.org/archives/1605 How Clang Compiles a Function
> https://blog.regehr.org/archives/1603 How LLVM Optimizes a Function
> clang version 10.0.0, Xubuntu 19.04
>
> clang is_sorted.cpp -S -emit-llvm -o is_sorted_.ll
> clang is_sorted.cpp -O0 -S -emit-llvm -o is_sorted_O0.ll
> clang is_sorted.cpp -O0 -Xclang -disable-llvm-passes -S -emit-llvm -o is_sorted_disable.ll
>
> No difference in the prior three ll files.
>
> clang is_sorted.cpp -O1 -S -emit-llvm -o is_sorted_O1.ll
>
> Many differences between is_sorted_O1.ll and is_sorted_.ll.
>
> opt -O3 -S is_sorted_.ll -o is_sorted_optO3.ll
>
> clang is_sorted.cpp -mllvm -debug-pass=Arguments -O3 -S -emit-llvm -o is_sorted_O3arg.ll
> opt <optimization sequence obtained in prior step> -S is_sorted_.ll -o is_sorted_opt_parms.ll
>
> No difference between is_sorted_optO3.ll and is_sorted_opt_parms.ll, the last two opt runs.
> Many differences between is_sorted_O3arg.ll and is_sorted_opt_parms.ll, the last two runs,
> clang and opt.
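
The step of feeding clang's pass list to opt can be sketched as below. This assumes that `-debug-pass=Arguments` prints one or more lines starting with `Pass Arguments:` to stderr, as the legacy pass manager does; the helper name `extract_pass_args` and the log file name `passes.log` are hypothetical.

```shell
#!/bin/sh
# Collect the flags from every "Pass Arguments:" line of a captured log
# and join them into a single line suitable for passing to opt.
extract_pass_args() {
  sed -n 's/^Pass Arguments:[[:space:]]*//p' "$1" | tr '\n' ' ' | sed 's/[[:space:]]*$//'
}

# Intended use (not run here):
#   clang is_sorted.cpp -O3 -mllvm -debug-pass=Arguments -S -emit-llvm \
#     -o is_sorted_O3arg.ll 2> passes.log
#   opt $(extract_pass_args passes.log) -S is_sorted_.ll -o is_sorted_opt_parms.ll
```

Joining several `Pass Arguments:` lines into one flat list drops the boundaries between the pass managers that printed them, so the resulting opt invocation may only approximate what clang actually ran.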
>
> Conclusions:
>
> Given my current understanding, the ll files from the first three clang runs
> are before any optimizations. Those ll files are from the front-end phase (CFE).
> But this is a simple program, and for a more complex program
> the ll files could be different.
>
> Whether we pass -O3 to opt or pass the parameters that clang reports for an
> -O3 optimization, we obtain the same result.
>
> The difference in question is why an opt run using the CFE ll before optimization
> obtains a different ll than a CFE run that includes optimization. That is, for this case,
> it is not the expansion of the -O3 parameters that is the difference.
>
> Initially, it would be interesting to have an ll listing before optimization from the
> clang run that includes optimization to compare with the ll from the clang run without
> optimization.
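
One way to get that listing is the `-Xclang -disable-llvm-passes` flag used elsewhere in this thread, combined with `-O3`. A possible reason the `-O0` files behave differently under opt, stated here as an assumption and not something verified in this thread, is that clang at `-O0` marks functions `optnone`, which makes most passes skip them. The sketch below counts the lines of an `.ll` file that mention `optnone`; the helper name `count_optnone` and the output file name `is_sorted_O3_noopt.ll` are hypothetical.

```shell
#!/bin/sh
# Print how many lines of an .ll file mention the optnone attribute.
# grep -c exits non-zero when the count is 0, so mask that status.
count_optnone() {
  grep -c 'optnone' "$1" || true
}

# Intended use (not run here):
#   clang is_sorted.cpp -O3 -Xclang -disable-llvm-passes -S -emit-llvm \
#     -o is_sorted_O3_noopt.ll
#   count_optnone is_sorted_.ll           # IR from the plain -O0 run
#   count_optnone is_sorted_O3_noopt.ll   # pre-optimization IR from the -O3 run
```

If the first count is non-zero and the second is zero, the second file is probably the better input for the opt experiments.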
>
> Neil Nelson
>
> On 10/19/19 11:48 AM, Mehdi AMINI via llvm-dev wrote:
>
>
>
> On Thu, Oct 17, 2019 at 11:22 AM David Greene via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> hameeza ahmed via llvm-dev <llvm-dev at lists.llvm.org> writes:
>>
>> > Hello,
>> > I want to study the individual O3 optimizations. For this I am using the
>> > following commands, but I am unable to replicate the O3 behavior.
>> >
>> > 1. Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang -O1
>> > -Xclang -disable-llvm-passes -emit-llvm -S vecsum.c -o vecsum-noopt.ll
>> >
>> > 2. Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang -O3
>> > -mllvm -debug-pass=Arguments -emit-llvm -S vecsum.c
>> >
>> > 3. Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/opt
>> > <optimization sequence obtained in step 2> -S vecsum-noopt.ll -S -o
>> > o3-chk.ll
>> >
>> > Why is the IR obtained from the above steps, i.e. the individual O3
>> > sequence, not the same as when -O3 is passed?
>> >
>> > Where am I making a mistake?
>>
>
> If you could provide the full reproducer, it could help to debug this.
>
>
>>
>> I think you need to turn off LLVM optimizations when doing the
>> -emit-llvm dump.  Something like this:
>>
>> Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang -O3 \
>>   -mllvm -debug-pass=Arguments -Xclang -disable-llvm-optzns -emit-llvm \
>>   -S vecsum.c
>>
>> Otherwise you are effectively running the O3 pipeline twice, as clang
>> will emit LLVM IR after optimization, not before (this confused me too
>> when I first tried it).
>>
>
> This is the common pitfall indeed!
> I think they are doing it correctly in step 1 though by including:
> `-Xclang -disable-llvm-passes`.
>
>
>> That said, I'm not sure you will get the same IR out of opt as with
>> clang -O3 even with the above.  For example, clang sets
>> TargetTransformInfo for the pass pipeline and the detailed information
>> it uses may or may not be transmitted via the IR it dumps out.  I have
>> not personally tried to do this kind of thing in a while.
>
>
> I struggled as well to set up TTI and TLI the same way clang does :(
> It'd be nice to revisit our PassManagerBuilder setup and the opt
> integration to provide reproducibility (maybe could be a starter project
> for someone?).
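
As a rough way to see which target details an IR dump carries, one can look at the module-level `target triple` and `target datalayout` lines and at the per-function `"target-cpu"` and `"target-features"` attributes. This only shows what is recorded in the file; it does not show how clang configures TTI and TLI internally. The helper name `show_target_info` is hypothetical.

```shell
#!/bin/sh
# Print the lines of an .ll file that record target information:
# the module's triple and data layout, plus any attribute group
# that names a target CPU or target features.
show_target_info() {
  grep -E '^target (triple|datalayout)|"target-(cpu|features)"' "$1" || true
}

# Intended use (not run here):
#   show_target_info vecsum-noopt.ll
```

Running it on the IR given to opt and on the IR produced by clang -O3 gives a quick check of whether both files describe the same target.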
>
> --
> Mehdi
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>