[llvm-dev] Effectiveness of llvm optimisation passes

Tue Sep 26 00:04:40 PDT 2017

I feel I am still doing something wrong, as the performance does not seem 
to change with the different passes I use.

The command lines I am using are:

* clang -O0 -Xclang -disable-O0-optnone -S -mllvm -disable-llvm-optzns 
-emit-llvm -momit-leaf-frame-pointer a.c -o a.ll
* opt -(PASS_FLAG) a.ll -o a.bc
* llc a.bc -filetype=obj -o a.o

I tried PASS_FLAG as all the passes from O1, as a single pass from O1, 
and as '-O1' or '-O0' directly. In every case the performance variation 
seems to be noise only (+/- 1%).
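
For reference, the per-pass runs can be generated mechanically. The 
following is only a minimal sketch of the 'add-one-in' loop: the helper 
name add_one_in_commands and the output file names are made up for 
illustration, and the '-<pass>' spelling is the one from the opt command 
above (newer LLVM releases expect '-passes=<pass>' instead). It only 
builds the command lines and does not run them.

```python
import shlex


def add_one_in_commands(passes, ir_in="a.ll"):
    """Build one (opt, llc) command pair per pass, enabling only that pass.

    The helper name and the a.<pass>.bc / a.<pass>.o file names are
    hypothetical; the flags mirror the opt and llc lines in the post.
    """
    runs = []
    for p in passes:
        bc = f"a.{p}.bc"
        obj = f"a.{p}.o"
        # Legacy pass syntax, as used with opt from LLVM 5.0 in this thread.
        opt_cmd = ["opt", f"-{p}", ir_in, "-o", bc]
        llc_cmd = ["llc", bc, "-filetype=obj", "-o", obj]
        runs.append((opt_cmd, llc_cmd))
    return runs


if __name__ == "__main__":
    # Pass names taken from the ones mentioned in the thread.
    for opt_cmd, llc_cmd in add_one_in_commands(
        ["mem2reg", "sroa", "instcombine", "licm"]
    ):
        print(shlex.join(opt_cmd))
        print(shlex.join(llc_cmd))
```

Keeping a separate output file per pass makes it possible to benchmark 
all the objects afterwards without re-running opt.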

Also, clang warns me about unused arguments for '-Xclang 
-disable-O0-optnone', though the result is different from not using the 
argument. I am using clang-5.0.
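
One way to check whether the flag took effect is to look for 'optnone' 
in the emitted IR. Below is a rough sketch that scans the text of a .ll 
file; the function name functions_marked_optnone is hypothetical, and 
the regular expressions assume the usual textual layout ('define' lines 
that reference 'attributes #N = { ... }' groups), so treat it as a quick 
diagnostic rather than a real IR parser.

```python
import re


def functions_marked_optnone(ir_text):
    """Return names of defined functions that carry the optnone attribute.

    Hypothetical helper: handles optnone written directly on the define
    line or reached through an attribute group such as '#0'.
    """
    # Collect attribute groups, e.g. attributes #0 = { noinline optnone }
    groups = {}
    for m in re.finditer(
        r"^attributes #(\d+) = \{([^}]*)\}", ir_text, re.MULTILINE
    ):
        groups[m.group(1)] = m.group(2).split()

    marked = []
    for line in ir_text.splitlines():
        if not line.startswith("define "):
            continue
        name = re.search(r"@([-\w.$]+)", line)
        if name is None:
            continue  # quoted or unusual names are skipped in this sketch
        inline = "optnone" in line.split()
        via_group = any(
            "optnone" in groups.get(n, [])
            for n in re.findall(r"#(\d+)", line)
        )
        if inline or via_group:
            marked.append(name.group(1))
    return marked


if __name__ == "__main__":
    # Assumes a.ll is the file written by the clang command above.
    with open("a.ll") as f:
        print(functions_marked_optnone(f.read()))
```

If the list is non-empty, opt will skip those functions for most passes, 
which would be consistent with seeing no difference between runs.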

Any help would be appreciated.

Thanks,
Yi

On 22/9/17 17:10, Craig Topper wrote:
> Having -O0 on your clang command line causes all functions to get marked 
> with an 'optnone' attribute that prevents opt from being able to 
> optimize them later. You should also add "-Xclang -disable-O0-optnone" 
> to your command line.
>
> ~Craig
>
> On Thu, Sep 21, 2017 at 10:04 PM, Yi Lin via llvm-dev 
> <llvm-dev at lists.llvm.org> wrote:
>
>     Hi all,
>
>     I am trying to understand the effectiveness of various llvm
>     optimisations when a language targets llvm (or C) as its backend.
>
>     The following is my approach (please correct me if I did anything
>     wrong):
>
>     I am trying to explicitly control the optimisation passes in
>     llvm. I disable optimisation in clang, emit unoptimised llvm IR
>     instead, and use opt to optimise that. This is what I do:
>
>     * clang -O0 -S -mllvm -disable-llvm-optzns -emit-llvm
>     -momit-leaf-frame-pointer a.c -o a.ll
>     * opt -(PASSES) a.ll -o a.bc
>     * llc a.bc -filetype=obj -o a.o
>
>     To evaluate the effectiveness of optimisation passes, I started
>     with an 'add-one-in' approach. The baseline is no optimisation
>     passes, and I iterate through all the O1 passes and explicitly
>     enable one pass for each run. I didn't try to understand those
>     passes, so it is a black box test. This will show how effective
>     each single optimisation is (ignoring interactions between
>     passes). This can be iterative, e.g. identify the most effective
>     pass, always enable it, and then 'add-one-in' for the remaining
>     passes. I also plan to take a 'leave-one-out' approach, in which
>     the baseline is all optimisations enabled, and one pass is
>     disabled at a time.
>
>     Here is the result for the 'add-one-in' approach on some micro
>     benchmarks:
>
>     https://drive.google.com/drive/folders/0B9EKhGby1cv9YktaS3NxUVg2Zk0
>
>     The result seems a bit surprising. A few passes, such as licm,
>     sroa, instcombine and mem2reg, each seem to deliver performance
>     very close to O1 (which includes all the passes). Figure 7 is an
>     example. If my methodology is correct, then my guess is that those
>     optimisations may require some common internal passes, which
>     actually deliver most of the improvement. I am wondering if this
>     is true.
>
>     Any suggestions or critiques are welcome.
>
>     Thanks,
>     Yi