[llvm-dev] Effectiveness of llvm optimisation passes

Thu Sep 21 22:21:11 PDT 2017

Thank you very much. That explains the results.

I am running the benchmarks again with '-Xclang -disable-O0-optnone'.
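
For reference, the revised setup can be sketched as a small dry-run script that only builds the command lines and does not invoke any tool. It is a sketch under assumptions, not the exact harness used for the benchmarks: the helper names (clang_cmd, opt_cmd, add_one_in, leave_one_out) are hypothetical, the pass list is just the four passes named in the original post rather than the full O1 set, and the 'opt -<pass>' spelling follows the commands in this thread (newer LLVM releases select passes with -passes= instead).

```python
# Illustrative subset only: the four passes named in the original post.
PASSES = ["mem2reg", "sroa", "instcombine", "licm"]


def clang_cmd(src="a.c", out="a.ll"):
    """Build the clang command that emits unoptimised IR without 'optnone'."""
    return ["clang", "-O0", "-Xclang", "-disable-O0-optnone", "-S",
            "-mllvm", "-disable-llvm-optzns", "-emit-llvm",
            "-momit-leaf-frame-pointer", src, "-o", out]


def opt_cmd(passes, src="a.ll", out="a.bc"):
    """Build an opt command using the 'opt -<pass>' spelling from the thread."""
    return ["opt"] + ["-" + p for p in passes] + [src, "-o", out]


def add_one_in(passes, always_on=()):
    """One configuration per candidate pass: the always-on set plus that pass."""
    return [list(always_on) + [p] for p in passes if p not in always_on]


def leave_one_out(passes):
    """One configuration per pass: every pass except that one."""
    return [[q for q in passes if q != p] for p in passes]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    print(" ".join(clang_cmd()))
    for config in add_one_in(PASSES):
        print(" ".join(opt_cmd(config)))
```

Passing always_on=["mem2reg"] to add_one_in gives the iterative variant, where the most effective pass so far stays enabled while the remaining passes are added one at a time.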

Thanks,
Yi

On 22/9/17 15:10, Craig Topper wrote:
> Having -O0 on your clang command line causes all functions to be marked
> with an 'optnone' attribute, which prevents opt from optimizing them
> later. You should also add "-Xclang -disable-O0-optnone" to your
> command line.
>
> ~Craig
>
> On Thu, Sep 21, 2017 at 10:04 PM, Yi Lin via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>     Hi all,
>
>     I am trying to understand the effectiveness of various llvm
>     optimisations when a language targets llvm (or C) as its backend.
>
>     The following is my approach (please correct me if I did anything
>     wrong):
>
>     I am trying to explicitly control the optimisation passes in
>     llvm. I disable optimisation in clang, emit unoptimized llvm IR
>     instead, and use opt to optimise that. This is what I do:
>
>     * clang -O0 -S -mllvm -disable-llvm-optzns -emit-llvm
>     -momit-leaf-frame-pointer a.c -o a.ll
>     * opt -(PASSES) a.ll -o a.bc
>     * llc a.bc -filetype=obj -o a.o
>
>     To evaluate the effectiveness of optimisation passes, I started
>     with an 'add-one-in' approach. The baseline is no optimisation
>     passes, and I iterate through all the O1 passes, explicitly
>     enabling one pass for each run. I didn't try to understand those
>     passes, so it is a black box test. This will show how effective
>     each single optimisation is (ignoring interactions between
>     passes). This can be iterative, e.g. identify the most effective
>     pass, always enable it, and then 'add-one-in' for the remaining
>     passes. I also plan to take a 'leave-one-out' approach, in which
>     the baseline is all optimisations enabled, and one pass is
>     disabled at a time.
>
>     Here is the result for the 'add-one-in' approach on some micro
>     benchmarks:
>
>     https://drive.google.com/drive/folders/0B9EKhGby1cv9YktaS3NxUVg2Zk0
>
>     The result seems a bit surprising. A few passes, such as licm,
>     sroa, instcombine and mem2reg, each seem to deliver performance
>     very close to O1 (which includes all the passes). Figure 7 is an
>     example. If my methodology is correct, then my guess is that those
>     optimisations may require some common internal passes, which
>     actually deliver most of the improvements. I am wondering if this
>     is true.
>
>     Any suggestions or critiques are welcome.
>
>     Thanks,
>     Yi