[llvm-dev] Effectiveness of llvm optimisation passes

Thu Sep 21 22:14:58 PDT 2017

Craig was faster on the optnone flag (if you are using Clang 5 or above).
However, I have observed that some opt passes ignore optnone in some
cases, e.g., -break-crit-edges.
You can use opt's -stats flag to get a list of statistics on what a
particular pass did (if it collects statistics, of course).
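
A rough sketch of how the -stats output could be narrowed down to a single pass. Two assumptions: opt was built with assertions (or statistics) enabled, since -stats is only available in such builds, and each statistic line has the layout "<count> <debug-type> - <description>". The helper name stats_for is made up for this example.

```shell
# Keep only the statistic lines reported under one debug type
# (usually the pass name). Assumed line layout:
#   <count> <debug-type> - <description>
stats_for() {
    awk -v pass="$1" '$2 == pass && $3 == "-"'
}

# Possible usage (not run here); opt writes statistics to stderr:
#   opt -stats -mem2reg a.ll -o a.bc 2>&1 | stats_for mem2reg
```

If the filter prints nothing, either the pass made no changes, it collects no statistics, or the build has statistics disabled.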

On 22.09.2017 07:11, Craig Topper via llvm-dev wrote:
> Having -O0 on your clang command line causes all functions to get marked 
> with an 'optnone' attribute that prevents opt from being able to 
> optimize them later. You should also add "-Xclang -disable-O0-optnone" 
> to your command line.
> 
> ~Craig
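
To check whether the flag took effect, one option is to count the lines of the emitted IR that still mention optnone. This is only a minimal sketch: count_optnone is a hypothetical helper, and a.ll stands for whatever file clang produced.

```shell
# Print the number of lines in a textual IR file that mention optnone.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# helper usable under "set -e".
count_optnone() {
    grep -c 'optnone' "$1" || true
}

# Possible usage (not run here):
#   clang -O0 -Xclang -disable-O0-optnone -S -emit-llvm a.c -o a.ll
#   count_optnone a.ll    # should print 0 if no function is marked optnone
```

A non-zero count suggests the functions are still marked and opt will skip most passes on them.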
> 
> On Thu, Sep 21, 2017 at 10:04 PM, Yi Lin via llvm-dev 
> <llvm-dev at lists.llvm.org> wrote:
> 
>     Hi all,
> 
>     I am trying to understand the effectiveness of various llvm
>     optimisations when a language targets llvm (or C) as its backend.
> 
>     The following is my approach (please correct me if I did anything
>     wrong):
> 
>     I am trying to explicitly control the optimisation passes in llvm.
>     I disable optimisation in clang, emit unoptimized llvm IR instead,
>     and use opt to optimise that. This is what I do:
> 
>     * clang -O0 -S -mllvm -disable-llvm-optzns -emit-llvm
>     -momit-leaf-frame-pointer a.c -o a.ll
>     * opt -(PASSES) a.ll -o a.bc
>     * llc a.bc -filetype=obj -o a.o
> 
>     To evaluate the effectiveness of optimisation passes, I started with
>     an 'add-one-in' approach. The baseline is no optimisation passes,
>     and I iterate through all the O1 passes and explicitly enable one
>     pass for each run. I didn't try to understand those passes, so it is
>     a black-box test. This will show how effective each single
>     optimisation is (ignoring interactions between passes). This can be
>     iterative, e.g. identify the most effective pass, always enable
>     it, and then 'add-one-in' for the remaining passes. I also plan to
>     take a 'leave-one-out' approach, in which the baseline is all
>     optimisations enabled, and one pass is disabled at a time.
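
The two experiment designs described above can be sketched as a pair of command generators. Everything here is illustrative: the pass list is a hypothetical stand-in for the real O1 list, the output file names are invented, and the functions only print opt command lines instead of running them. The flag spelling follows the opt invocation used in this thread; newer LLVM releases expect -passes=... instead.

```shell
# Hypothetical pass list; substitute the passes you actually want to study.
PASSES="mem2reg sroa instcombine licm"

# add-one-in: one opt command per pass, each enabling only that pass.
add_one_in() {
    for p in $PASSES; do
        echo "opt -$p a.ll -o a.$p.bc"
    done
}

# leave-one-out: one opt command per pass, enabling every listed pass
# except the one being left out. Passes keep their order in the list.
leave_one_out() {
    for skip in $PASSES; do
        flags=""
        for p in $PASSES; do
            [ "$p" = "$skip" ] || flags="$flags -$p"
        done
        echo "opt$flags a.ll -o a.no-$skip.bc"
    done
}
```

Printing the commands first makes it easy to inspect them; piping the output of either function to sh would then execute them one by one. Note that the order of the list matters, since opt runs passes in the order given.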
> 
>     Here is the result for the 'add-one-in' approach on some micro
>     benchmarks:
> 
>     https://drive.google.com/drive/folders/0B9EKhGby1cv9YktaS3NxUVg2Zk0
> 
>     The result seems a bit surprising. A few passes, such as licm, sroa,
>     instcombine and mem2reg, each seem to deliver performance very close
>     to O1 (which includes all the passes). Figure 7 is an example. If my
>     methodology is correct, then my guess is those optimisations may
>     require some common internal passes, which actually deliver most of
>     the improvements. I am wondering if this is true.
> 
>     Any suggestions or critiques are welcome.
> 
>     Thanks,
>     Yi
>