[llvm-dev] clang asm-goto support (Was Re: [PATCH v2] x86/retpoline: Add clang support)

Ingo Molnar via llvm-dev llvm-dev at lists.llvm.org
Wed Feb 14 15:07:18 PST 2018


* Ingo Molnar <mingo at kernel.org> wrote:

> To quantify it: I just performed a test build of a Linux distro kernel config 
> (Fedora x86-64), and counted the number of callsites that use 'asm goto' 
> functionality with the v4.15 kernel (including drivers).
> 
> The results:
> 
>                                                 Linux distro | !CONFIG_TRACING
>  -----------------------------------------------------------------------------
>  total # of functions                         :      191,567 |         184,443
>  total # of instructions                      :   14,251,355 |      13,526,112
>  -----------------------------------------------------------------------------
>  total # of spin_lock*() calls                :       25,246 |          25,177
>  total # of mutex_lock*() calls               :       13,062 |          12,861
>  total # of kmalloc*() calls                  :        5,148 |           5,118
>  -----------------------------------------------------------------------------
>  total # of 'asm goto' usage sites            :       34,851 |          31,059
>  total # of 'asm goto' using functions        :       18,209 |          16,089
>  -----------------------------------------------------------------------------
>  percent of kernel functions using 'asm goto' :         9.5% |            8.7%
>  -----------------------------------------------------------------------------

Here's the size stats of kernel/sched/built-in.o for the same distro config:

                                                     optimized |     no asm goto
   -----------------------------------------------------------------------------
   total # of functions                         :          765 |            764
   total # of instructions                      :       46,830 |         47,051

I.e. asm goto support reduces scheduler size by ~0.5% (221 fewer instructions), 
which is a significant reduction in generated code size.

This doesn't count the performance advantages of live branch patching. Many of 
those asm goto usage sites are in hot paths, so the performance impact is much 
larger than the size difference suggests: easily a couple of percentage points 
in scheduler-intensive benchmarks, as Peter mentioned.
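
For readers unfamiliar with the construct, here is a minimal userspace sketch of 
the GCC 'asm goto' extension, which lets an inline asm statement jump to a C 
label. It is not kernel code: the function names are hypothetical, the kernel's 
jump table bookkeeping and runtime patching are left out, and the non-x86 branch 
is only a fallback so the sketch still compiles elsewhere. The two functions 
show the two states a patched branch site can be in.

```c
#include <stdbool.h>

/* Hypothetical illustration: a branch site in its "disabled" state.
 * The asm body is a single NOP, so execution falls through. The label
 * is still listed so the compiler knows the asm may jump there. */
bool branch_site_disabled(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ goto("nop" : : : : taken);
    return false;   /* fall-through path */
taken:
    return true;    /* only reached if the asm jumps */
#else
    return false;   /* fallback for other architectures (assumption) */
#endif
}

/* Hypothetical illustration: the same site in its "enabled" state.
 * The asm body is an unconditional jump to the C label. */
bool branch_site_enabled(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ goto("jmp %l[taken]" : : : : taken);
    return false;   /* skipped by the jump */
taken:
    return true;
#else
    return true;    /* fallback for other architectures (assumption) */
#endif
}
```

Roughly speaking, live branch patching switches a site between these two forms 
at runtime, so the hot path contains a NOP or a direct jump instead of a load, 
compare and conditional branch.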

For example here's a thread context switch benchmark comparison on a modern x86 
system running a v4.15 kernel:

  $ perf stat --repeat 20 --sync --null perf bench sched messaging -t -g 25

         no asm goto:     0.136778505 seconds time elapsed      ( stddev: +- 0.55% )
  asm goto optimized:     0.133773904 seconds time elapsed      ( stddev: +- 0.51% )

The asm goto enabled kernel is ~2.25% faster in this benchmark, and the 
performance penalty of not having asm goto support will only increase in the 
future.
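
The two percentages quoted above can be rechecked with a short sketch. The 
helper names are hypothetical. The size figure is taken relative to the 
no-asm-goto build, and the speedup as the ratio of the two elapsed times; these 
denominators are assumptions that happen to reproduce the quoted numbers.

```c
/* Percent by which 'after' is smaller than 'before'. */
double percent_reduction(double before, double after)
{
    return (before - after) / before * 100.0;
}

/* Percent by which the fast run outpaces the slow run (time ratio). */
double percent_faster(double slow_seconds, double fast_seconds)
{
    return (slow_seconds / fast_seconds - 1.0) * 100.0;
}

/* Scheduler instruction counts from the table: no asm goto vs optimized. */
double sched_size_reduction(void)
{
    return percent_reduction(47051.0, 46830.0);       /* roughly 0.5 */
}

/* perf bench elapsed times: no asm goto vs asm goto optimized. */
double sched_bench_speedup(void)
{
    return percent_faster(0.136778505, 0.133773904);  /* roughly 2.25 */
}
```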

I.e. it very much makes sense to implement asm goto support not just for 
compatibility reasons, but for performance reasons as well.

Thanks,

	Ingo

