[PATCH] D53736: [BTF] Add BTF DebugInfo

Yonghong Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 25 15:45:41 PDT 2018


yonghong-song created this revision.
yonghong-song added reviewers: aprantl, dblaikie, echristo, ast.
yonghong-song added a project: debug-info.
Herald added subscribers: llvm-commits, JDevlieghere, mgorny.

This patch adds BPF Type Format (BTF) as a standalone
LLVM debuginfo. The BTF-related sections are generated
directly from IR. The BTF debuginfo is generated
only when the compilation target is BPF.

What is BTF?
============

First, BPF is a Linux kernel virtual machine that is
widely used for tracing, networking, and security.

  https://www.kernel.org/doc/Documentation/networking/filter.txt
  https://cilium.readthedocs.io/en/v1.2/bpf/

BTF is the debug info format for BPF. It was introduced in the
Linux patch below

  https://github.com/torvalds/linux/commit/69b693f0aefa0ed521e8bd02260523b5ae446ad7#diff-06fb1c8825f653d7e539058b72c83332

which is part of the patch set described in the LWN article below.

  https://lwn.net/Articles/752047/

The BTF format is specified in the GitHub commit above.
In summary, its layout looks like

  struct btf_header
  type subsection (a list of types)
  string subsection (a list of strings)
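
As a rough illustration of this layout, the sketch below checks a
header and looks up one string in the string subsection. The field
list is an assumption based on the kernel's struct btf_header in
include/uapi/linux/btf.h and may differ from the revision in the
commit above. The helper name btf_str_at is hypothetical, and the
blob is assumed to be in the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BTF_MAGIC 0xeB9F

/* Assumed field list; offsets are in bytes, relative to the end of
 * the header. */
struct btf_header {
    uint16_t magic;
    uint8_t  version;
    uint8_t  flags;
    uint32_t hdr_len;
    uint32_t type_off;  /* offset of the type subsection */
    uint32_t type_len;  /* length of the type subsection */
    uint32_t str_off;   /* offset of the string subsection */
    uint32_t str_len;   /* length of the string subsection */
};

/* Hypothetical helper: return the NUL-terminated string at byte offset
 * `off` of the string subsection, or NULL if the blob is malformed. */
const char *btf_str_at(const uint8_t *blob, size_t size, uint32_t off)
{
    struct btf_header hdr;

    assert(blob != NULL);
    if (size < sizeof(hdr))
        return NULL;
    memcpy(&hdr, blob, sizeof(hdr));    /* avoid unaligned reads */
    if (hdr.magic != BTF_MAGIC || hdr.hdr_len < sizeof(hdr) ||
        hdr.hdr_len > size)
        return NULL;

    size_t body = size - hdr.hdr_len;   /* bytes after the header */
    if (hdr.str_off > body || hdr.str_len > body - hdr.str_off)
        return NULL;
    if (off >= hdr.str_len)
        return NULL;

    const char *strs = (const char *)blob + hdr.hdr_len + hdr.str_off;
    if (memchr(strs + off, '\0', hdr.str_len - off) == NULL)
        return NULL;                    /* not terminated in the section */
    return strs + off;
}
```

Types refer to their names by such string offsets, so a consumer
would call btf_str_at() with the name offset stored in a type entry.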

With such information, the kernel and user space are able to
pretty-print a particular bpf map key/value. One possible example below:

  Without BTF:
    key: [ 0x01, 0x01, 0x00, 0x00 ]
  With BTF:
    key: struct t { a : 1; b : 1; c : 0 }
  where the struct is defined as
    struct t { char a; char b; short c; };
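
To make the example concrete, here is a minimal sketch of the decoding
step. It hard-codes struct t instead of walking the BTF type
subsection, which is what a real BTF-aware tool would do, and the
function name format_key is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct t { char a; char b; short c; };

/* Hypothetical formatter: render raw key bytes as a `struct t`.
 * Returns the snprintf() result, or -1 if the key size is wrong. */
int format_key(const unsigned char *key, size_t key_size,
               char *out, size_t out_size)
{
    struct t v;

    assert(key != NULL && out != NULL);
    if (key_size != sizeof(v))
        return -1;
    memcpy(&v, key, sizeof(v));         /* reinterpret the raw bytes */
    return snprintf(out, out_size,
                    "struct t { a : %d; b : %d; c : %d }",
                    v.a, v.b, v.c);
}
```

Without type information only the four raw bytes are known. The member
names and sizes used here are the kind of information BTF carries.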

How is BTF generated?
=====================

Currently, BTF is generated through pahole

  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=68645f7facc2eb69d0aeb2dd7d2f0cac0feb4d69

and is available in pahole v1.12

  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=4a21c5c8db0fcd2a279d067ecfb731596de822d4

Basically, the bpf program needs to be compiled with -g so that
dwarf sections are generated. pahole has been enhanced so that
a .BTF section can be generated from dwarf. The format
of the .BTF section matches the format expected by
the kernel, so a bpf loader can just take the .BTF section
and load it into the kernel.

  https://github.com/torvalds/linux/commit/8a138aed4a807ceb143882fb23a423d524dcdb35

The .BTF section layout is also specified in this patch,
in the file include/llvm/BinaryFormat/BTF.h.

What use cases does this patch try to address?
==============================================

Currently, only the bpf instruction stream needs to be
passed to the kernel. The kernel verifies it, jits it if configured
to do so, attaches it to a particular kernel attachment point,
and later executes it when a particular event happens.

This patch tries to expand BTF to support the two additional use cases below:

  (1). BPF supports subroutine calls.
       During performance analysis, it would be good to
       identify which call is hot instead of just
       providing a virtual address. This would require
       passing a unique identifier for each subroutine to
       the kernel; the subroutine name is a natural choice.
  (2). If a particular jitted instruction is hot, we want
       the user to know which source line this jitted instruction
       belongs to. This would require the source information
       to be available to various profiling tools.

Note that in a single ELF file,

  . there may be multiple loadable bpf programs,
  . for a particular to-be-loaded bpf instruction stream,
    its instructions may come from multiple PROGBITS sections;
    the bpf loader needs to merge them into a single
    consecutive insn stream before loading it into the kernel.

For example:

  section .text: subroutine funcFoo
  section _progA: calling funcFoo
  section _progB: calling funcFoo

The bpf loader could construct two loadable bpf instruction
streams and load them into the kernel:

  . _progA funcFoo
  . _progB funcFoo

So the per-ELF-section function offsets and instruction offsets
will need to be adjusted before being passed to the kernel, and
the kernel essentially expects only one code section regardless
of how many there are in the ELF file.
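
The adjustment can be sketched as follows. This is only an
illustration under assumptions: the struct, the function names and the
section sizes in the usage note are hypothetical, and a real loader
also has to patch call instructions, which is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one ELF code section picked by the loader. */
struct code_sec {
    const char *name;
    uint32_t size;      /* section size in bytes */
    uint32_t base;      /* filled in: start offset in the merged stream */
};

/* Lay the sections out back to back; return the merged stream size. */
uint32_t layout_merged(struct code_sec *secs, size_t n)
{
    uint32_t next = 0;

    assert(secs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        secs[i].base = next;
        next += secs[i].size;
    }
    return next;
}

/* Convert a per-section byte offset into a merged-stream byte offset. */
uint32_t rebase_off(const struct code_sec *sec, uint32_t sec_off)
{
    assert(sec != NULL && sec_off < sec->size);
    return sec->base + sec_off;
}
```

For the "_progA funcFoo" stream, with a 64-byte _progA followed by a
32-byte .text, an instruction at byte offset 8 of .text ends up at
byte offset 72 of the merged stream.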

What do we propose and why?
===========================

To support the above two use cases, we propose to
add an additional section, .BTF.ext, to the ELF file
which is the input of the bpf loader. A separate section
is preferred since the loader may need to manipulate it before
loading part of its data into the kernel.

The .BTF.ext section has a similar header to the .BTF section
and it contains two subsections for func_info and line_info.

  . the func_info maps the func insn byte offset to a func
    type in the .BTF type subsection.
  . the line_info maps the insn byte offset to a line info.
  . both func_info and line_info subsections are organized
    by ELF PROGBITS AX sections.
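
A minimal sketch of how a tool might consume the two subsections for
one code section is shown below. The record shapes and function names
are hypothetical simplifications, not the layouts defined in BTF.h,
and the records are assumed to be sorted by insn_off.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record shapes, simplified for illustration. */
struct func_rec { uint32_t insn_off; uint32_t type_id; };
struct line_rec { uint32_t insn_off; uint32_t line; };

/* Return the func record that starts exactly at insn_off, or NULL. */
const struct func_rec *find_func(const struct func_rec *recs, size_t n,
                                 uint32_t insn_off)
{
    assert(recs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (recs[i].insn_off == insn_off)
            return &recs[i];
    return NULL;
}

/* Return the line record covering insn_off: the last record whose
 * insn_off is <= the query. NULL if the query precedes all records. */
const struct line_rec *find_line(const struct line_rec *recs, size_t n,
                                 uint32_t insn_off)
{
    const struct line_rec *best = NULL;

    assert(recs != NULL || n == 0);
    for (size_t i = 0; i < n && recs[i].insn_off <= insn_off; i++)
        best = &recs[i];
    return best;
}
```

A profiler that sees a hot instruction at some byte offset could use
find_line() to report the source line, and find_func() to resolve the
enclosing subroutine's type entry, and through it the name.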

pahole is not a good place to implement .BTF.ext, as
pahole is mostly for structure hole information and, more
importantly, we want to pass the actual source code to the kernel.

  . bpf programs are typically small, so storage overhead
    should be small.
  . in bpf land, it is entirely possible that
    an application loads the bpf program into the
    kernel and then quits, so
    having the user space application hold the debug info
    is not practical, as you may not even know which
    application loaded the bpf program.
  . having the source code kept directly by the kernel
    would ease deployment, since the original source
    code does not need to be shipped to every host and
    the kernel-devel package does not need to be
    deployed even if kernel headers are used.

LLVM is a good place to implement it.

  . The only reliable time to get the source code is
    at compile time. This will result in both more
    accurate information and easier deployment, as
    stated above.
  . Another consideration is JIT. Projects like bcc
    (https://github.com/iovisor/bcc)
    use MCJIT to compile a C program into bpf insns and
    load them into the kernel. The llvm-generated BTF sections
    will be readily available for such cases as well.

Design and implementation of emitting .BTF/.BTF.ext sections
============================================================

The BTF debuginfo format is defined in
include/llvm/BinaryFormat/BTF.h. Both .BTF and .BTF.ext
sections are generated directly from IR when both
"-target bpf" and "-g" are specified. Note that
dwarf sections are still generated, as dwarf is used
by user space tools like llvm-objdump for the BPF target.

This patch also contains tests to verify the generated
.BTF and .BTF.ext sections for all supported types and for the
func_info and line_info subsections. The patch has also been tested
against the Linux kernel bpf sample tests and selftests.

Signed-off-by: Yonghong Song <yhs at fb.com>


Repository:
  rL LLVM

https://reviews.llvm.org/D53736

Files:
  include/llvm/BinaryFormat/BTF.def
  include/llvm/BinaryFormat/BTF.h
  include/llvm/MC/MCObjectFileInfo.h
  lib/CodeGen/AsmPrinter/AsmPrinter.cpp
  lib/CodeGen/AsmPrinter/BTFDebug.cpp
  lib/CodeGen/AsmPrinter/BTFDebug.h
  lib/CodeGen/AsmPrinter/CMakeLists.txt
  lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp
  lib/CodeGen/AsmPrinter/DebugHandlerBase.h
  lib/MC/MCObjectFileInfo.cpp
  test/DebugInfo/BTF/array-1d-char.ll
  test/DebugInfo/BTF/array-1d-int.ll
  test/DebugInfo/BTF/array-2d-int.ll
  test/DebugInfo/BTF/array-size-0.ll
  test/DebugInfo/BTF/array-typedef.ll
  test/DebugInfo/BTF/binary-format.ll
  test/DebugInfo/BTF/char.ll
  test/DebugInfo/BTF/enum-basic.ll
  test/DebugInfo/BTF/func-func-ptr.ll
  test/DebugInfo/BTF/func-non-void.ll
  test/DebugInfo/BTF/func-source.ll
  test/DebugInfo/BTF/func-typedef.ll
  test/DebugInfo/BTF/func-void.ll
  test/DebugInfo/BTF/fwd-no-define.ll
  test/DebugInfo/BTF/fwd-with-define.ll
  test/DebugInfo/BTF/int.ll
  test/DebugInfo/BTF/lit.local.cfg
  test/DebugInfo/BTF/longlong.ll
  test/DebugInfo/BTF/ptr-const-void.ll
  test/DebugInfo/BTF/ptr-func-1.ll
  test/DebugInfo/BTF/ptr-func-2.ll
  test/DebugInfo/BTF/ptr-func-3.ll
  test/DebugInfo/BTF/ptr-int.ll
  test/DebugInfo/BTF/ptr-void.ll
  test/DebugInfo/BTF/ptr-volatile-const-void.ll
  test/DebugInfo/BTF/ptr-volatile-void.ll
  test/DebugInfo/BTF/restrict-ptr.ll
  test/DebugInfo/BTF/short.ll
  test/DebugInfo/BTF/struct-anon.ll
  test/DebugInfo/BTF/struct-basic.ll
  test/DebugInfo/BTF/struct-bitfield-typedef.ll
  test/DebugInfo/BTF/struct-enum.ll
  test/DebugInfo/BTF/uchar.ll
  test/DebugInfo/BTF/uint.ll
  test/DebugInfo/BTF/ulonglong.ll
  test/DebugInfo/BTF/union-array-typedef.ll
  test/DebugInfo/BTF/ushort.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D53736.171208.patch
Type: text/x-patch
Size: 165139 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181025/1e56e4a5/attachment.bin>

