[PATCH] D53736: [BTF] Add BTF DebugInfo
Yonghong Song via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 25 15:45:41 PDT 2018
yonghong-song created this revision.
yonghong-song added reviewers: aprantl, dblaikie, echristo, ast.
yonghong-song added a project: debug-info.
Herald added subscribers: llvm-commits, JDevlieghere, mgorny.
This patch adds the BPF Type Format (BTF) as a standalone
LLVM debuginfo. The BTF related sections are generated
directly from IR. The BTF debuginfo is generated
only when the compilation target is BPF.
What is BTF?
============
First, BPF is a Linux kernel virtual machine that is
widely used for tracing, networking and security.
https://www.kernel.org/doc/Documentation/networking/filter.txt
https://cilium.readthedocs.io/en/v1.2/bpf/
BTF is the debug info format for BPF, introduced in the
Linux patch below
https://github.com/torvalds/linux/commit/69b693f0aefa0ed521e8bd02260523b5ae446ad7#diff-06fb1c8825f653d7e539058b72c83332
which is part of the patch set described in the LWN article below.
https://lwn.net/Articles/752047/
The BTF format is specified in the above GitHub commit.
In summary, its layout looks like
struct btf_header
type subsection (a list of types)
string subsection (a list of strings)
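As a rough illustration of this layout, the sketch below declares a header
modeled on the kernel's include/uapi/linux/btf.h as I understand it, and
checks that the type and string subsections fit inside a .BTF blob. The
exact field set is an assumption and should be checked against the BTF.h
added by this patch; btf_header_ok is a hypothetical helper, not part of
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BTF_MAGIC 0xeB9F

/* Assumed header layout; subsection offsets are relative to the end of the header. */
struct btf_header {
    uint16_t magic;
    uint8_t  version;
    uint8_t  flags;
    uint32_t hdr_len;
    uint32_t type_off;  /* start of the type subsection   */
    uint32_t type_len;  /* length of the type subsection  */
    uint32_t str_off;   /* start of the string subsection */
    uint32_t str_len;   /* length of the string subsection */
};

/* Hypothetical helper: returns 1 if both subsections lie inside the blob, else 0. */
static int btf_header_ok(const struct btf_header *hdr, size_t total_size)
{
    assert(hdr != NULL);
    if (hdr->magic != BTF_MAGIC)
        return 0;
    if (hdr->hdr_len < sizeof(*hdr) || hdr->hdr_len > total_size)
        return 0;

    size_t body = total_size - hdr->hdr_len;
    /* 64-bit sums avoid wraparound on hostile offsets. */
    uint64_t type_end = (uint64_t)hdr->type_off + hdr->type_len;
    uint64_t str_end  = (uint64_t)hdr->str_off + hdr->str_len;
    return type_end <= body && str_end <= body;
}
```

A loader-side consumer would run a check of this kind before walking the
type list or resolving string offsets.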
With such information, the kernel and user space are able to
pretty-print a particular bpf map key/value. One possible example below:
Without BTF:
key: [ 0x01, 0x01, 0x00, 0x00 ]
With BTF:
key: struct t { a : 1; b : 1; c : 0}
where struct is defined as
struct t { char a; char b; short c; };
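The pretty-printing example above can be mimicked with a small sketch.
Note the assumption: the struct layout is hard-coded here, whereas a real
BTF consumer would discover the member names and offsets by walking the
type subsection. format_key is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct t { char a; char b; short c; };

/*
 * Hypothetical helper: interpret raw map key bytes as struct t and format
 * them. Returns the snprintf result, or -1 if the key size does not match.
 */
static int format_key(const unsigned char *raw, size_t raw_len,
                      char *out, size_t out_len)
{
    struct t key;

    assert(raw != NULL && out != NULL);
    if (raw_len != sizeof(key))
        return -1;

    /* memcpy avoids alignment and aliasing problems with the raw buffer. */
    memcpy(&key, raw, sizeof(key));
    return snprintf(out, out_len, "key: struct t { a : %d; b : %d; c : %d}",
                    key.a, key.b, key.c);
}
```

Feeding it the four bytes 0x01, 0x01, 0x00, 0x00 yields the "With BTF"
line shown above.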
How is BTF generated?
=====================
Currently, BTF is generated through pahole:
https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=68645f7facc2eb69d0aeb2dd7d2f0cac0feb4d69
This support is available in pahole v1.12:
https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=4a21c5c8db0fcd2a279d067ecfb731596de822d4
Basically, the bpf program needs to be compiled with -g so that
dwarf sections are generated. pahole has been enhanced so that
a .BTF section can be generated from dwarf. The format
of the .BTF section matches the format expected by
the kernel, so a bpf loader can just take the .BTF section
and load it into the kernel.
https://github.com/torvalds/linux/commit/8a138aed4a807ceb143882fb23a423d524dcdb35
The .BTF section layout is also specified in this patch,
in the file include/llvm/BinaryFormat/BTF.h.
What use cases does this patch try to address?
==============================================
Currently, only the bpf instruction stream needs to be
passed to the kernel. The kernel verifies it, jits it if configured
to do so, attaches it to a particular kernel attachment point,
and later executes it when a particular event happens.
This patch tries to expand BTF to support the two additional use cases below:
(1). BPF supports subroutine calls.
During performance analysis, it would be good to
differentiate which call is hot instead of just
providing a virtual address. This would require
passing a unique identifier for each subroutine to
the kernel; the subroutine name is a natural choice.
(2). If a particular jitted instruction is hot, we want
the user to know which source line this jitted instruction
belongs to. This would require the source information
to be available to various profiling tools.
Note that in a single ELF file,
. there may be multiple loadable bpf programs,
. for a particular to-be-loaded bpf instruction stream,
its instructions may come from multiple PROGBITS sections,
and the bpf loader needs to merge them into a single
contiguous insn stream before loading it into the kernel.
For example:
section .text: subroutines funcFoo
section _progA: calling funcFoo
section _progB: calling funcFoo
The bpf loader could construct two loadable bpf instruction
streams and load them into the kernel:
. _progA funcFoo
. _progB funcFoo
So the per-ELF-section function offsets and instruction offsets
need to be adjusted before being passed to the kernel, since
the kernel essentially expects only one code section regardless
of how many there are in the ELF file.
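The adjustment described above can be sketched as follows. This is a
minimal illustration, not the loader's actual code: struct placed_section,
layout_sections and rebase_offset are hypothetical names, and the section
sizes used in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one ELF code section placed into the merged stream. */
struct placed_section {
    const char *name;
    uint32_t size;  /* bytes of bpf instructions in this section          */
    uint32_t base;  /* byte offset of the section inside the merged stream */
};

/* Lay the sections out back to back; returns the merged stream size in bytes. */
static uint32_t layout_sections(struct placed_section *secs, size_t n)
{
    uint32_t next = 0;

    for (size_t i = 0; i < n; i++) {
        secs[i].base = next;
        next += secs[i].size;
    }
    return next;
}

/* Convert a per-section byte offset into an offset in the merged stream. */
static uint32_t rebase_offset(const struct placed_section *sec,
                              uint32_t off_in_section)
{
    assert(sec != NULL && off_in_section < sec->size);
    return sec->base + off_in_section;
}
```

For the "_progA funcFoo" stream, placing a 64-byte _progA before a 32-byte
.text means an instruction at byte 8 of .text ends up at byte 72 of the
merged stream, while offsets inside _progA are unchanged.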
What do we propose and why?
===========================
To support the above two use cases, we propose to
add an additional section, .BTF.ext, to the ELF file
which is the input of the bpf loader. A separate section
is preferred since the loader may need to manipulate it before
loading part of its data into the kernel.
The .BTF.ext section has a similar header to the .BTF section
and it contains two subsections for func_info and line_info.
. the func_info maps the func insn byte offset to a func
type in the .BTF type subsection.
. the line_info maps the insn byte offset to a line info.
. both func_info and line_info subsections are organized
by ELF PROGBITS AX sections.
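To make the func_info mapping concrete, here is a small sketch of a
lookup from an insn byte offset to a func type id. The record layout
(struct example_func_info with insn_off and type_id) is an assumption made
for illustration; the authoritative layout is whatever BTF.h in this patch
defines. The sketch also assumes the records of one section are sorted by
insn_off, and uses type id 0 (void in BTF) to mean "no function found".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical func_info record: one entry per function in a code section. */
struct example_func_info {
    uint32_t insn_off;  /* byte offset of the function's first insn */
    uint32_t type_id;   /* func type id in the .BTF type subsection */
};

/*
 * Return the type id of the function containing byte offset `off`,
 * or 0 if `off` precedes every record. Records must be sorted by insn_off.
 */
static uint32_t func_type_for_offset(const struct example_func_info *recs,
                                     size_t n, uint32_t off)
{
    uint32_t type_id = 0;

    assert(recs != NULL || n == 0);
    for (size_t i = 0; i < n && recs[i].insn_off <= off; i++)
        type_id = recs[i].type_id;
    return type_id;
}
```

A profiler could use such a lookup to turn a hot instruction offset into a
function type, and from there into a name via the string subsection. A
line_info lookup would have the same shape with a different record payload.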
pahole is not a good place to implement .BTF.ext, as
pahole is mostly for structure hole information and, more
importantly, we want to pass the actual source code to the kernel.
. a bpf program is typically small, so the storage overhead
should be small.
. in bpf land, it is entirely possible that
an application loads the bpf program into the
kernel and then quits, so
holding debug info in the user space application
is not practical, as you may not even know who
loaded this bpf program.
. having the source code kept directly by the kernel
would ease deployment, since the original source
code does not need to be shipped to every host and
the kernel-devel package does not need to be
deployed even if kernel headers are used.
LLVM is a good place to implement this.
. The only reliable time to get the source code is
at compilation time. This will result in both more
accurate information and easier deployment, as
stated above.
. Another consideration is JIT. Projects like bcc
(https://github.com/iovisor/bcc)
use MCJIT to compile a C program into bpf insns and
load them into the kernel. The llvm generated BTF sections
will be readily available for such cases as well.
Design and implementation of emitting .BTF/.BTF.ext sections
============================================================
The BTF debuginfo format is defined in this patch. Both .BTF and .BTF.ext
sections are generated directly from IR when both
"-target bpf" and "-g" are specified. Note that
dwarf sections are still generated, as dwarf is used
by user space tools like llvm-objdump etc. for the BPF target.
This patch also contains tests to verify the generated
.BTF and .BTF.ext sections for all supported types and for the
func_info and line_info subsections. The patch was also tested
against the Linux kernel bpf sample tests and selftests.
Signed-off-by: Yonghong Song <yhs at fb.com>
Repository:
rL LLVM
https://reviews.llvm.org/D53736
Files:
include/llvm/BinaryFormat/BTF.def
include/llvm/BinaryFormat/BTF.h
include/llvm/MC/MCObjectFileInfo.h
lib/CodeGen/AsmPrinter/AsmPrinter.cpp
lib/CodeGen/AsmPrinter/BTFDebug.cpp
lib/CodeGen/AsmPrinter/BTFDebug.h
lib/CodeGen/AsmPrinter/CMakeLists.txt
lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp
lib/CodeGen/AsmPrinter/DebugHandlerBase.h
lib/MC/MCObjectFileInfo.cpp
test/DebugInfo/BTF/array-1d-char.ll
test/DebugInfo/BTF/array-1d-int.ll
test/DebugInfo/BTF/array-2d-int.ll
test/DebugInfo/BTF/array-size-0.ll
test/DebugInfo/BTF/array-typedef.ll
test/DebugInfo/BTF/binary-format.ll
test/DebugInfo/BTF/char.ll
test/DebugInfo/BTF/enum-basic.ll
test/DebugInfo/BTF/func-func-ptr.ll
test/DebugInfo/BTF/func-non-void.ll
test/DebugInfo/BTF/func-source.ll
test/DebugInfo/BTF/func-typedef.ll
test/DebugInfo/BTF/func-void.ll
test/DebugInfo/BTF/fwd-no-define.ll
test/DebugInfo/BTF/fwd-with-define.ll
test/DebugInfo/BTF/int.ll
test/DebugInfo/BTF/lit.local.cfg
test/DebugInfo/BTF/longlong.ll
test/DebugInfo/BTF/ptr-const-void.ll
test/DebugInfo/BTF/ptr-func-1.ll
test/DebugInfo/BTF/ptr-func-2.ll
test/DebugInfo/BTF/ptr-func-3.ll
test/DebugInfo/BTF/ptr-int.ll
test/DebugInfo/BTF/ptr-void.ll
test/DebugInfo/BTF/ptr-volatile-const-void.ll
test/DebugInfo/BTF/ptr-volatile-void.ll
test/DebugInfo/BTF/restrict-ptr.ll
test/DebugInfo/BTF/short.ll
test/DebugInfo/BTF/struct-anon.ll
test/DebugInfo/BTF/struct-basic.ll
test/DebugInfo/BTF/struct-bitfield-typedef.ll
test/DebugInfo/BTF/struct-enum.ll
test/DebugInfo/BTF/uchar.ll
test/DebugInfo/BTF/uint.ll
test/DebugInfo/BTF/ulonglong.ll
test/DebugInfo/BTF/union-array-typedef.ll
test/DebugInfo/BTF/ushort.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D53736.171208.patch
Type: text/x-patch
Size: 165139 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181025/1e56e4a5/attachment.bin>