[PATCH] D149058: [BPF][DebugInfo] Use .BPF.ext for line info when DWARF is not available

Eduard Zingerman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 1 16:32:49 PDT 2023

eddyz87 created this revision.
Herald added subscribers: mgrang, hiraditya.
Herald added a reviewer: jhenderson.
Herald added a reviewer: MaskRay.
Herald added a project: All.
eddyz87 updated this revision to Diff 517440.
eddyz87 added a comment.
eddyz87 updated this revision to Diff 518092.
eddyz87 edited the summary of this revision.
eddyz87 updated this revision to Diff 518565.
eddyz87 edited the summary of this revision.
eddyz87 updated this revision to Diff 518581.
eddyz87 published this revision for review.
eddyz87 added a reviewer: yonghong-song.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Reorganized to report errors, added objdump test-case.

eddyz87 added a comment.

Reorganized to provide BTFParser class, added unit tests.

yonghong-song added a comment.

The following is the immediate use case showing why this patch will help the BPF community.
Currently bpftool implements static linking, as lld does not support it. The static linking looks like this:

  1. Let us say we have three files t1.c, t2.c, t3.c; they are all compiled with '-target bpf -O2 -mcpu=v3 -g -c', generating t1.o, t2.o, t3.o.
  2. To minimize runtime overhead, we would like to produce just one binary, say t_final.o, which contains a single .BTF section and a single .BTF.ext section; this is the bpf linking step implemented in bpftool.
  3. The t_final.o is used by the application (libbpf library) to extract the necessary insn/relocation/btf etc., process them, and load them into the kernel.

Step 3 may be repeated many times (application restarts, repeated runs of e.g. a tracing app, etc.), so it is critical that bpf linking (bpftool) can generate a .o file which can be processed and loaded into the kernel as fast as possible.

So step 2 does the following with bpftool:

- merge all code sections from all .o files and perform the necessary relocations;
- merge all .BTF/.BTF.ext sections into a single one; note that, to save space, deduplication is done to minimize BTF size;
- remove DWARF sections, as they are not critical to BPF program execution. Also, with a bpf skeleton, t_final.o is presented as a blob of data embedded as a special string constant, which takes up application memory. So removing DWARF data reduces application memory usage as well.

The end result is that t_final.o does not have DWARF sections and only has .BTF/.BTF.ext sections. The BTF sections contain line number info, and the verifier uses this information to display verifier logs annotated with source code.
But llvm-objdump -S won't work any more. We would like it to work with BTF and show source-annotated asm code when DWARF is not available.

eddyz87 added a comment.

Unit test changed to use `#pragma pack` instead of `__attribute__((packed))` to satisfy the MSVC unit tests build.

eddyz87 added a comment.

Fix incorrect path substitution for interleaved-source-test.ll on Windows.

eddyz87 added a comment.

Hi James, Fangrui,

Could you please take a look at this modification for `llvm-objdump`?
This change would be quite helpful for people developing BPF programs using Clang.

"BTF" is a debug information format used by LLVM's BPF backend.
The format is much smaller in scope than DWARF; the following info is
available:

- full set of C types used in the binary file;
- types for global values;
- line number / line source code information.

BTF information is embedded in ELF as .BTF and .BTF.ext sections.
A detailed format description can be found as part of the Linux source
tree, e.g. here: [1].
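
As a rough illustration of the section layout mentioned above, the sketch below decodes the fixed header at the start of a `.BTF` section and resolves a string-table offset. The field layout follows the kernel BTF documentation [1]; the names `BTFHeader`, `parse_btf_header` and `btf_string` are hypothetical and are not taken from the patch, and a little-endian object file is assumed.

```python
import struct
from typing import NamedTuple

BTF_MAGIC = 0xEB9F  # magic value given in the kernel BTF documentation


class BTFHeader(NamedTuple):
    magic: int
    version: int
    flags: int
    hdr_len: int
    type_off: int  # offsets are relative to the end of the header
    type_len: int
    str_off: int
    str_len: int


def parse_btf_header(data: bytes) -> BTFHeader:
    """Decode the header of a little-endian .BTF section (assumed byte order)."""
    fmt = "<HBBIIIII"
    if len(data) < struct.calcsize(fmt):
        raise ValueError("section too short for a BTF header")
    hdr = BTFHeader(*struct.unpack_from(fmt, data, 0))
    if hdr.magic != BTF_MAGIC:
        raise ValueError(f"bad BTF magic: {hdr.magic:#x}")
    return hdr


def btf_string(data: bytes, hdr: BTFHeader, offset: int) -> str:
    """Return the NUL-terminated string at `offset` inside the string table."""
    start = hdr.hdr_len + hdr.str_off + offset
    end = data.index(b"\0", start)
    return data[start:end].decode("utf-8")
```

File names and source line text referenced from `.BTF.ext` are stored in this string table, which is why both sections are needed to print annotated disassembly.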

This commit modifies `llvm-objdump` utility to use line number
information provided by BTF if DWARF information is not available.
(For example, when `llvm-strip` is used to minimize binary files, BTF
is significantly more compact than DWARF. This is a common production

Basically, the goal is to make the following print source code
lines, interleaved with disassembly:

  $ clang -target bpf -g test.c -o test.o
  $ llvm-strip --strip-debug test.o
  $ llvm-objdump -Sd test.o
  test.o:	file format elf64-bpf
  Disassembly of section .text:
  ; void foo(void) {
  	r1 = 0x1
  ;   consume(1);
  	call -0x1
  	r1 = 0x2
  ;   consume(2);
  	call -0x1
  ; }
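
The fallback that the session above relies on can be sketched as a simple decision over section names. This is a simplification, not the check used in the patch: treating any `.debug_*` section as "DWARF is available" is an assumption, and `choose_line_info_source` is a hypothetical name.

```python
from typing import Iterable, Optional


def choose_line_info_source(section_names: Iterable[str]) -> Optional[str]:
    """Pick where line info would come from, given an object's section names."""
    names = set(section_names)
    # Assumption: any ".debug_*" section means DWARF is usable and is preferred.
    if any(name.startswith(".debug_") for name in names):
        return "dwarf"
    # BTF line info needs both sections: .BTF holds the strings,
    # .BTF.ext holds the line records that refer to them.
    if {".BTF", ".BTF.ext"} <= names:
        return "btf"
    return None
```

For the stripped `test.o` above, the `.debug_*` sections are gone while `.BTF` and `.BTF.ext` remain, so this sketch would select BTF.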

The commit consists of the following modifications:

- llvm/lib/DebugInfo/BTF aka `DebugInfoBTF` component is added to host the code needed to process BTF (with the assumption that BTF support would be added to some other tools as well, e.g. `llvm-readelf`):
  - `DebugInfoBTF` provides the `llvm::BTFParser` class, which loads information from the `.BTF` and `.BTF.ext` sections of a given `object::ObjectFile` instance and allows querying this information. Currently only line number information is loaded.
  - `DebugInfoBTF` also provides the `llvm::BTFContext` class, an implementation of the `DIContext` interface, used by `llvm-objdump` to query information about line numbers corresponding to specific instructions.

- Structure `DILineInfo` is extended with a `LineSource` field.

  `DIContext` interface uses `DILineInfo` structure to communicate line number and source code information. Specifically, `DILineInfo::Source` field encodes full file source code, if available. BTF only stores source code for selected lines of the file, not a complete source file. Moreover, stored lines are not guaranteed to be sorted in a specific order.

  To avoid reconstruction of a file source code from a set of available lines, this commit adds `LineSource` field instead.

- `Symbolize` class is modified to use `BTFContext` instead of `DWARFContext` when DWARF sections are not available but BTF sections are present in the object file. (`Symbolize` is instantiated by `llvm-objdump`).

- Integration and unit tests.
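
To make the line-number loading described in the list above more concrete, here is a minimal sketch of decoding the line info sub-section of `.BTF.ext`. The record layout (a `u32` record size, then per-section groups of `insn_off`, `file_name_off`, `line_off`, `line_col` records) follows the kernel BTF documentation [1]. The names `LineInfo` and `parse_line_info` are hypothetical, a little-endian file is assumed, and this is not the `llvm::BTFParser` implementation.

```python
import struct
from typing import Dict, List, NamedTuple


class LineInfo(NamedTuple):
    insn_off: int       # byte offset of the instruction within its ELF section
    file_name_off: int  # string-table offset of the file name
    line_off: int       # string-table offset of the source line text
    line: int
    col: int


def parse_line_info(blob: bytes) -> Dict[int, List[LineInfo]]:
    """Decode a little-endian line info sub-section of .BTF.ext.

    `blob` starts at the u32 record size. Records are grouped by the
    string-table offset of the name of the ELF section they describe.
    """
    (rec_size,) = struct.unpack_from("<I", blob, 0)
    if rec_size < 16:
        raise ValueError("line info record too small")
    pos = 4
    out: Dict[int, List[LineInfo]] = {}
    while pos < len(blob):
        sec_name_off, num_info = struct.unpack_from("<II", blob, pos)
        pos += 8
        records = []
        for _ in range(num_info):
            insn_off, file_off, line_off, line_col = struct.unpack_from("<IIII", blob, pos)
            # line_col packs the line number in the upper 22 bits
            # and the column in the lower 10 bits.
            records.append(LineInfo(insn_off, file_off, line_off,
                                    line_col >> 10, line_col & 0x3FF))
            pos += rec_size  # honor the declared size so larger records are skipped
        out[sec_name_off] = records
    return out
```

Each record carries the string offset of a single source line (`line_off`) rather than a whole file, which matches the motivation for the `LineSource` field given above.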

Note that DWARF has a notion of an "instruction sequence".
The DWARF implementation of `DIContext::getLineInfoForAddress()` provides
inexact responses if exact address information is not available but
the address falls within an "instruction sequence" with some known line
information (see `DWARFDebugLine::LineTable::findRowInSeq()`).

BTF does not provide instruction sequence groupings, thus
`getLineInfoForAddress()` queries only return exact matches.
This does not seem to be a big issue in practice, but the output
of `llvm-objdump -Sd` might differ slightly when BTF
is used instead of DWARF.
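
The difference between the two lookup styles can be illustrated with a small sketch. The table contents and both function names are hypothetical, and the sequence lookup is a simplified stand-in for what `findRowInSeq()` does, not a port of it.

```python
import bisect
from typing import List, Optional, Tuple

# Hypothetical line table: (instruction byte offset, source line), sorted by offset.
Rows = List[Tuple[int, int]]


def lookup_exact(rows: Rows, addr: int) -> Optional[int]:
    """BTF-style lookup: only an address that has its own record yields a line."""
    return dict(rows).get(addr)


def lookup_in_sequence(rows: Rows, addr: int, seq_end: int) -> Optional[int]:
    """DWARF-style lookup (simplified): use the closest preceding row as long
    as the address lies inside the sequence [first row offset, seq_end)."""
    if not rows or addr < rows[0][0] or addr >= seq_end:
        return None
    offsets = [offset for offset, _ in rows]
    idx = bisect.bisect_right(offsets, addr) - 1
    return rows[idx][1]
```

With rows at offsets 0 and 16, an instruction at offset 8 gets no line from the exact lookup but inherits the line of offset 0 from the sequence lookup, which is the kind of difference that can show up in the `llvm-objdump -Sd` output.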

[1] https://www.kernel.org/doc/html/latest/bpf/btf.html

  rG LLVM Github Monorepo



-------------- next part --------------
A non-text attachment was scrubbed...
Name: D149058.518581.patch
Type: text/x-patch
Size: 41295 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230501/8d4563c2/attachment-0001.bin>
