[cfe-dev] Adding clang extension support for BPF CO-RE relocation support

Y Song via cfe-dev cfe-dev at lists.llvm.org
Mon Dec 20 16:00:08 PST 2021


This is related to the bpf CO-RE (compile once, run everywhere) feature.
The feature has been implemented and merged in Clang. But per John's
suggestion, it is still good to give explicit reasoning for why this
Clang extension should be implemented. This can serve as a reference
point if people in the future want to understand the reasoning, or
touch the implementation and need to discuss it. The format below
follows the suggestion in https://clang.llvm.org/get_involved.html.

Evidence of a significant user community
========================================

The CO-RE feature makes it possible for the same bpf
program to run across different kernel versions. Note that
kernel internal data structures may change between
kernel versions, so a bpf program targeting one kernel's
internal data structures often won't work on another kernel.

Before CO-RE, the general approach was bcc ([1]), where
the bpf program is recompiled for *each* kernel. This
incurs large binary size and significant run-time cost,
and won't work in many environments (embedded systems, containers with
limited resources, etc.).

CO-RE was proposed to address the above issue. The initial CO-RE patch
permitted LLVM to generate relocations for struct/union member field accesses
and array indexing. Later, as more use cases came up, CO-RE was
enhanced with relocations for field/type existence, type size, bitfield
handling, enum values, etc. CO-RE permits the bpf program to be compiled once;
the ELF binary is then processed by the host bpfloader or host kernel, which
adjusts kernel data structure accesses in the code based on the
relocation information.
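
To make the load-time adjustment concrete, here is a toy model in C. It is
only a sketch under stated assumptions: the types field_desc, layout, reloc
and insn and the function apply_relocs are hypothetical names invented for
illustration, and fields are matched by name alone. The real loader works
from the recorded relocation and type information and is considerably more
involved.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy description of one record's fields as laid out on a given kernel. */
struct field_desc { const char *name; uint32_t offset; };
struct layout { const struct field_desc *fields; size_t nfields; };

/* Toy relocation: instruction insn_idx accesses the field named `field`. */
struct reloc { size_t insn_idx; const char *field; };

/* Toy instruction: only the offset operand matters for this sketch. */
struct insn { uint32_t off; };

/*
 * Patch every relocated instruction with the field offset found in the
 * host layout. Returns 0 on success, -1 if a field is missing on the host.
 */
static int apply_relocs(struct insn *prog, const struct reloc *relocs,
                        size_t nrelocs, const struct layout *host)
{
    for (size_t i = 0; i < nrelocs; i++) {
        int found = 0;

        for (size_t j = 0; j < host->nfields; j++) {
            if (strcmp(host->fields[j].name, relocs[i].field) == 0) {
                prog[relocs[i].insn_idx].off = host->fields[j].offset;
                found = 1;
                break;
            }
        }
        if (!found)
            return -1;
    }
    return 0;
}
```

The program is compiled once against one layout; the relocation records are
what let the loader rewrite the offsets for whatever layout the host has.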

CO-RE has been implemented in LLVM and the kernel ([2]), and the feature
is currently used by virtually every bpf developer. A few public
posts ([3], [4], [5], [6] and [7]) are listed here for reference.

A specific need to reside within the Clang tree
===============================================

The CO-RE related builtins and attributes are processed by the clang
frontend. The relocation information is preserved in IR, and eventually
the relocation is generated by the bpf target. CO-RE features are thus an
integral part of the compiler, so it is best to keep the feature within
the Clang tree.

A specification
===============

The CO-RE feature introduced a few clang extensions, including:
  . __builtin_preserve_access_index (initially added, [8])
  . __builtin_preserve_field_info (added later, [9])
  . __attribute__((preserve_access_index)) (added later, [10])
  . __builtin_preserve_type_info (added later, [11])
  . __builtin_preserve_enum_value (added later, [11])

All the above builtins and attributes are used to record relocations.
The following is the detailed specification:
  . __builtin_preserve_access_index
    defined as
      type __builtin_preserve_access_index(type arg)
    Any record member and array index accesses in the argument will
    have relocations generated.
  . __builtin_preserve_field_info
    defined as
      uint32_t __builtin_preserve_field_info(field_access, flag);
    Depending on flag, the relocation is generated for (1) field size,
    (2) whether the field exists or not, (3) field signedness, or
    (4) certain bitfield info. Specifically, for the field existence case,
    if the field does not exist on the actual host, the bpfloader will
    resolve the above builtin to return 0, indicating that the field
    doesn't exist, so the bpf verifier will skip the branch guarded by
    the check.
  . __builtin_preserve_type_info
    defined as
      uint32_t __builtin_preserve_type_info(*(<type> *)0, flag);
    Records a relocation for whether the "type" exists or not, or for the
    "type" size, depending on the "flag".
  . __builtin_preserve_enum_value
    defined as
      uint64_t __builtin_preserve_enum_value(*(<enum_type> *)<enum_value>, flag);
    Records a relocation for whether the "enum_value" (represented as an
    enum name) exists or not, or for the enum value of the enum name,
    depending on "flag".
  . __attribute__((preserve_access_index))
    Currently this attribute can be applied to a record. If a record has this
    attribute, then any field access for this record will generate a relocation.
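
As an illustration of the specification above, here is a minimal sketch in C.
The record and enum types, the core_* macro names and the two helper
functions are hypothetical and exist only for this example. The flag values
are assumed to match the ones libbpf uses in bpf_core_read.h. The builtins
are used only when compiling for the bpf target (with -g); on any other
target the macros fall back to plain accesses, so the sketch compiles
everywhere but records no relocations there.

```c
#include <stdint.h>

/* Flag values assumed to mirror libbpf's bpf_core_read.h. */
enum { FIELD_BYTE_SIZE = 1, FIELD_EXISTS = 2 };   /* preserve_field_info */
enum { TYPE_EXISTS = 0, TYPE_SIZE = 1 };          /* preserve_type_info  */
enum { ENUMVAL_EXISTS = 0, ENUMVAL_VALUE = 1 };   /* preserve_enum_value */

#if defined(__bpf__)
/* bpf target (build with -g): each macro records a CO-RE relocation. */
#define RELOC_RECORD            __attribute__((preserve_access_index))
#define core_access(expr)       __builtin_preserve_access_index(expr)
#define core_field_exists(acc)  __builtin_preserve_field_info(acc, FIELD_EXISTS)
#define core_field_size(acc)    __builtin_preserve_field_info(acc, FIELD_BYTE_SIZE)
#define core_type_size(type)    __builtin_preserve_type_info(*(type *)0, TYPE_SIZE)
#define core_enum_value(type, name) \
    __builtin_preserve_enum_value(*(type *)name, ENUMVAL_VALUE)
#else
/* Host fallback for illustration: no relocations, local layout only. */
#define RELOC_RECORD
#define core_access(expr)       (expr)
#define core_field_exists(acc)  1u
#define core_field_size(acc)    ((uint32_t)sizeof(acc))
#define core_type_size(type)    ((uint32_t)sizeof(type))
#define core_enum_value(type, name) ((uint64_t)(name))
#endif

/* Hypothetical stand-ins for kernel types. */
enum task_state { TASK_IDLE = 0, TASK_BUSY = 2 };

struct mm {
    unsigned long start;
    unsigned long end;
};

struct task {
    int pid;
    int tgid;
    struct mm *mm;
} RELOC_RECORD;   /* every field access to struct task is relocatable */

/* Returns tgid, or -1 when the host's struct task has no such field. */
static inline int task_tgid(const struct task *t)
{
    if (!core_field_exists(t->tgid))
        return -1;
    return t->tgid;   /* relocated through the record attribute */
}

/* struct mm carries no attribute, so the access is wrapped explicitly. */
static inline unsigned long mm_end(const struct mm *m)
{
    return core_access(m->end);
}
```

A caller would use task_tgid() and mm_end() like ordinary accessors; on the
bpf target the difference is that the compiler emits relocation records for
the wrapped accesses instead of hard-coding the compile-time layout.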

The above builtins and the attribute are the backbone of the CO-RE feature.
Please refer to [3] for a more detailed explanation.

Representation within the appropriate governing organization
============================================================

N/A

A long-term support plan
========================

The feature will be supported forever.

A high-quality implementation
=============================

All the above builtins and attributes were properly reviewed before merging.

A test suite
============

All new features are accompanied by the necessary test cases.

References:
===========
  [1]  https://github.com/iovisor/bcc
  [2]  https://github.com/torvalds/linux/blob/master/tools/lib/bpf/relo_core.c
  [3]  https://nakryiko.com/posts/bpf-core-reference-guide/
  [4]  https://www.brendangregg.com/blog/2020-11-04/bpf-co-re-btf-libbpf.html
  [5]  https://blogs.oracle.com/linux/post/bpf-application-development-and-libbpf
  [6]  https://pingcap.com/blog/why-we-switched-from-bcc-to-libbpf-for-linux-bpf-performance-analysis
  [7]  https://android.googlesource.com/platform/external/libbpf/
  [8]  https://reviews.llvm.org/D67734
  [9]  https://reviews.llvm.org/D67980
  [10] https://reviews.llvm.org/D69759
  [11] https://reviews.llvm.org/D83242

