[cfe-dev] Adding clang extension support for BPF CO-RE relocation support

Aaron Ballman via cfe-dev cfe-dev at lists.llvm.org
Mon Jan 3 11:40:13 PST 2022


On Mon, Dec 20, 2021 at 7:00 PM Y Song <ys114321 at gmail.com> wrote:
>
> This is related to the bpf CO-RE (compile once, run everywhere) feature.
> The feature has been implemented and merged in Clang. But per John's
> suggestion, it is still good to give explicit reasoning for why this
> Clang extension should be implemented. This can serve as a reference
> point if people in the future want to understand the reasoning or
> touch the implementation and need to discuss it. The format below
> follows the suggestion in https://clang.llvm.org/get_involved.html.

Thank you for putting this information together and getting an RFC out!

> Evidence of a significant user community
> ========================================
>
> The CO-RE feature addresses the problem of running the same bpf
> program across different kernel versions. Kernel internal data
> structures may change between kernel versions, so a bpf program
> targeting one specific kernel's internal data structures often
> won't work on another kernel.
>
> Before CO-RE, the general approach was bcc ([1]), where
> the bpf program is recompiled for *each* kernel. This
> incurs large binary size and significant run-time cost,
> and won't work in many environments (embedded systems, containers with
> limited resources, etc.).
>
> CO-RE was proposed to address the above issue. The initial CO-RE patch
> permits LLVM to generate relocations for struct/union member field accesses
> and array indexing. Later, as more use cases came up, CO-RE was
> enhanced with relocations for field/type existence, type size, bitfield
> handling, enum values, etc. CO-RE permits the bpf program to be compiled once;
> the ELF binary is then processed by the host bpfloader or host kernel, which
> adjusts kernel data structure accesses in the code based on the
> relocation information.
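
To make the loader step described above concrete, here is a minimal
sketch of what "adjusting accesses based on relocation information" could
look like. Everything in it (struct field_layout, struct toy_insn,
struct toy_reloc, lookup_offset, apply_relocs) is hypothetical and greatly
simplified; real loaders such as libbpf work on BTF type information and
actual BPF instructions, not on these toy structures.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one field on the running (host) kernel. */
struct field_layout {
    const char *name;   /* field name */
    uint32_t offset;    /* byte offset of the field on this kernel */
};

/* Hypothetical stand-in for a load instruction with an embedded offset. */
struct toy_insn {
    int32_t off;        /* offset baked in at compile time */
};

/* Hypothetical relocation record emitted alongside the code. */
struct toy_reloc {
    size_t insn_idx;    /* which instruction to patch */
    const char *field;  /* which field the access refers to */
};

/* Find a field in the host layout; returns -1 if the field is missing. */
static int64_t lookup_offset(const struct field_layout *layout, size_t n,
                             const char *field)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(layout[i].name, field) == 0)
            return (int64_t)layout[i].offset;
    return -1;
}

/* Patch each relocated instruction with the host offset.
 * Returns the number of relocations that could not be resolved. */
static int apply_relocs(struct toy_insn *insns,
                        const struct toy_reloc *relocs, size_t nrelocs,
                        const struct field_layout *host, size_t nhost)
{
    int unresolved = 0;

    for (size_t i = 0; i < nrelocs; i++) {
        int64_t off = lookup_offset(host, nhost, relocs[i].field);

        if (off < 0) {
            unresolved++;   /* leave the instruction untouched */
            continue;
        }
        insns[relocs[i].insn_idx].off = (int32_t)off;
    }
    return unresolved;
}
```

The point of the sketch is only that the compiled code carries symbolic
information (here, a field name) next to each access, so the offsets can be
rewritten at load time instead of recompiling the program per kernel.
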
>
> CO-RE has been implemented in LLVM and the kernel ([2]), and currently
> the feature is used by virtually every bpf developer. A few public
> posts ([3], [4], [5], [6] and [7]) are added here for reference.
>
> A specific need to reside within the Clang tree
> ===============================================
>
> The CO-RE related builtins and attributes are processed by the clang
> frontend. The relocation information is preserved in IR and eventually
> the relocation is generated by the bpf target. CO-RE features are
> therefore an integral part of the compiler, so it is best to have the
> feature within the Clang tree.
>
> A specification
> ===============
>
> The CO-RE feature introduced a few clang extensions, including:
>   . __builtin_preserve_access_index (initially added, [8])
>   . __builtin_preserve_field_info (added later, [9])
>   . __attribute__((preserve_access_index)) (added later, [10])
>   . __builtin_preserve_type_info (added later, [11])
>   . __builtin_preserve_enum_value (added later, [11])
>
> All the above builtins and attributes are used to record relocations.
> The following is the detailed specification:
>   . __builtin_preserve_access_index
>     defined as
>       type __builtin_preserve_access_index(type arg)
>     Any record member and array index accesses in the argument will
>     have relocations generated.
>   . __builtin_preserve_field_info
>     defined as
>       uint32_t __builtin_preserve_field_info(field_access, flag);
>     Depending on flag, the relocation is generated for (1) field size,
>     (2) whether the field exists or not, (3) field signedness, or (4) certain
>     bitfield info. Specifically for the field existence case, if the field does
>     not exist in the actual host, the bpfloader will resolve the above builtin
>     to return 0, indicating the field doesn't exist, so the bpf verifier will
>     skip this branch.
>   . __builtin_preserve_type_info
>     defined as
>       uint32_t __builtin_preserve_type_info(*(<type> *)0, flag);
>     Record a relocation for whether the "type" exists or not, or the "type"
>     size, depending on the "flag".
>   . __builtin_preserve_enum_value
>     defined as
>       uint64_t __builtin_preserve_enum_value(*(<enum_type> *)<enum_value>, flag);
>     Record a relocation for whether the "enum_value" (represented as an enum name)
>     exists or not, or the enum value for the enum name, depending on "flag".
>   . __attribute__((preserve_access_index))
>     Currently this attribute can be applied to a record. If a record has this
>     attribute, then any field access for this record will generate a relocation.
>
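
As an illustration of __builtin_preserve_access_index and
__attribute__((preserve_access_index)) as specified above, here is a
minimal sketch. The record types and the CORE_RECORD / CORE_ACCESS wrapper
macros are hypothetical names made up for this example. The extensions are
only used when compiling for the bpf target (assumed here to be detectable
via __bpf__, and to need debug info, i.e. -g); on any other target the
macros fall back to plain accesses so the same file still builds.

```c
#include <stdint.h>

#if defined(__bpf__)
/* bpf target: ask clang to record relocations for the accesses. */
#define CORE_RECORD       __attribute__((preserve_access_index))
#define CORE_ACCESS(expr) __builtin_preserve_access_index(expr)
#else
/* Any other target: plain accesses, no relocations. */
#define CORE_RECORD
#define CORE_ACCESS(expr) (expr)
#endif

/* Hypothetical record without the attribute: each access that should be
 * relocatable has to be wrapped in the builtin explicitly. */
struct sample_task {
    int pid;
    uint64_t start_time;
};

/* Hypothetical record with the attribute: every field access on it is
 * meant to generate a relocation, with no wrapping at the use site. */
struct sample_file {
    uint32_t mode;
    uint64_t size;
} CORE_RECORD;

static int sample_task_pid(const struct sample_task *t)
{
    /* The builtin returns its argument's type, here "const int *". */
    return *CORE_ACCESS(&t->pid);
}

static uint64_t sample_file_size(const struct sample_file *f)
{
    /* Relocation comes from the attribute on struct sample_file. */
    return f->size;
}
```

The two forms are complementary: the builtin marks a single expression,
while the attribute marks a whole record so that ordinary-looking code
becomes relocatable.
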
> The above builtins and the attribute are the backbone of the CO-RE feature.
>
> Please refer to [3] for more detailed explanation.
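
For the three "info" builtins in the specification, a minimal sketch of
how they might be wrapped follows. The types and the FIELD_EXISTS /
TYPE_SIZE / ENUM_VALUE macros are hypothetical. The numeric flag values are
an assumption taken from libbpf's bpf_core_read.h (BPF_FIELD_EXISTS = 2,
BPF_TYPE_SIZE = 1, BPF_ENUMVAL_VALUE = 1) and are not stated in this RFC.
Outside the bpf target (assumed detectable via __bpf__) the macros answer
from the compile-time definitions instead, so the file still builds and
runs on a host.

```c
#include <stdint.h>

/* Hypothetical types for illustration; not real kernel definitions. */
struct sample_task {
    int pid;
    unsigned int flags;
};

enum sample_state {
    SAMPLE_IDLE = 0,
    SAMPLE_RUNNING = 3,
};

#if defined(__bpf__)
/* Flag values are assumptions taken from libbpf's bpf_core_read.h. */
#define FIELD_EXISTS(ptr, field) \
    __builtin_preserve_field_info((ptr)->field, 2 /* BPF_FIELD_EXISTS */)
#define TYPE_SIZE(type) \
    __builtin_preserve_type_info(*(type *)0, 1 /* BPF_TYPE_SIZE */)
#define ENUM_VALUE(type, name) \
    __builtin_preserve_enum_value(*(type *)name, 1 /* BPF_ENUMVAL_VALUE */)
#else
/* Host fallback: no relocation, use the local definitions directly. */
#define FIELD_EXISTS(ptr, field) ((uint32_t)(sizeof((ptr)->field) > 0))
#define TYPE_SIZE(type)          ((uint32_t)sizeof(type))
#define ENUM_VALUE(type, name)   ((uint64_t)(name))
#endif

/* Read "flags" only if the field exists on the host; if the loader
 * resolves the check to 0, the guarded branch is never taken. */
static unsigned int task_flags_or_zero(const struct sample_task *t)
{
    if (FIELD_EXISTS(t, flags))
        return t->flags;
    return 0;
}

/* Size of the type as seen by the host after relocation. */
static uint32_t sample_task_size(void)
{
    return TYPE_SIZE(struct sample_task);
}

/* Value of the enumerator as defined on the host after relocation. */
static uint64_t sample_running_value(void)
{
    return ENUM_VALUE(enum sample_state, SAMPLE_RUNNING);
}
```
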
>
> Representation within the appropriate governing organization
> ============================================================
>
> N/A

What organization determines things like the specification you posted
above, or is there no organization behind these efforts?

> A long-term support plan
> ========================
>
> The feature will be supported forever.

This is not a long term support plan. :-) When we have support needs
in the future, will there be people/a company/a community available to
do that work or is the expectation that once this lands, the Clang
community is responsible for it? (This matters with the above question
about the organization responsible for governing the specification --
can the Clang community do as they please here or do we need to
coordinate with others?)

> A high-quality implementation
> =============================
>
> All the above builtins and attributes were properly reviewed before merging.

Are there other implementations of CO-RE in the wild that we should be
measuring against?

> A test suite
> ============
>
> All new features are accompanied by the necessary test cases.

Are there any external ways we can verify the functionality? (A
conformance suite, some other implementation we can test against, etc)

~Aaron

>
> References:
> ===========
>   [1]  https://github.com/iovisor/bcc
>   [2]  https://github.com/torvalds/linux/blob/master/tools/lib/bpf/relo_core.c
>   [3]  https://nakryiko.com/posts/bpf-core-reference-guide/
>   [4]  https://www.brendangregg.com/blog/2020-11-04/bpf-co-re-btf-libbpf.html
>   [5]  https://blogs.oracle.com/linux/post/bpf-application-development-and-libbpf
>   [6]  https://pingcap.com/blog/why-we-switched-from-bcc-to-libbpf-for-linux-bpf-performance-analysis
>   [7]  https://android.googlesource.com/platform/external/libbpf/
>   [8]  https://reviews.llvm.org/D67734
>   [9]  https://reviews.llvm.org/D67980
>   [10] https://reviews.llvm.org/D69759
>   [11] https://reviews.llvm.org/D83242

