[cfe-dev] support __attribute__((weak)) for non-primitive types

Andrii Nakryiko via cfe-dev cfe-dev at lists.llvm.org
Mon Aug 30 15:57:05 PDT 2021


On Mon, Aug 30, 2021 at 3:21 PM James Y Knight <jyknight at google.com> wrote:
>
> On Mon, Aug 30, 2021 at 1:37 PM Andrii Nakryiko <andriin at fb.com> wrote:
>>
>> 2) Tell users that they can either include either the autogenerated "vmlinux.h", OR set the pragma and include normal kernel headers. But not both. AFAICT, the vmlinux.h file is not really version-independent -- except that it uses the "preserve_access_index" pragma in order to allow field accesses to be adjusted at load-time to match the actual struct layout. Doesn't putting the same pragma around the normal kernel headers work, too?
>>
>> That's what we are telling users right now. And that's what users are complaining about. vmlinux.h is sufficiently version-independent with BPF CO-RE, unless some fields are removed/renamed, or you need the very latest new field added in more recent kernel than the one you used to generate vmlinux.h.
>
> I understand the problem with using only vmlinux.h. What is the complaint users have with using only the kernel-internal headers, though?

Users generally find it more complicated to use kernel headers instead
of vmlinux.h, which I suspect is #1 reason for them to do that in the
first place.

But I was also under the impression that you can't use internal kernel
headers easily, as I've seen users were just copy/pasting type
definitions as is from kernel sources instead.

Furthermore, vmlinux.h contains *all* kernel types, even if they were
defined inside .c files, so with vmlinux.h you get those as well,
while there is no header you can even theoretically include to get
them without just copy/pasting source code as is. BPF is used to do
some deep and involved kernel internals tracing, so it's quite common
to need to use such internal types from the BPF program (just in case
you thought I'm making this use case up).


>
> Using the type definitions from normal kernel headers will have a result which is just as version-independent, so long as you use `#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)`, right?

Yes, but as described above only in a subset of desired scenarios. The
only generic solution seems to be __bpf_weaktype and it should handle
all known cases. And once we have that, we'll probably find some new
and creative ways to utilize it.

>
>> 3) Automatically extract #defines from kernel headers (e.g. using -dD to print all macro-expansions) to create a new header with those values. (Or defines and inline functions. Or everything but struct definitions. Or something along those lines). Have people include that, instead.
>>
>> ...Although, IIUC, you want use the kernel-internal headers e.g. "include/linux/socket.h", not the stable "uapi" headers, e.g. "include/uapi/linux/socket.h". But can't those defines change arbitrarily? Doesn't that ruin the version-independence aspect? So, maybe you want:
>>
>> Yes, users often need to work with types defined in kernel-internal headers. You are right that internal #defines are "unstable", so users are usually interested in stable #defines in UAPI headers. The problem is that vmlinux.h is all or nothing approach, currently. If you use vmlinux.h (because you need internal types, not just UAPI ones), you can't include UAPI headers (for those #defines) anymore due to type conflicts. That's exactly what Yonghong is trying to solve here.
>
> So, is the desire here to enable mixing uapi headers + vmlinux.h, or mix kernel internal headers + vmlinux.h, or it doesn't really matter, either one would be fine, whichever can be made to work?

I didn't keep track of which headers exactly users want to mix-in with
vmlinux.h, so I suspect there are both cases. E.g., for UAPI people
needed some of the FS constants (include/uapi/linux/fcntl.h,
include/uapi/linux/fs.h, etc), while it's also common to need
include/net/sock.h with its definitions like `#define sk_dport
   __sk_common.skc_dport`. It's just two examples I could quickly
recall without searching the history.

>
>>
>> 4) Enhance BPF/CO:RE to support defines as compile-time-unknown/runtime-constants [at least some useful subset of them]. Then they too can be fixed up at BPF load-time, instead of hoping that the values never change.
>>
>> #defines are not relocatable at runtime by any means, because they are just arbitrary text substituted by compiler. Once Clang compiled BPF program into object file, there is nothing you can do about #defines used. I don't see any way for BPF CO-RE to do "relocatable" #defines.
>
> In this proposal, I was imagining that the unstable values would be exposed as globals, instead, and treated as constants at load time. (Similar to what you're doing with CONFIG_* values, where they are unknown globals at compile-time, and constants at load time.)

See #define sk_dport __sk_common.skc_dport example above, where it's
not just constants. There are also a bunch of static inline helper
functions for networking needs.

>
>> Though, the example you give is "TCP" (IPPROTO_TCP?) which is in the uapi. Maybe you do need only the stable uapi defines? In which case,
>>
>> 5) Create a set of modified uapi headers for use with bpf, which omits struct definitions, and instead has #include vmlinux.h at the top instead. Distribute with libbpf-dev, perhaps?
>>
>> That's not really a solution, rather a maintenance nightmare. It's never a good idea to maintain a separate copy of something that's developed by thousands of developers, and have hope to keep up with all the changes. If that's the only possible solution, then going back to status quo (making this user's problem and telling them no to use vmlinux.h at all) is a less painful way.
>
> I meant to say that this set of modified headers could be generated, not that you'd want to manually maintain a separate fork of it.
>

There are some experiments going on to use DWARF's capability to
record macros as text (not enabled by default, I think). We'd then add
them into BTF and use that to generate exactly the same #defines in
vmlinux.h or somewhere else. I think that's not the right way to go,
though, because it at least doubles the size of BTF data due to
recording tons of strings most of which are completely irrelevant and
will never be used (but we can't tell which one is which). And that
still doesn't cover the static inline functions. And we'll probably
find more limitations with that approach as we try to apply it to
real-world applications. Something like __bpf_weaktype seems like a
more optimal solution.


More information about the cfe-dev mailing list