[cfe-dev] BPF: adding new clang extension bpf_dominating_decl attribute

Y Song via cfe-dev cfe-dev at lists.llvm.org
Thu Jan 27 12:41:57 PST 2022


On Thu, Jan 6, 2022 at 6:02 AM David Rector <davrecthreads at gmail.com> wrote:
>
>
>
> On Jan 6, 2022, at 7:47 AM, Aaron Ballman via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>
> On Wed, Jan 5, 2022 at 3:31 PM Y Song <ys114321 at gmail.com> wrote:
>
>
> On Mon, Jan 3, 2022 at 12:52 PM Aaron Ballman <aaron at aaronballman.com> wrote:
>
>
> On Mon, Dec 20, 2021 at 7:06 PM Y Song <ys114321 at gmail.com> wrote:
>
>
> This is a request to add a clang extension, more specifically,
> a clang attribute named bpf_dominating_decl. This clang
> extension is intended to be used for the bpf target only. Below
> I will explain in detail this proposed attribute, why the
> bpf community needs it, how it will be used, and the other
> aspects described in https://clang.llvm.org/get_involved.html.
>
>
> Thank you for this RFC!
>
>
> You are welcome!
>
>
> Evidence of a significant user community
> ========================================
>
> We are proposing a new clang attribute, bpf_dominating_decl, which
> was implemented in [1]. The feature has also been discussed on the
> cfe-dev mailing list ([2]). It is intended to solve the
> following use case:
>  - A tool generated vmlinux.h is used for CO-RE (compile once,
>    run everywhere) use cases.
>  - vmlinux.h contains all kernel data structures for a particular config,
>    see [3] and [4] about how it is generated and why it is important.
>  - but vmlinux.h may have type conflicts with other headers
>    the user intends to use.
>
> Macros are one such example. Currently CO-RE relocation cannot
> handle macros, and macros may be defined in some header files accessible
> to the user. If those header files have type conflicts with vmlinux.h,
> users are forced to copy the macro definitions. The same applies to some
> simple static inline functions defined in header files. This issue has been
> discussed before, and that is why we proposed this attribute. And just last
> week it was raised again ([5]) by a user who was not able to use
> some non-kernel types from a header file which has some type conflicts
> with vmlinux.h.
>
> If it is accepted, the attribute will be used inside vmlinux.h, so
> it will be used by virtually all bpf developers, and it will make them
> more productive because they no longer need to copy macros, static
> inline functions or non-kernel types.
>
>
> I'm uncomfortable with this attribute. Typically, attributes extend
> rather than redefine the language. e.g., you might add attributes for
> better performance or diagnostic characteristics, but you typically
> should not use an attribute to redefine the basic premises of the
> language.
>
> In this particular case, the attribute is used to tell the compiler to
> ignore type redefinition errors and instead pick a "dominating"
> declaration for the type. While C isn't as type sensitive as C++ is,
> it still has _Generic, __typeof__, and other tricks that can expose
> type system shenanigans like this in surprising ways. Given that type
> size information is critical for many things in C (memcpy, memcmp,
> pointer arithmetic with offsetof, etc), I'm uncomfortable with the
> security aspects of the likely type confusion stemming from this being
> so novel in C.
>
>
> To limit the potential impact, as the RFC suggested, we can restrict
> the attribute to CO-RE relocatable types. bpf developers already
> know how to properly use the builtins for CO-RE relocatable
> types, and the types we are targeting are also CO-RE relocatable types.
>
>
> Limiting this to just the target and just for specific types will
> certainly help, but doesn't really eliminate the fact that this
> attribute is definitely not very C-like in what it does. As mentioned
> on the code review, we have to do some interesting work to ensure we
> emit the correct diagnostics for conformance to C (or, alternatively,
> document that this target is not a C target, but that leads right back
> to my argument that this is making a new language rather than
> extending an existing one).
>
> For example, for type size,
> https://github.com/torvalds/linux/blob/master/tools/lib/bpf/bpf_core_read.h
> has the following macro:
>
> #define bpf_core_type_size(type)                                            \
>        __builtin_preserve_type_info(*(typeof(type) *)0, BPF_TYPE_SIZE)
>
> So users can get the type size for a particular kernel. Note that the type
> might have different sizes for different kernels.
>
> For the offsetof issue, bpf_core_read.h provides the following macro:
>
> #define BPF_CORE_READ(src, a, ...) ({                                       \
>        ___type((src), a, ##__VA_ARGS__) __r;                               \
>        BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__);                  \
>        __r;                                                                \
> })
>
> which eventually uses the builtin __builtin_preserve_access_index()
> so the bpf loader can adjust the field offset properly.
>
> So for relocatable types, users won't use typeof or offsetof directly.
>
>
> Will their use be diagnosed for BPF targets?
>
> Otherwise, programs won't be portable even without bpf_dominating_decl
> attribute.
>
>
> That said, we do have *one* attribute that I consider to be a
> "redefine the language in fundamental ways" feature --
> [[clang::overloadable]] allows you to define overload sets in C, which
> is a distinctly not-C thing to do because of the name mangling
> involved. However, that attribute introduces the C++ semantics into C
> whereas the BPF dominating declaration attribute is introducing wholly
> novel semantics. So I don't really consider [[clang::overloadable]] as
> direct precedent for this.
>
>
> A specific need to reside within the Clang tree
> ===============================================
>
> The proposed attribute will be processed by the Clang frontend (lex and
> sema), so it would be best for it to reside within the Clang tree.
>
>
> Would it be plausible/appropriate to instead run the source code
> through a processing tool which emits modified source code with the
> correct definitions instead of hoping this dominating declaration
> works out? In this case, I think the user will get better diagnostic
>
>
> Theoretically it is possible. We would need a tool that preprocesses and
> parses the program, including all include files, does exactly what
> https://reviews.llvm.org/D111307 does to ignore those duplicated
> relocatable types, generates a .i file, and feeds it into clang.
> But this duplicates a lot of current clang code, and the tool itself cannot
> automatically benefit from future clang improvements. So I think in-tree
> support is the best option, with the least maintenance burden.
>
>
> I think Clang's architecture as a series of libraries provides
> extensive support for building your own tooling to perform these kinds
> of code transformations. For example, clang-tools-extra has
> clang-change-namespace, clang-include-fixer, clang-reorder-fields, etc
> and they all make use of Clang as a library without needing to modify
> Clang itself. I think it would be reasonable to explore the idea of
> adding such a tool to perform the rewriting for you (it could
> potentially even live in-tree) as you would continue to use the
> existing Clang code and still benefit from future Clang improvements.
>
>
> The general problem, that it is laborious to solve the problem in a tool, and furthermore such a solution would require duplicating lots of existing clang functionality creating a maintenance burden, and that it is thus preferable to upstream the solution, which burdens other users to some extent, recalls the discussion about upstreaming Swift’s APINotes:
> https://lists.llvm.org/pipermail/cfe-dev/2020-October/066944.html
>
> Then and now, I think the best solution in these situations is to encourage/support patches which introduce new handles in ASTConsumer to customize behavior during parsing (rather than always afterward, via HandleTranslationUnit).   This would keep consumer details from leaking upstream, and upstream details from leaking into downstream consumers, lessening maintenance burdens for everyone.
>
> In this case: suppose that wherever clang emits an error, that were changed to a call to an ASTConsumer virtual method which by default emits the error.
>
> I think the tool would then be fairly trivial: instead of issuing a type redefinition error, your tool simply deletes or omits the declaration that caused it.
>
> Would this solve the problem, or am I missing something?

David, thanks a lot for your suggestion. Currently, as in patch
https://reviews.llvm.org/D111307, the detection and handling
of redefinitions both happen in the semantic stage. The redefinition
handling is not in HandleTranslationUnit(), but rather
in the semantic analysis of parsed constructs. So regarding new
ASTConsumer handlers, do you propose to abstract the
existing D111307 handling into a virtual handler that is implemented
for the bpf target? But this is still in the semantic stage.
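
To make the question concrete, here is how the suggestion might be read, as a
toy model only. Nothing below is Clang code: ToyDecl, ToyConsumer, ToySema and
handleTypeRedefinition are hypothetical names invented for illustration, and
ASTConsumer has no such hook today. The sketch assumes the later declaration
is the one dropped, as in "deletes or omits the declaration that caused it".

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a parsed type definition (not a Clang class).
struct ToyDecl {
    std::string name;    // e.g. "task_struct"
    std::string origin;  // header that provided the definition
};

// Hypothetical consumer interface with a redefinition hook.
class ToyConsumer {
public:
    virtual ~ToyConsumer() = default;
    // Return true if the redefinition was reported as an error.
    // The default behavior mirrors today's hard error.
    virtual bool handleTypeRedefinition(const ToyDecl &previous,
                                        const ToyDecl &incoming) {
        std::cerr << "error: redefinition of '" << incoming.name << "' from "
                  << incoming.origin << " (previous definition in "
                  << previous.origin << ")\n";
        return true;
    }
};

// A bpf-specific consumer could override the hook and silently drop the
// declaration that triggered the conflict.
class DropDuplicateConsumer : public ToyConsumer {
public:
    bool handleTypeRedefinition(const ToyDecl &, const ToyDecl &) override {
        return false;  // no diagnostic; the incoming declaration is discarded
    }
};

// Hypothetical stand-in for the semantic stage: it detects the conflict and
// asks the consumer what to do instead of always emitting an error.
class ToySema {
public:
    explicit ToySema(ToyConsumer &c) : consumer(c) {}

    void actOnDefinition(const ToyDecl &d) {
        auto it = known.find(d.name);
        if (it == known.end()) {
            known.emplace(d.name, d);
            return;
        }
        if (consumer.handleTypeRedefinition(it->second, d))
            ++errors;
    }

    ToyConsumer &consumer;
    std::unordered_map<std::string, ToyDecl> known;
    int errors = 0;
};

// Feeds two conflicting definitions through the toy sema and returns the
// number of errors reported.
int countRedefinitionErrors(ToyConsumer &consumer) {
    ToySema sema(consumer);
    sema.actOnDefinition({"task_struct", "vmlinux.h"});
    sema.actOnDefinition({"task_struct", "other_header.h"});
    assert(sema.known.size() == 1);  // only the first definition is kept
    return sema.errors;
}
```

Even in this reading, the conflict is still detected inside the semantic
stage; only the decision about what to do with it moves to the consumer.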

Maybe you could elaborate on your proposed solution a little more? I
probably missed some of your points.

Thanks!

>
>
> behavior in the places where there *is* type confusion the compiler
> can detect, but the user will at least have an easier time *debugging*
> any problems from this.
>
[...]

