[llvm-dev] [cfe-dev] put "str" in __attribute__((annotate("str"))) to dwarf
David Blaikie via llvm-dev
llvm-dev at lists.llvm.org
Mon Jun 14 18:44:18 PDT 2021
On Mon, Jun 14, 2021 at 4:54 PM David Rector <davrecthreads at gmail.com>
wrote:
>
>
> On Jun 14, 2021, at 5:33 PM, Y Song via cfe-dev <cfe-dev at lists.llvm.org>
> wrote:
>
> On Mon, Jun 14, 2021 at 1:25 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
>
> On Mon, Jun 14, 2021 at 12:25 PM Y Song <ys114321 at gmail.com> wrote:
>
>
> On Fri, Jun 11, 2021 at 9:59 AM Alexei Starovoitov
> <alexei.starovoitov at gmail.com> wrote:
>
>
> On Fri, Jun 11, 2021 at 07:17:32AM -0400, Aaron Ballman wrote:
>
> On Thu, Jun 10, 2021 at 8:47 PM Alexei Starovoitov
> <alexei.starovoitov at gmail.com> wrote:
>
>
> On Thu, Jun 10, 2021 at 12:42 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
>
> Any suggestions/preferences for the spelling, Aaron?
>
>
> I don't know this domain particularly well, so take these suggestions
> with a giant grain of salt:
>
> If the concept is specific to DWARF and you don't think it'll need to
> extend into other debug formats, you could go with `dwarf_annotate`.
> If it's not really a DWARF thing but is more about B[P|T]F, then
> `btf_annotate` or `bpf_annotate` could work, but those may be a bit
> mysterious to folks outside of the domain. If it's a generic debug
> info concept, probably `debug_info_annotate` or something.
>
>
>
> Arguably it can/could be a generic debug info or dwarf thing, but for now
> we don't have any use for it other than to squirrel info along to BTF/BPF
> so I'm on the fence about which prefix to use exactly
>
>
> A bit more bike shedding colors...
>
> The __rcu and __user annotations might eventually be used by clang
> itself.
> Currently the "sparse" tool does this analysis and warns users
> when an __rcu pointer is incorrectly accessed in the kernel C code.
> If clang can do that directly, that could be a huge selling point
> for folks to switch from gcc to clang for kernel builds.
> The front-end would treat such annotations as arbitrary strings, but
> a special "building-linux-kernel-pass" would interpret their semantic
> context.
>
>
> Are __rcu and __user annotations notionally distinct things from bpf
> (and perhaps each other as well)? Distinct enough that it would make
> sense to use a different attribute name for user source for each need?
> I suspect the answer is yes given that the existing annotations have
> their own names which are distinct, but I don't know this domain
> enough to be sure.
>
>
> __rcu and __user don't overlap. __rcu is not a single annotation though.
> It's a combination of annotations in pointers, functions, macros.
> Some functions have:
> __acquires(rcu)
> another function might have:
> __acquires(rcu_bh)
> There are several flavors of the RCU in the kernel.
> So a single __attribute__((rcu_annotate("foo"))) won't work even within
> RCU scope.
> But if we do:
> struct foo {
> void * __attribute__((tag("ptr.rcu_bh"))) ptr;
> };
> int bar(int) __attribute__((tag("acquires.rcu_bh"))) { ... }
> int baz(int) __attribute__((tag("releases.rcu_bh"))) { ... }
> int qux(int) __attribute__((tag("acquires.rcu_sched"))) { ... }
> ...
> The clang pass can parse these strings and correlate one tag to another.
> RCU flavors come and go, so clang cannot hard code the names.
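To make the "parse these strings and correlate one tag to another" idea concrete, here is a minimal sketch in plain C. It assumes the "kind.flavor" layout used in the examples above; the names `split_tag` and `tags_pair_up` and the 32-byte field limit are hypothetical and are not part of clang or of any proposed pass.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the two halves of a "kind.flavor" tag string. */
struct tag_parts {
    char kind[32];
    char flavor[32];
};

/* Split "acquires.rcu_bh" into kind="acquires" and flavor="rcu_bh".
 * Returns false if there is no '.', or if either half is empty or too long. */
static bool split_tag(const char *tag, struct tag_parts *out)
{
    const char *dot = strchr(tag, '.');
    if (!dot)
        return false;

    size_t klen = (size_t)(dot - tag);
    size_t flen = strlen(dot + 1);
    if (klen == 0 || flen == 0 ||
        klen >= sizeof out->kind || flen >= sizeof out->flavor)
        return false;

    memcpy(out->kind, tag, klen);
    out->kind[klen] = '\0';
    memcpy(out->flavor, dot + 1, flen + 1); /* copies the terminating NUL too */
    return true;
}

/* True when `a` is an "acquires" tag and `b` is a "releases" tag for the
 * same flavor. The flavor is compared as an opaque string, so no flavor
 * name is hard coded here. */
static bool tags_pair_up(const char *a, const char *b)
{
    struct tag_parts pa, pb;

    if (!split_tag(a, &pa) || !split_tag(b, &pb))
        return false;
    return strcmp(pa.kind, "acquires") == 0 &&
           strcmp(pb.kind, "releases") == 0 &&
           strcmp(pa.flavor, pb.flavor) == 0;
}
```

With these helpers, "acquires.rcu_bh" pairs with "releases.rcu_bh" but not with a release of a different flavor, and new flavors need no code change because the flavor is never interpreted.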
>
>
> Maybe we can name it as "bpf_tag" as it is a "tag" for "bpf" use case?
>
> David, in one of your early emails, you mentioned:
>
> ===
> Arguably it can/could be a generic debug info or dwarf thing, but for
> now we don't have any use for it other than to squirrel info along to
> BTF/BPF so I'm on the fence about which prefix to use exactly
> ===
>
> and suggested that, since it might be used in the future for non-bpf
> things, maybe the name could be a little more generic than bpf-specific.
>
> Do you have any suggestions on what name to pick?
>
>
>
> Nah, not especially. bpf_tag sounds OK-ish to me if it suits you.
>
>
>
> The more generic the better IMO. And, the less the need to parse string
> literals the better.
>
> Why not simply `__attribute__((debuginfo("arg1", "arg2", ...)))`, e.g.:
>
> ```
> #define BPF_TAG(...) __attribute__((debuginfo("bpf", __VA_ARGS__)))
> struct foo {
> void * BPF_TAG("ptr","rcu","bh") ptr;
> };
> #define BPF_RCU_TAG(PFX, ...) BPF_TAG(PFX, "rcu", __VA_ARGS__)
> int bar(int) BPF_RCU_TAG("acquires","bh") { ... }
> int baz(int) BPF_RCU_TAG("releases","bh") { ... }
> int qux(int) BPF_RCU_TAG("acquires","sched") { ... }
> ```
>
Unless Paul & Adrian, etc. chime in in favor of a more general name like
'debuginfo', I'm inclined to avoid that and go with something bpf-specific
until there's a broader use case/proposal, ideally something we might be
able to (and want to) encourage GCC to support too. Otherwise we're taking a
pretty broad attribute name and choosing its behavior when we don't
necessarily have a lot of leverage if GCC ends up using that name for
something else.

As for separate strings: maybe, but I'm not sure what that'll look like in
the resulting DWARF. What sort of form would you propose using to encode
that? (Same question below.)
>
> Sounds good. I will use "bpf_tag" as the starting point now.
> Also, it is possible "bpf_tag" may appear multiple times for the same
> function, declaration etc.
>
> For example,
> #define __bpf_tag(s) __attribute__((bpf_tag(s)))
> int g __bpf_tag("str1") __bpf_tag("str2");
> Let us say we introduce an LLVM vendor attribute DWARF_AT_LLVM_bpf_tag.
>
> How do you want the above to be represented in dwarf?
>
> My current scheme is to put all bpf_tag's in a string, separated by ",".
> This will make things simpler. So the final output will be
> DWARF_AT_LLVM_bpf_tag "str1,str2"
> I may need to discuss with the kernel folks whether to use a different
> delimiter than ",", but we would still represent all tags with ONE string.
>
> But alternatively, it could be represented as a list of strings like
> DWARF_AT_LLVM_bpf_tag
> "str1"
> "str2"
> which is similar to DWARF_AT_location.
>
>
What DWARF form were you thinking of using for this? There isn't a built-in
form that I know of that encodes multiple delimited/separated strings.
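For reference, the single-string scheme ("str1,str2") can be sketched in plain C as below. This is only an illustration of the encoding being discussed, not of how clang or LLVM would implement it; `join_tags` and `count_tags` are hypothetical names, and rejecting a tag that contains the delimiter is an assumption about how the ambiguity would be handled.

```c
#include <stddef.h>
#include <string.h>

/* Join n tags into buf, separated by delim (assumed to be non-NUL).
 * Returns 0 on success, or -1 if buf is too small, a tag is empty, or a
 * tag contains the delimiter (which would make the result ambiguous). */
static int join_tags(const char *const *tags, size_t n, char delim,
                     char *buf, size_t cap)
{
    size_t used = 0;

    if (cap == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(tags[i]);
        size_t need = len + (i ? 1 : 0); /* one extra byte for the delimiter */

        if (len == 0 || strchr(tags[i], delim))
            return -1;
        if (used + need + 1 > cap)       /* +1 for the terminating NUL */
            return -1;
        if (i)
            buf[used++] = delim;
        memcpy(buf + used, tags[i], len);
        used += len;
        buf[used] = '\0';
    }
    return 0;
}

/* Count the tags in a joined string; an empty string holds zero tags. */
static size_t count_tags(const char *joined, char delim)
{
    size_t n;

    if (*joined == '\0')
        return 0;
    n = 1;
    for (const char *p = joined; *p; p++)
        if (*p == delim)
            n++;
    return n;
}
```

Joining "str1" and "str2" with ',' gives "str1,str2", while a tag such as "a,b" is rejected. That rejection is the practical cost of the one-string encoding: whichever delimiter is picked can no longer appear inside a tag.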
>
> The first internal representation
> DWARF_AT_LLVM_bpf_tag "str1,str2"
> should be easier for IR/bitcode read/write and dwarf parsing.
>
> What do you think?