[llvm-dev] put "str" in __attribute__((annotate("str"))) to dwarf

Y Song via llvm-dev llvm-dev at lists.llvm.org
Mon Jun 14 14:33:45 PDT 2021


On Mon, Jun 14, 2021 at 1:25 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Mon, Jun 14, 2021 at 12:25 PM Y Song <ys114321 at gmail.com> wrote:
>>
>> On Fri, Jun 11, 2021 at 9:59 AM Alexei Starovoitov
>> <alexei.starovoitov at gmail.com> wrote:
>> >
>> > On Fri, Jun 11, 2021 at 07:17:32AM -0400, Aaron Ballman wrote:
>> > > On Thu, Jun 10, 2021 at 8:47 PM Alexei Starovoitov
>> > > <alexei.starovoitov at gmail.com> wrote:
>> > > >
>> > > > On Thu, Jun 10, 2021 at 12:42 PM David Blaikie <dblaikie at gmail.com> wrote:
>> > > > >
>> > > > >> >
>> > > > >> >
>> > > > >> > Any suggestions/preferences for the spelling, Aaron?
>> > > > >>
>> > > > >> I don't know this domain particularly well, so take these suggestions
>> > > > >> with a giant grain of salt:
>> > > > >>
>> > > > >> If the concept is specific to DWARF and you don't think it'll need to
>> > > > >> extend into other debug formats, you could go with `dwarf_annotate`.
>> > > > >> If it's not really a DWARF thing but is more about B[P|T]F, then
>> > > > >> `btf_annotate`  or `bpf_annotate` could work, but those may be a bit
>> > > > >> mysterious to folks outside of the domain. If it's a generic debug
>> > > > >> info concept, probably `debug_info_annotate` or something.
>> > > > >
>> > > > >
>> > > > > Arguably it can/could be a generic debug info or dwarf thing, but for now we don't have any use for it other than to squirrel info along to BTF/BPF so I'm on the fence about which prefix to use exactly
>> > > > >
>> > > >
>> > > > A bit more bike shedding colors...
>> > > >
>> > > > The __rcu and __user annotations might be used by clang itself eventually.
>> > > > Currently the "sparse" tool is doing this analysis and warns users
>> > > > when __rcu pointer is incorrectly accessed in the kernel C code.
>> > > > If clang can do that directly that could be a huge selling point
>> > > > for folks to switch from gcc to clang for kernel builds.
>> > > > The front-end would treat such annotations as arbitrary strings, but a
>> > > > special "building-linux-kernel-pass" would interpret their semantic context.
>> > >
>> > > Are __rcu and __user annotations notionally distinct things from bpf
>> > > (and perhaps each other as well)? Distinct enough that it would make
>> > > sense to use a different attribute name for user source for each need?
>> > > I suspect the answer is yes given that the existing annotations have
>> > > their own names which are distinct, but I don't know this domain
>> > > enough to be sure.
>> >
>> > __rcu and __user don't overlap. __rcu is not a single annotation though.
>> > It's a combination of annotations in pointers, functions, macros.
>> > Some functions have:
>> > __acquires(rcu)
>> > another function might have:
>> > __acquires(rcu_bh)
>> > There are several flavors of the RCU in the kernel.
>> > So a single __attribute__((rcu_annotate("foo"))) won't work even within RCU scope.
>> > But if we do:
>> > struct foo {
>> >   void * __attribute__((tag("ptr.rcu_bh"))) ptr;
>> > };
>> > int bar(int) __attribute__((tag("acquires.rcu_bh"))) { ... }
>> > int baz(int) __attribute__((tag("releases.rcu_bh"))) { ... }
>> > int qux(int) __attribute__((tag("acquires.rcu_sched"))) { ... }
>> > ...
>> > The clang pass can parse these strings and correlate one tag to another.
>> > RCU flavors come and go, so clang cannot hard code the names.
>>
>> Maybe we can name it "bpf_tag", since it is a "tag" for the "bpf" use case?
>>
>> David, in one of your early emails, you mentioned:
>>
>> ===
>> Arguably it can/could be a generic debug info or dwarf thing, but for
>> now we don't have any use for it other than to squirrel info along to
>> BTF/BPF so I'm on the fence about which prefix to use exactly
>> ===
>>
>> and suggested that, since it might be used in the future for non-bpf things,
>> maybe the name could be a little more generic than bpf-specific.
>>
>> Do you have any suggestions on what name to pick?
>
>
> Nah, not especially. bpf_tag sounds OK-ish to me if it suits you.

Sounds good. I will use "bpf_tag" as the starting point for now.
Also, "bpf_tag" may appear multiple times on the same
function, declaration, etc.

For example,
  #define __bpf_tag(s) __attribute__((bpf_tag(s)))
  int g __bpf_tag("str1") __bpf_tag("str2");
Let us say we introduce an LLVM vendor tag DWARF_AT_LLVM_bpf_tag.

How do you want the above to be represented in dwarf?

My current scheme is to put all bpf_tag's in a string, separated by ",".
This will make things simpler. So the final output will be
     DWARF_AT_LLVM_bpf_tag  "str1,str2"
I may need to discuss with the kernel folks whether to use a different
delimiter than ",", but we would still represent all tags with ONE string.

But alternatively, it could be represented as a list of strings like
   DWARF_AT_LLVM_bpf_tag
             "str1"
             "str2"
which is similar to DWARF_AT_location.

The first internal representation
   DWARF_AT_LLVM_bpf_tag  "str1,str2"
should be easier for IR/bitcode read/write and dwarf parsing.
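
To make the trade-off of the single-string scheme concrete, here is a minimal
sketch in C of joining several tags into one delimited string and splitting it
back. The helper names join_tags and split_tags are hypothetical and are not
part of LLVM or the kernel; the only assumption is a one-character delimiter.
The sketch rejects any tag that itself contains the delimiter, since such a
tag could not be recovered unambiguously from the joined string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: join n tags into out, separated by delim.
 * Returns 0 on success, or -1 if a tag contains the delimiter
 * (the joined string would be ambiguous) or if out is too small. */
static int join_tags(const char *const *tags, size_t n, char delim,
                     char *out, size_t out_size)
{
    size_t used = 0;

    assert(delim != '\0');
    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(tags[i]);
        size_t need = len + (i > 0 ? 1 : 0); /* tag plus leading delimiter */

        if (strchr(tags[i], delim) != NULL)
            return -1;
        if (used + need + 1 > out_size)      /* +1 for the terminating NUL */
            return -1;
        if (i > 0)
            out[used++] = delim;
        memcpy(out + used, tags[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}

/* Hypothetical helper: split joined in place at each delim.
 * Stores up to max_parts pointers into parts and returns the count. */
static size_t split_tags(char *joined, char delim,
                         char **parts, size_t max_parts)
{
    size_t count = 0;
    char *p = joined;

    if (*p == '\0')
        return 0;
    while (count < max_parts) {
        char *d;

        parts[count++] = p;
        d = strchr(p, delim);
        if (d == NULL)
            break;
        *d = '\0';   /* terminate this tag where the delimiter was */
        p = d + 1;
    }
    return count;
}
```

With the tags "str1" and "str2" and the delimiter ',', join_tags produces
"str1,str2", and split_tags recovers the two original strings. A tag such as
"a,b" is rejected, which is why the choice of delimiter matters for the
single-string form and does not arise with a list of strings.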

What do you think?

