[cfe-dev] [llvm-dev] put "str" in __attribute__((annotate("str"))) to dwarf

Adrian Prantl via cfe-dev cfe-dev at lists.llvm.org
Wed Jun 16 09:43:20 PDT 2021



> On Jun 14, 2021, at 6:44 PM, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> 
> 
> On Mon, Jun 14, 2021 at 4:54 PM David Rector <davrecthreads at gmail.com> wrote:
> 
> 
>> On Jun 14, 2021, at 5:33 PM, Y Song via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>> 
>> On Mon, Jun 14, 2021 at 1:25 PM David Blaikie <dblaikie at gmail.com> wrote:
>>> 
>>> 
>>> 
>>> On Mon, Jun 14, 2021 at 12:25 PM Y Song <ys114321 at gmail.com> wrote:
>>>> 
>>>> On Fri, Jun 11, 2021 at 9:59 AM Alexei Starovoitov
>>>> <alexei.starovoitov at gmail.com> wrote:
>>>>> 
>>>>> On Fri, Jun 11, 2021 at 07:17:32AM -0400, Aaron Ballman wrote:
>>>>>> On Thu, Jun 10, 2021 at 8:47 PM Alexei Starovoitov
>>>>>> <alexei.starovoitov at gmail.com> wrote:
>>>>>>> 
>>>>>>> On Thu, Jun 10, 2021 at 12:42 PM David Blaikie <dblaikie at gmail.com> wrote:
>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Any suggestions/preferences for the spelling, Aaron?
>>>>>>>>> 
>>>>>>>>> I don't know this domain particularly well, so take these suggestions
>>>>>>>>> with a giant grain of salt:
>>>>>>>>> 
>>>>>>>>> If the concept is specific to DWARF and you don't think it'll need to
>>>>>>>>> extend into other debug formats, you could go with `dwarf_annotate`.
>>>>>>>>> If it's not really a DWARF thing but is more about B[P|T]F, then
>>>>>>>>> `btf_annotate`  or `bpf_annotate` could work, but those may be a bit
>>>>>>>>> mysterious to folks outside of the domain. If it's a generic debug
>>>>>>>>> info concept, probably `debug_info_annotate` or something.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Arguably it can/could be a generic debug info or dwarf thing, but for now we don't have any use for it other than to squirrel info along to BTF/BPF so I'm on the fence about which prefix to use exactly
>>>>>>>> 
>>>>>>> 
>>>>>>> A bit more bike shedding colors...
>>>>>>> 
>>>>>>> The __rcu and __user annotations might eventually be used by clang itself.
>>>>>>> Currently the "sparse" tool does this analysis and warns users
>>>>>>> when an __rcu pointer is incorrectly accessed in the kernel C code.
>>>>>>> If clang could do that directly, it could be a huge selling point
>>>>>>> for folks to switch from gcc to clang for kernel builds.
>>>>>>> The front-end would treat such annotations as arbitrary strings, but
>>>>>>> a special "building-linux-kernel-pass" would interpret their semantic context.
>>>>>> 
>>>>>> Are __rcu and __user annotations notionally distinct things from bpf
>>>>>> (and perhaps each other as well)? Distinct enough that it would make
>>>>>> sense to use a different attribute name for user source for each need?
>>>>>> I suspect the answer is yes given that the existing annotations have
>>>>>> their own names which are distinct, but I don't know this domain
>>>>>> enough to be sure.
>>>>> 
>>>>> __rcu and __user don't overlap. __rcu is not a single annotation though.
>>>>> It's a combination of annotations in pointers, functions, macros.
>>>>> Some functions have:
>>>>> __acquires(rcu)
>>>>> another function might have:
>>>>> __acquires(rcu_bh)
>>>>> There are several flavors of the RCU in the kernel.
>>>>> So a single __attribute__((rcu_annotate("foo"))) won't work even within RCU scope.
>>>>> But if we do:
>>>>> struct foo {
>>>>>  void * __attribute__((tag("ptr.rcu_bh"))) ptr;
>>>>> };
>>>>> int bar(int) __attribute__((tag("acquires.rcu_bh"))) { ... }
>>>>> int baz(int) __attribute__((tag("releases.rcu_bh"))) { ... }
>>>>> int qux(int) __attribute__((tag("acquires.rcu_sched"))) { ... }
>>>>> ...
>>>>> The clang pass can parse these strings and correlate one tag to another.
>>>>> RCU flavors come and go, so clang cannot hard code the names.
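
To make the "parse these strings and correlate one tag to another" idea concrete, here is a minimal sketch in plain C. It assumes the tag layout from the example above ("kind.flavor", split at the first '.'); the helper names split_tag, kind_is and tags_pair_up are hypothetical and are not part of clang or any existing pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of a tag such as "acquires.rcu_bh": the part before
 * the first '.' is the kind, the part after it is the RCU flavor. */
struct tag_parts {
    const char *kind;    /* points into the original string, not NUL-terminated */
    size_t kind_len;     /* number of characters in the kind */
    const char *flavor;  /* NUL-terminated remainder after the '.' */
};

/* Returns false if the tag has no '.' or if either part is empty. */
static bool split_tag(const char *tag, struct tag_parts *out)
{
    assert(tag != NULL && out != NULL);
    const char *dot = strchr(tag, '.');
    if (dot == NULL || dot == tag || dot[1] == '\0')
        return false;
    out->kind = tag;
    out->kind_len = (size_t)(dot - tag);
    out->flavor = dot + 1;
    return true;
}

/* Compares the (unterminated) kind against a NUL-terminated name. */
static bool kind_is(const struct tag_parts *p, const char *kind)
{
    return strlen(kind) == p->kind_len &&
           strncmp(p->kind, kind, p->kind_len) == 0;
}

/* True when the first tag acquires and the second releases the same flavor.
 * The flavor is compared as an opaque string, so no flavor name is hard coded. */
static bool tags_pair_up(const char *acquire_tag, const char *release_tag)
{
    struct tag_parts a, r;
    if (!split_tag(acquire_tag, &a) || !split_tag(release_tag, &r))
        return false;
    return kind_is(&a, "acquires") && kind_is(&r, "releases") &&
           strcmp(a.flavor, r.flavor) == 0;
}
```

With this, tags_pair_up("acquires.rcu_bh", "releases.rcu_bh") holds, while pairing "acquires.rcu_sched" with "releases.rcu_bh" does not, and a new flavor needs no change to the checker.
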
>>>> 
>>>> Maybe we can name it as "bpf_tag" as it is a "tag" for "bpf" use case?
>>>> 
>>>> David, in one of your early emails, you mentioned:
>>>> 
>>>> ===
>>>> Arguably it can/could be a generic debug info or dwarf thing, but for
>>>> now we don't have any use for it other than to squirrel info along to
>>>> BTF/BPF so I'm on the fence about which prefix to use exactly
>>>> ===
>>>> 
>>>> and suggested that, since it might be used in the future for non-bpf things,
>>>> maybe the name could be a little more generic than bpf-specific.
>>>> 
>>>> Do you have any suggestions on what name to pick?
>>> 
>>> 
>>> Nah, not especially. bpf_tag sounds OK-ish to me if it suits you.
>> 
> 
> The more generic the better IMO.  And, the less the need to parse string literals the better.  
> 
> Why not simply `__attribute__((debuginfo("arg1", "arg2", ...)))`, e.g.:
> 
> ```
> #define BPF_TAG(...) __attribute__((debuginfo("bpf", __VA_ARGS__)))
> struct foo {
>  void * BPF_TAG("ptr","rcu","bh") ptr;
> };
> #define BPF_RCU_TAG(PFX, ...) BPF_TAG(PFX, "rcu", __VA_ARGS__)
> int bar(int) BPF_RCU_TAG("acquires","bh") { ... }
> int baz(int) BPF_RCU_TAG("releases","bh") { ... }
> int qux(int) BPF_RCU_TAG("acquires","sched") { ... }
> ```
> 
> Unless Paul & Adrian, etc chime in in agreement of a more general name, like 'debuginfo', I'm inclined to avoid that/go with something bpf specific until there's a broader use case/proposal, something we might be able to/want to encourage GCC to support too. Otherwise we're taking a pretty broad attribute name & choosing its behavior when we don't necessarily have a lot of leverage if GCC ends up using that name for something else.

There are definitely use-cases for threading a general string attribute through LLVM IR all the way to DWARF. Recently I thought about how to best encode API Swiftification attributes (e.g., https://developer.apple.com/documentation/swift/objective-c_and_c_code_customization/renaming_objective-c_apis_for_swift) in DWARF. These are Clang attributes put on (Objective-)C type declarations that tell the Swift compiler how to map C and Objective-C types into Swift. The attributes range from nullability of pointers to renaming types to better fit into the Swift world. Having a generic DWARF facility to encode any Clang __attribute__(()) declaration would make this very straightforward to implement.

Maybe this is a good opportunity to design a generic mechanism that works for all attributes? We probably need to add a little more structure than just encoding a single string with the attribute contents to make the encoding more efficient, but we could probably have something generic enough to be useful across many use-cases.

Is there any interest in attempting this, or would you prefer targeted extensions for each kind of attribute?

-- adrian

> 
> & as for separate strings - maybe, but I'm not sure what that'll look like in the resulting DWARF, what sort of form would you propose using to encode that? (same question below \/)
>  
> 
>> Sounds good. I will use "bpf_tag" as the starting point now.
>> Also, it is possible "bpf_tag" may appear multiple times for the same
>> function, declaration etc.
>> 
>> For example,
>>  #define __bpf_tag(s) __attribute__((bpf_tag(s)))
>>  int g __bpf_tag("str1") __bpf_tag("str2");
>> Let us say we introduced an LLVM vendor attribute DWARF_AT_LLVM_bpf_tag.
>> 
>> How do you want the above to be represented in dwarf?
>> 
>> My current scheme is to put all bpf_tag's in a string, separated by ",".
>> This will make things simpler. So the final output will be
>>     DWARF_AT_LLVM_bpf_tag  "str1,str2"
>> I may need to discuss with the kernel folks whether to use a different
>> delimiter than ",", but we would still represent all tags with ONE string.
>> 
>> But alternatively, it could be represented as a list of strings like
>>   DWARF_AT_LLVM_bpf_tag
>>             "str1"
>>             "str2"
>> which would be similar to DWARF_AT_location.
> 
> 
> What DWARF form were you thinking of using for this? There isn't a built in form that provides encoding for multiple delimited/separated strings that I know of.
>  
>> 
>> The first internal representation
>>   DWARF_AT_LLVM_bpf_tag  "str1,str2"
>> should be easier for IR/bitcode read/write and dwarf parsing.
>> 
>> What do you think?
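
For reference, the single-string scheme quoted above can be sketched in a few lines of C. This is only an illustration under stated assumptions: join_tags and count_tags are hypothetical helpers, not LLVM code, and rejecting tags that contain the delimiter is one possible policy rather than anything decided in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Join n tags into buf as "t0<delim>t1<delim>...".
 * Returns 0 on success, or -1 if buf is too small or a tag contains the
 * delimiter, since such a tag could not be recovered unambiguously. */
static int join_tags(const char *const *tags, size_t n, char delim,
                     char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0 && delim != '\0');
    size_t used = 0;
    buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (strchr(tags[i], delim) != NULL)
            return -1;
        size_t len = strlen(tags[i]);
        size_t need = len + (i > 0 ? 1 : 0);   /* tag plus optional delimiter */
        if (used + need + 1 > buflen)          /* +1 for the trailing NUL */
            return -1;
        if (i > 0)
            buf[used++] = delim;
        memcpy(buf + used, tags[i], len);
        used += len;
        buf[used] = '\0';
    }
    return 0;
}

/* Count the tags in a joined string; an empty string holds zero tags. */
static size_t count_tags(const char *joined, char delim)
{
    if (joined[0] == '\0')
        return 0;
    size_t n = 1;
    for (const char *p = joined; *p != '\0'; p++)
        if (*p == delim)
            n++;
    return n;
}
```

Joining "str1" and "str2" with ',' yields "str1,str2", which a reader splits back into two tags. A tag such as "a,b" is rejected here, which is the reason the choice of delimiter matters for the one-string form.
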
