[PATCH] D143967: [DebugInfo][BPF] Add 'btf:type_tag' annotation in DWARF

Wed Mar 29 12:38:53 PDT 2023

eddyz87 added a comment.

In D143967#4231517 <https://reviews.llvm.org/D143967#4231517>, @dfaust wrote:

> In D143967#4220233 <https://reviews.llvm.org/D143967#4220233>, @jemarch wrote:
>
>> 
>
> Ok, I understand your point. 
> For example if we have:
>
>   volatile int __tag1 b;
>
> The tag applies to the type (volatile int) and we have 3 types total:
>
> 1. __tag1 volatile int
> 2. volatile int
> 3. int
>
> And in DWARF the tag must be child of the qualifier:
>
>   var(b) -> volatile -> int
>                |
>               __tag1
>   
>    
>
> Doing it the other way would be incorrect - if we instead did this:
>
>   var(b) -> volatile -> int
>                           |
>                        __tag1
>
> We encode a different chain of types:
>
>   var(b) -> volatile -> __tag1 -> int
>
> Which reflects a different set of types than the ones we actually have:
>
> 1. volatile __tag1 int
> 2. __tag1 int
> 3. int
>
> So I guess it is clear, conceptually the tags do belong on the qualifiers.

But that's just an encoding quirk. For the C language "const volatile int *" and "volatile const int *" have the same meaning. In clang AST such qualifiers are represented as bitfields.

If we consider type tag a qualifier, conceptually it would be simpler if ordering would not matter for it as well, so that the following have identical meaning:

- `__tag volatile int`
- `volatile __tag int`

>> And now that I think about this.  In this conceptual situation:
>>
>>   const -> volatile -> annotated (foo) -> annotated (bar) -> pointer-to -> int
>>
>> I guess we are encoding it with several DW_TAG_LLVM_annotation children:
>>
>>   const -> volatile -> pointer-to -> int
>>                            |
>>                  +---------+---------+
>>                  |                   |
>>             anotated (foo)    annotated (bar)
>>
>> Basically accumulating/reducing what are conceptually two different
>> types into one.  Since typedefs are explicitly encoded in the chain as
>> their own node, this situation will only happen when there are several
>> tags specified consecutively in the source code (this is another reason
>> why putting the annotation DIEs as children of the qualifiers is
>> necessary, because otherwise we would be accumulating/reducing
>> annotation across these nodes and we could end with situations where
>> annotations get lost.)
>>
>> When accumulating/reducign tags like above:
>>
>> Does the ordering matter here?  Does it matter?  Should we require
>> keeping the order among the annotation siblings as part of the
>> DW_TAG_LLVM_annotation specification?  Even if we don't, we probably
>> want to behave the same in both compilers.
>
> A quick check suggests order does not matter for DWARF for normal qualifiers:
>
> const volatile int x;
> volatile const int y;
>
> Both compilers emit something like this ('x' and 'y' share type):
>
>   0x0000001e:   DW_TAG_variable
>                   DW_AT_name	("x")
>                   DW_AT_type	(0x00000029 "const volatile int")
>   
>   0x00000029:   DW_TAG_const_type
>                   DW_AT_type	(0x0000002e "volatile int")
>   
>   0x0000002e:   DW_TAG_volatile_type
>                   DW_AT_type	(0x00000033 "int")
>   
>   0x00000033:   DW_TAG_base_type
>                   DW_AT_name	("int")
>   
>   0x00000037:   DW_TAG_variable
>                   DW_AT_name	("y")
>                   DW_AT_type	(0x00000029 "const volatile int")
>
> Interestingly, GCC and LLVM do not behave exactly the same in that in 
> GCC the DWARF types for both variables are
>
>   x, y   -> volatile -> const -> int
>
> while in LLVM (shown above) it is
>
>   x, y -> const -> volatile -> int
>
> So I'm not sure we necessarily need both compilers to be exactly the same.

Unfortunately, for ordering we are also limited by a consumer. In the kernel case the ordering of const/volatile/restrict does not matter, but it does want type tags to come first.

>> In case two identical annotations are applied to the same type, are we
>> deduplicating them?  Should we actually require that?  Even if we don't,
>> we probably want to behave the same in both compilers.
>
> FWIW, if you write something like:
>
>   int __attribute__((btf_type_tag("__c"))) __attribute__((btf_type_tag("__c"))) c;
>
> GCC de-duplicates attributes with the same value, so the tag "__c" will only appear
> once. I don't know if this would be easy to change if we wanted to.

LLVM de-duplicates as well.

AFAIK current type tag use cases do not require to distinguish "const __tag" from "__tag const" and do not need to handle "__tag __tag" separately from "__tag". So, I think we should stick with simpler semantics.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143967/new/

https://reviews.llvm.org/D143967