[PATCH] D143967: [DebugInfo][BPF] Add 'btf:type_tag' annotation in DWARF

Jose E. Marchesi via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 24 10:49:26 PDT 2023


> eddyz87 added a comment.
>
> In D143967#4200331 <https://reviews.llvm.org/D143967#4200331>, @dfaust wrote:
>
>> In D143967#4197288 <https://reviews.llvm.org/D143967#4197288>, @eddyz87 wrote:
>>
>>> ...
>>> I think there are two sides to this:
>>>
>>> - conceptual: is it ok to allow annotations for CVR modifiers?
>>
>> Maybe someone more an expert in DWARF could chime in whether this is problematic?
>> As far as I know it would be "ok" in the sense that it should not
>> break DWARF or cause issues for DWARF consumers which do not know
>> about the tags.
>>
>> My initial reaction was that placing the annotations on CVR
>> modifiers makes less sense conceptually than placing them on the
>> types proper.
>
> I agree conceptually.

That was my initial reaction too.  But see below.

>> But, I suppose it is a question of what precisely the annotated type
>> is. e.g. in `volatile int __attribute__((btf_type_tag("__b"))) b;`
>> is it the case that:
>>
>> - type tag "__b" applies to the type `volatile int` (i.e. cvr quals  "bind more closely" than the tag), or
>> - type tag "__b" and `volatile` both apply to `int` (i.e. cvr quals and type tags are "equal")
>>
>> For all other cases the new "btf:type_tag" format places the
>> annotation as a child of exactly the type that is annotated, so I
>> think the answer determines where the annotation logically belongs
>> in DWARF:
>>
>> - If the tag applies to `volatile int` then it should be a child of the volatile DIE.
>> - If the tag applies to `int` it should be a child of the integer type DIE.

I think that is the main point.  DWARF composes types by chaining DIEs
via DW_AT_type attributes, like in:

   const -> volatile -> pointer-to -> int

Conceptually, there are four different types in that chain, not just
one:

1. const volatile pointer-to int
2. volatile pointer-to int
3. pointer-to int
4. int

Ideally speaking, we would have liked to implement the annotations just
like the qualifiers, so we could have something like:

  const -> volatile -> annotated -> pointer-to -> int

That is what would suite us best conceptually speaking.  But since that
would break DWARF consumers, we decided to go with a parent-child
relationship instead, having:

  const -> volatile -> pointer-to -> int
                           |
                       annotated

Which would be a different situation than, say:

  const -> volatile -> pointer-to -> int
    |
 annotated

So, I think it only makes sense to have the child annotation node in the
node that starts the type to which the annotation refers.  Both
qualifiers and type DIEs are types on that sense.

And now that I think about this.  In this conceptual situation:

  const -> volatile -> annotated (foo) -> annotated (bar) -> pointer-to -> int

I guess we are encoding it with several DW_TAG_LLVM_annotation children:

  const -> volatile -> pointer-to -> int
                           |
                 +---------+---------+
                 |                   |
            anotated (foo)    annotated (bar)

Basically accumulating/reducing what are conceptually two different
types into one.  Since typedefs are explicitly encoded in the chain as
their own node, this situation will only happen when there are several
tags specified consecutively in the source code (this is another reason
why putting the annotation DIEs as children of the qualifiers is
necessary, because otherwise we would be accumulating/reducing
annotation across these nodes and we could end with situations where
annotations get lost.)

When accumulating/reducign tags like above:

Does the ordering matter here?  Does it matter?  Should we require
keeping the order among the annotation siblings as part of the
DW_TAG_LLVM_annotation specification?  Even if we don't, we probably
want to behave the same in both compilers.

In case two identical annotations are applied to the same type, are we
deduplicating them?  Should we actually require that?  Even if we don't,
we probably want to behave the same in both compilers.

(Note that since DW_TAG_typedef is also a type in the type chain, we
 shouldn't worry about 


> I'm aware of three use-cases for "btf_type_tag": "percpu", "user",
> "rcu" marks in kernel code. First is used for per-cpu variables,
> second is used to mark pointers to userspace memory, third to mark the
> data that should be read with RCU lock. For these use-cases there is
> no difference between e.g. `const __user *ptr` and `__user const
> *ptr`. I don't think the difference was ever thought through.
>
>> At the moment I can't say that one is more correct than the other,
>> so I guess I have no real objection placing the annotation on the
>> CVR modifier.
>>
>>> - technical: what is the point where such reordering is easiest to
>>> implement? For LLVM / pahole stack the path of least resistance is
>>> to modify DWARF generation (but again @dblaikie might
>>> disagree). How hard is it to adjust DI generation in GCC in this
>>> way?
>>
>> It is not a difficult change in GCC. Some added complexity but not
>> too much. I have a prototype that seems to work from initial
>> testing.
>
> Same here, it is a small change in clang code, but I don't like that it has to be handled as a special case.
>
>> It is probably simpler than instead modifying the internal
>> transformation to BTF to reorder tags/CVR qualifiers as kernel
>> currently requires.
>>
>> But I don't think ease of implementation should be a factor unless we are sure the format makes sense.
>> We are changing the format because the old one was flawed, let's try
>> to make sure we aren't just replacing it with something flawed in a
>> different way because it is easy.
>
> The main issue is that kernel currently expects type tags before
> modifiers. Even if the kernel code would be modified there is still an
> issue of backwards compatibility. During kernel build BTF is generated
> from DWARF and is embedded in a kernel image, it is done by tool named
> pahole <https://github.com/acmel/dwarves>. If we forgo modifiers
> reordering the embedded BTF would not match format supported by
> kernel.
>
> So, I assume that reordering _has_ to happen somewhere. Either at DWARF generation stage, or in two places:
>
> - compiler BPF backend, when BPF programs are compiled;
> - `pahole`, when BTF is generated from DWARF.
>
> Let me prototype pahole change to see how much of hassle that is, it will take a day or two.
>
>
> Repository:
>   rG LLVM Github Monorepo
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D143967/new/
>
> https://reviews.llvm.org/D143967


More information about the cfe-commits mailing list