[LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Wangnan (F) wangnan0 at huawei.com
Tue Aug 4 19:05:07 PDT 2015


Sending again, since llvmdev has moved to llvm-dev at lists.llvm.org.

On 2015/8/5 9:58, Wangnan (F) wrote:
>
>
> On 2015/8/4 17:01, Wangnan (F) wrote:
>> For people on llvmdev:
>>
>> This mail belongs to a thread on the Linux kernel mailing list; the
>> first message can be retrieved from:
>>
>>  http://lkml.kernel.org/r/55B1535E.8090406@plumgrid.com
>>
>> Our goal is to find a way for a BPF program to get a unique ID for
>> each type, so that it can pass the ID to other parts of the kernel;
>> we can then retrieve the type and decode the structure using DWARF
>> information. Currently we have two problems to solve:
>>
>> 1. DWARF information generated by the BPF backend loses all
>>    DW_AT_name fields;
>>
>> 2. How can we get a typeid from a local variable? I tried
>>    llvm.eh_typeid_for, but it supports global variables only.
>>
>> The following is my response to Alexei.
>>
>> On 2015/8/4 3:44, Alexei Starovoitov wrote:
>>> On 7/31/15 3:18 AM, Wangnan (F) wrote:
>>>
>>
>> [SNIP]
>>
>>> didn't have time to look at it.
>>> from your llvm patches looks like you've got quite experienced
>>> with it already :)
>>>
>>>> I'll post 2 LLVM patches by replying to this mail. Please have a look
>>>> and help me send them to LLVM if you think my code is correct.
>>>
>>> patch 1:
>>> I don't quite understand the purpose of builtin_dwarf_cfa
>>> returning R11. It's a special register seen inside llvm codegen
>>> only. It doesn't have kernel meaning.
>>>
>>
>> The kernel-side verifier allows us to do arithmetic on two local
>> variable addresses, or on a local variable address and R11.
>> Therefore, we can compute the location of a local variable using:
>>
>>   mark = (char *)&my_var_a - (char *)__builtin_frame_address(0);
>>
>> If the stack allocation is fixed (i.e. the location is never reused),
>> the above 'mark' can uniquely identify a local variable. That's why
>> I'm interested in it. However, I'm not sure whether that
>> prerequisite holds.
>>
>>> patch 2:
>>> do we really need to hack clang?
>>> Can you just define a function that aliases to intrinsic,
>>> like we do for ld_abs/ld_ind ?
>>> void bpf_store_half(void *skb, u64 off, u64 val) asm("llvm.bpf.store.half");
>>> then no extra patches necessary.
>>>
>>>> struct my_str {
>>>>          int x;
>>>>          int y;
>>>> };
>>>> struct my_str __str_my_str;
>>>>
>>>> struct my_str2 {
>>>>          int x;
>>>>          int y;
>>>>          int z;
>>>> };
>>>> struct my_str2 __str_my_str2;
>>>>
>>>>          test_func(__builtin_bpf_typeid(&__str_my_str));
>>>>          test_func(__builtin_bpf_typeid(&__str_my_str2));
>>>>          mov     r1, 1
>>>>          call    4660
>>>>          mov     r1, 2
>>>>          call    4660
>>>
>>> this part looks good. I think it's usable.
>>>
>>> > 1. llvm.eh_typeid_for can be used on global variables only. So for
>>> >    each output structure we have to define a global variable.
>>>
>>> why? I think it should work with local pointers as well.
>>>
>>
>> It is defined by LLVM, in lib/CodeGen/Analysis.cpp:
>>
>> /// ExtractTypeInfo - Returns the type info, possibly bitcast, encoded in V.
>> GlobalValue *llvm::ExtractTypeInfo(Value *V) {
>>   ...
>>   assert((GV || isa<ConstantPointerNull>(V)) &&
>>          "TypeInfo must be a global variable or NULL");
>>          <-- we can use only constant pointers
>>   return GV;
>> }
>>
>> So llvm::Intrinsic::eh_typeid_for can give us the type ID of global
>> variables only.
>>
>> We may need a new intrinsic for that.
>>
>>
>>> > 2. We still need to find a way to connect the fetched typeid with
>>> >    the DWARF info. Would inserting that ID into DWARF work?
>>>
>>> hmm, that id should be the same id we're seeing in dwarf, right?
>>
>> There's no 'typeid' field in DWARF originally. I'm still looking for
>> a way to inject this ID into the DWARF information.
>>
>>> I think it's used in exception handling which is reusing some of
>>> the dwarf stuff for this, so there must be a way to connect that id
>>> to actual type info. Though last time I looked at EH was
>>> during g++ hacking days. No idea how llvm does it exactly, but
>>> I'm assuming the logic for rtti should be similar.
>>>
>>
>> I'm not sure whether RTTI uses DWARF to deduce type information. I
>> think not, because DWARF info can be stripped out.
>>
>
> Hi Alexei,
>
> I just found that llvm::Intrinsic::eh_typeid_for is function specific.
> The ID of the same type may differ between functions. Here is an
> example:
>
> static int (*bpf_output_event)(unsigned long, void *buf, int size) =
>         (void *) 0x1234;
>
> struct my_str {
>         int x;
>         int y;
> };
> struct my_str __str_my_str;
>
> struct my_str2 {
>         int x;
>         int y;
>         int z;
> };
> struct my_str2 __str_my_str2;
>
> int func(int *ctx)
> {
>         struct my_str var_a;
>         struct my_str2 var_b;
>         bpf_output_event(__builtin_bpf_typeid(&__str_my_str), &var_a,
>                          sizeof(var_a));
>         bpf_output_event(__builtin_bpf_typeid(&__str_my_str2), &var_b,
>                          sizeof(var_b));
>         return 0;
> }
>
> int func2(int *ctx)
> {
>         struct my_str var_a;
>         struct my_str2 var_b;
>
>         /* change order here */
>         bpf_output_event(__builtin_bpf_typeid(&__str_my_str2), &var_b,
>                          sizeof(var_b));
>         bpf_output_event(__builtin_bpf_typeid(&__str_my_str), &var_a,
>                          sizeof(var_a));
>         return 0;
> }
>
> This program uses __builtin_bpf_typeid (llvm::Intrinsic::eh_typeid_for)
> in func and func2 for the same two types, but in a different order. We
> expect the same type to get the same ID.
>
> Compiled with:
>
>  $ clang -target bpf -S -O2 -c ./test_bpf_typeid.c
>
> The result is:
>
>         .text
>         .globl  func
>         .align  8
> func:                                   # @func
> # BB#0:                                 # %entry
>         mov     r2, r10
>         addi    r2, -8
>         mov     r1, 1
>         mov     r3, 8
>         call    4660
>         mov     r2, r10
>         addi    r2, -24
>         mov     r1, 2
>         mov     r3, 12
>         call    4660
>         mov     r0, 0
>         ret
>
>         .globl  func2
>         .align  8
> func2:                                  # @func2
> # BB#0:                                 # %entry
>         mov     r2, r10
>         addi    r2, -24
>         mov     r1, 1                  <--- we want 2 here.
>         mov     r3, 12
>         call    4660
>         mov     r2, r10
>         addi    r2, -8
>         mov     r1, 2                  <--- we want 1 here.
>         mov     r3, 8
>         call    4660
>         mov     r0, 0
>         ret
>
>         .comm   __str_my_str,8,4        # @__str_my_str
>         .comm   __str_my_str2,12,4      # @__str_my_str2
>
>
> Conclusion: llvm::Intrinsic::eh_typeid_for is not the right
> direction...
>
> Thank you.
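
One way to avoid the per-function numbering shown in the listing above is
to derive the ID from the type's name rather than from
llvm.eh_typeid_for. The sketch below only illustrates that idea and is
not something proposed in this thread: it hashes the struct name with
32-bit FNV-1a, and the helper name bpf_type_name_id is hypothetical. It
is ordinary hosted C, not BPF code; a real solution would presumably
still need compiler support to emit the constant at compile time, plus a
way to map the ID back to the DWARF type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a constants (offset basis and prime). */
#define FNV1A_OFFSET_BASIS 2166136261u
#define FNV1A_PRIME        16777619u

/*
 * Hypothetical helper (not an LLVM or kernel API): derive a type ID
 * from the type's name, so the result depends only on the name and
 * never on the function, or the order, in which it is requested.
 */
uint32_t bpf_type_name_id(const char *type_name)
{
        uint32_t hash = FNV1A_OFFSET_BASIS;
        const unsigned char *p;

        assert(type_name != NULL);
        for (p = (const unsigned char *)type_name; *p; p++) {
                hash ^= *p;           /* mix in one byte of the name */
                hash *= FNV1A_PRIME;  /* wraps modulo 2^32 */
        }
        return hash;
}
```

With this scheme, bpf_type_name_id("my_str") yields the same value no
matter whether it is evaluated in func or func2. The trade-off is that
two different names can, in principle, collide on the same 32-bit value.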