[lldb-dev] DWARFASTParserClang and DW_TAG_typedef for anonymous structs
Luke Drummond via lldb-dev
lldb-dev at lists.llvm.org
Thu Mar 10 10:20:06 PST 2016
Hi Greg
First of all thanks for taking the time to help out with this.
On 10/03/16 00:18, Greg Clayton wrote:
> So we ran into a problem where we had anonymous structs in modules. They have no name, so we had no way to say "module A, please give me a struct named... nothing in the namespace 'foo'". Obviously this doesn't work, so we always try to make sure a typedef doesn't come from a module first, by asking us to get the typedef from the DWO file:
>
> type_sp = ParseTypeFromDWO(die, log);
>
> If this fails, it just means we have the typedef in hand. If I compile your example I end up with:
>
> 0x0000000b: TAG_compile_unit [1] *
> AT_producer( "Apple LLVM version 8.0.0 (clang-800.0.5.3)" )
> AT_language( DW_LANG_C99 )
> AT_name( "main.c" )
> AT_stmt_list( 0x00000000 )
> AT_comp_dir( "/tmp" )
> AT_low_pc( 0x0000000100000f60 )
> AT_high_pc( 0x0000000100000fb0 )
>
> 0x0000002e: TAG_subprogram [2] *
> AT_low_pc( 0x0000000100000f60 )
> AT_high_pc( 0x0000000100000f85 )
> AT_frame_base( rbp )
> AT_name( "myfunc" )
> AT_decl_file( "/private/tmp/main.c" )
> AT_decl_line( 6 )
> AT_prototyped( 0x01 )
> AT_external( 0x01 )
>
> 0x00000049: TAG_formal_parameter [3]
> AT_location( fbreg -8 )
> AT_name( "s" )
> AT_decl_file( "/private/tmp/main.c" )
> AT_decl_line( 6 )
> AT_type( {0x0000008c} ( my_untagged_struct* ) )
>
> 0x00000057: NULL
>
> 0x00000058: TAG_subprogram [4] *
> AT_low_pc( 0x0000000100000f90 )
> AT_high_pc( 0x0000000100000fb0 )
> AT_frame_base( rbp )
> AT_name( "main" )
> AT_decl_file( "/private/tmp/main.c" )
> AT_decl_line( 12 )
> AT_type( {0x00000085} ( int ) )
> AT_external( 0x01 )
>
> 0x00000076: TAG_variable [5]
> AT_location( fbreg -16 )
> AT_name( "s" )
> AT_decl_file( "/private/tmp/main.c" )
> AT_decl_line( 14 )
> AT_type( {0x00000091} ( my_untagged_struct ) )
>
> 0x00000084: NULL
>
> 0x00000085: TAG_base_type [6]
> AT_name( "int" )
> AT_encoding( DW_ATE_signed )
> AT_byte_size( 0x04 )
>
> 0x0000008c: TAG_pointer_type [7]
> AT_type( {0x00000091} ( my_untagged_struct ) )
>
> 0x00000091: TAG_typedef [8]
> AT_type( {0x0000009c} ( struct ) )
> AT_name( "my_untagged_struct" )
> AT_decl_file( "/private/tmp/main.c" )
> AT_decl_line( 4 )
>
> 0x0000009c: TAG_structure_type [9] *
> AT_byte_size( 0x08 )
> AT_decl_file( "/private/tmp/main.c" )
> AT_decl_line( 1 )
>
> 0x000000a0: TAG_member [10]
> AT_name( "i" )
> AT_type( {0x00000085} ( int ) )
> AT_decl_file( "/private/tmp/main.c" )
> AT_decl_line( 2 )
> AT_data_member_location( +0 )
>
> 0x000000ae: TAG_member [10]
> AT_name( "f" )
> AT_type( {0x000000bd} ( float ) )
> AT_decl_file( "/private/tmp/main.c" )
> AT_decl_line( 3 )
> AT_data_member_location( +4 )
>
> 0x000000bc: NULL
>
> 0x000000bd: TAG_base_type [6]
> AT_name( "float" )
> AT_encoding( DW_ATE_float )
> AT_byte_size( 0x04 )
>
> 0x000000c4: NULL
>
>
> Note that the typedef is at 0x00000091, and it is a typedef to 0x0000009c. Also note that the DWARF DIE at 0x0000009c is a complete definition as it has children describing its members and 0x0000009c doesn't have a DW_AT_declaration(1) attribute. Is this how your DWARF looks for your stuff? The DWARF you had looked like:
>
> 0x0000005c: DW_TAG_typedef [6]
> DW_AT_name( "my_untagged_struct" )
> DW_AT_decl_file("/home/luke/main.cpp")
> DW_AT_decl_line(4)
> DW_AT_type({0x0000002d})
>
>
> What did the type at 0x0000002d look like? Similar to 0x0000009c in my DWARF I presume?
In the case of C89/C99, yes, but regrettably when you compile my example
as C++ or use __attribute__((overloadable)), the DWARF does not include
the DW_AT_name for the typedef in the formal parameter[0] of myfunc:
COMPILE_UNIT<header overall offset = 0x00000000>:
< 0><0x0000000b> DW_TAG_compile_unit
DW_AT_producer "GNU C++ 4.8.4 -mtune=generic -march=x86-64 -g -fstack-protector"
DW_AT_language DW_LANG_C_plus_plus
DW_AT_name "main.cpp"
DW_AT_comp_dir "/tmp"
DW_AT_low_pc 0x004004ed
DW_AT_high_pc <offset-from-lowpc>60
DW_AT_stmt_list 0x00000000
LOCAL_SYMBOLS:
< 1><0x0000002d> DW_TAG_structure_type
DW_AT_byte_size 0x00000008
DW_AT_decl_file 0x00000001 /tmp/main.cpp
DW_AT_decl_line 0x00000001
DW_AT_linkage_name "18my_untagged_struct"
DW_AT_sibling <0x0000004e>
< 2><0x00000039> DW_TAG_member
DW_AT_name "i"
DW_AT_decl_file 0x00000001 /tmp/main.cpp
DW_AT_decl_line 0x00000002
DW_AT_type <0x0000004e>
DW_AT_data_member_location 0
< 2><0x00000043> DW_TAG_member
DW_AT_name "f"
DW_AT_decl_file 0x00000001 /tmp/main.cpp
DW_AT_decl_line 0x00000003
DW_AT_type <0x00000055>
DW_AT_data_member_location 4
< 1><0x0000004e> DW_TAG_base_type
DW_AT_byte_size 0x00000004
DW_AT_encoding DW_ATE_signed
DW_AT_name "int"
< 1><0x00000055> DW_TAG_base_type
DW_AT_byte_size 0x00000004
DW_AT_encoding DW_ATE_float
DW_AT_name "float"
< 1><0x0000005c> DW_TAG_typedef
DW_AT_name "my_untagged_struct"
DW_AT_decl_file 0x00000001 /tmp/main.cpp
DW_AT_decl_line 0x00000004
DW_AT_type <0x0000002d>
< 1><0x00000067> DW_TAG_subprogram
DW_AT_external yes(1)
DW_AT_name "myfunc"
DW_AT_decl_file 0x00000001 /tmp/main.cpp
DW_AT_decl_line 0x00000006
DW_AT_linkage_name "_Z6myfuncP18my_untagged_struct"
DW_AT_low_pc 0x004004ed
DW_AT_high_pc <offset-from-lowpc>33
DW_AT_frame_base len 0x0001: 9c: DW_OP_call_frame_cfa
DW_AT_GNU_all_call_sites yes(1)
DW_AT_sibling <0x00000095>
< 2><0x00000088> DW_TAG_formal_parameter
DW_AT_name "s"
DW_AT_decl_file 0x00000001 /tmp/main.cpp
DW_AT_decl_line 0x00000006
DW_AT_type <0x00000095>
DW_AT_location len 0x0002: 9168: DW_OP_fbreg -24
< 1><0x00000095> DW_TAG_pointer_type
DW_AT_byte_size 0x00000008
DW_AT_type <0x0000005c>
< 1><0x0000009b> DW_TAG_subprogram
DW_AT_external yes(1)
DW_AT_name "main"
DW_AT_decl_file 0x00000001 /tmp/main.cpp
DW_AT_decl_line 0x0000000c
DW_AT_type <0x0000004e>
DW_AT_low_pc 0x0040050e
DW_AT_high_pc <offset-from-lowpc>27
DW_AT_frame_base len 0x0001: 9c: DW_OP_call_frame_cfa
DW_AT_GNU_all_tail_call_sites yes(1)
< 2><0x000000b8> DW_TAG_lexical_block
DW_AT_low_pc 0x00400516
DW_AT_high_pc <offset-from-lowpc>17
< 3><0x000000c9> DW_TAG_variable
DW_AT_name "s"
DW_AT_decl_file 0x00000001 /tmp/main.cpp
DW_AT_decl_line 0x0000000e
DW_AT_type <0x0000005c>
DW_AT_location len 0x0002: 9160: DW_OP_fbreg -32
>
> The DWARFASTParserClang class is responsible for making up a clang type in the clang::ASTContext for this typedef. What will happen in the code where the flow falls through is that we will make a lldb_private::Type that says "I am a typedef to type whose user ID is 0x0000002d (in your example)". A NULL pointer should not be returned from the DWARFASTParserClang::ParseTypeFromDWARF() function. If it is, please step through and figure out why. I compiled your example and did the following:
>
>
> % lldb a.out
> (lldb) b main
> (lldb) r
> Process 89808 launched: '/private/tmp/a.out' (x86_64)
> Process 89808 stopped
> * thread #1: tid = 0xf7473, 0x0000000100000fa3 a.out main + 19, stop reason = breakpoint 1.1, queue = com.apple.main-thread
> frame #0: 0x0000000100000fa3 a.out main + 19 at main.c:15
> 12 int main()
> 13 {
> 14 my_untagged_struct s;
> -> 15 myfunc(&s);
> 16 return 0;
> 17 }
> (lldb) p myfunc(&s)
> (lldb)
>
> So I was able to call this function. Are you not able to call it?
I tried compiling with standard C99, and as you note, this works fine;
however, C++ fails:
$ lldb a.out -o 'b 15' -o 'process launch'
(lldb) target create "a.out"
Current executable set to 'a.out' (x86_64).
(lldb) b 15
Breakpoint 1: where = a.out`main + 8 at main.cpp:15, address = 0x0000000000400516
(lldb) process launch
Process 18718 stopped
* thread #1: tid = 18718, 0x0000000000400516 a.out`main + 8 at main.cpp:15, name = 'a.out', stop reason = breakpoint 1.1
frame #0: 0x0000000000400516 a.out`main + 8 at main.cpp:15
Process 18718 launched: '/tmp/a.out' (x86_64)
(lldb) expr myfunc(&s)
error: Couldn't lookup symbols:
myfunc($_0*)
(lldb)
>
> Likewise if I step into this function I can see the variable:
>
> (lldb) s
> (lldb) fr var s
> (my_untagged_struct *) s = 0x00007fff5fbff8d0
> (lldb) fr var *s
> (my_untagged_struct) *s = (i = 0, f = 3.1400001)
This does indeed seem to work
(lldb) s
Process 18769 stopped
* thread #1: tid = 18769, 0x00000000004004f5 a.out`myfunc(s=0x00007fffffffe2b0) + 8 at main.cpp:8, name = 'a.out', stop reason = step in
frame #0: 0x00000000004004f5 a.out`myfunc(s=0x00007fffffffe2b0) + 8 at main.cpp:8
(lldb) fr var s
(my_untagged_struct *) s = 0x00007fffffffe2b0
(lldb) fr var *s
(my_untagged_struct) *s = (i = -7264, f = 0.0000000000000000000000000000000000000000459163468)
(lldb)
>
> So to sum up: when we parse the DW_TAG_typedef in DWARFASTParserClang::ParseTypeFromDWARF(), we should return a valid TypeSP that contains a valid pointer. If that isn't happening, that is a bug. Feel free to send me the example binary and I can figure things out if you have any trouble. I wrote all of this code so I am quite familiar with it.
>
I've confirmed you're absolutely right about returning a non-null TypeSP
after fallthrough in DWARFASTParserClang::ParseTypeFromDWARF, but it
seems that with an empty name clang cannot resolve the type, so the
lookup of the mangled function fails because the type name is wrong
(_Z6myfuncP3$_0).
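For reference, the two symbol names differ only in the length-prefixed
type name that the Itanium C++ ABI embeds in the mangled symbol. Below is
a minimal sketch of that encoding for this one function shape (a free
function taking a pointer to a class type in the global namespace). The
helper names source_name and mangle_ptr_arg_function are made up for
illustration; they are not LLDB or clang APIs, and the sketch ignores
everything else the ABI covers (namespaces, substitutions, templates).

```cpp
#include <cassert>
#include <string>

// Itanium C++ ABI <source-name>: decimal length followed by the identifier.
// (Hypothetical helper, for illustration only.)
std::string source_name(const std::string &id) {
    assert(!id.empty());  // an empty identifier has no source-name encoding
    return std::to_string(id.size()) + id;
}

// Mangle `void fn(T *)` where fn and T both live in the global namespace.
// "_Z" introduces a mangled name and "P" means "pointer to". The return
// type of a non-template function is not part of the symbol.
std::string mangle_ptr_arg_function(const std::string &fn,
                                    const std::string &type_name) {
    return "_Z" + source_name(fn) + "P" + source_name(type_name);
}
```

Calling mangle_ptr_arg_function("myfunc", "my_untagged_struct") yields
the symbol in the binary, _Z6myfuncP18my_untagged_struct, whereas
passing the placeholder name "$_0" yields _Z6myfuncP3$_0, which is the
symbol the expression evaluator fails to find.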
A colleague took a look at this today, and as a quick sanity test, threw
together this hack:
--- a/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -553,6 +553,19 @@ DWARFASTParserClang::ParseTypeFromDWARF (const SymbolContext& sc,
}
}
+ {
+ uint32_t list_size = type_list->GetSize();
+ for (uint32_t i = 0; i < list_size; ++i)
+ {
+ TypeSP t = type_list->GetTypeAtIndex(i);
+ if (t->IsTypedef())
+ {
+ type_name_const_str = t->GetName();
+ type_name_cstr = t->GetName().AsCString();
+ }
+ }
+ }
It seems to fix our problem here and expression evaluation works again
for the presented case, but unfortunately, a few other tests break,
which is a little frustrating. If you have time to take another look at
why this might be the case, it'd be very much appreciated.
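One thing worth noting about the hack: as written, the loop takes the
name of every typedef in type_list in turn, so the last typedef wins
regardless of whether it refers to the struct being parsed. That may be
part of why other tests break, though I haven't verified it. A narrower
lookup would only accept a typedef whose DW_AT_type points at the
unnamed struct, which mirrors the C++ rule that such a struct takes the
typedef name for linkage purposes. The sketch below shows that idea on a
simplified, hypothetical DIE model; Die, Tag and typedef_name_for are
illustrative names only and are not LLDB types or functions.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a DWARF DIE (not an LLDB type).
enum class Tag { Typedef, StructureType, Other };

struct Die {
    std::uint64_t offset;    // DIE offset within the compile unit
    Tag tag;                 // DW_TAG_* kind, reduced to what we need
    std::string name;        // empty when DW_AT_name is absent
    std::uint64_t type_ref;  // DW_AT_type target offset, 0 if none
};

// Return the name of the first typedef whose DW_AT_type refers to the
// struct at struct_offset, or nullopt if no such typedef exists.
std::optional<std::string> typedef_name_for(const std::vector<Die> &dies,
                                            std::uint64_t struct_offset) {
    assert(struct_offset != 0);  // 0 means "no type" in this model
    for (const Die &d : dies) {
        if (d.tag == Tag::Typedef && d.type_ref == struct_offset &&
            !d.name.empty())
            return d.name;
    }
    return std::nullopt;
}
```

With the DIEs from the dump above (struct at 0x2d, typedef at 0x5c with
DW_AT_type 0x2d), looking up 0x2d returns "my_untagged_struct", and
unrelated typedefs elsewhere in the unit are ignored.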
I've attached an example Mac binary of this issue in action, built with
an older Apple clang++ (it's simply the test above), but the result is
the same for me on Linux with upstream clang++ and g++ 5.3, so I don't
think the age of the compiler is a problem here.
Thanks again
Luke
> Greg Clayton
>
>
>> On Mar 9, 2016, at 3:54 PM, luke Drummond via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>>
>> Hi All
>>
>> I'm hoping that someone might be able to give me some direction
>> regarding `Type` resolution from DWARF information for functions taking
>> anonymous structs hidden behind a typedef
>>
>> e.g.
>>
>> ```
>> typedef struct {
>> int i;
>> float f;
>> } my_untagged_struct;
>>
>> void __attribute__((noinline)) myfunc(my_untagged_struct *s)
>> {
>> s->i = 0;
>> s->f = 3.14f;
>> }
>>
>> int main()
>> {
>> my_untagged_struct s;
>> myfunc(&s);
>> return 0;
>> }
>>
>> ```
>>
>> I [recently reported a
>> bug](https://llvm.org/bugs/show_bug.cgi?id=26790) relating to the
>> clang expression evaluator no longer being able to resolve calls to
>> functions with arguments to typedefed anonymous structs, after a cleanup
>> to the expression parsing code.
>> I was perfectly wrong in my assumptions about the cause of the bug, and
>> after some more digging, I think I've tracked it down to a section of
>> code in `DWARFASTParserClang::ParseTypeFromDWARF`.
>>
>>
>> (DWARFASTParserClang::ParseTypeFromDWARF:254)
>> ```
>> switch (tag)
>> {
>> case DW_TAG_typedef:
>> // Try to parse a typedef from the DWO file first as modules
>> // can contain typedef'ed structures that have no names like:
>> //
>> // typedef struct { int a; } Foo;
>> //
>> // In this case we will have a structure with no name and a
>> // typedef named "Foo" that points to this unnamed structure.
>> // The name in the typedef is the only identifier for the
>> // struct, so always try to get typedefs from DWO files if possible.
>> //
>> // The type_sp returned will be empty if the typedef doesn't exist
>> // in a DWO file, so it is cheap to call this function just to check.
>> //
>> // If we don't do this we end up creating a TypeSP that says this
>> // is a typedef to type 0x123 (the DW_AT_type value would be 0x123
>> // in the DW_TAG_typedef), and this is the unnamed structure type.
>> // We will have a hard time tracking down an unnamed structure
>> // type in the module DWO file, so we make sure we don't get into
>> // this situation by always resolving typedefs from the DWO file.
>> type_sp = ParseTypeFromDWO(die, log);
>> if (type_sp)
>> return type_sp;
>> LLVM_FALLTHROUGH
>> ```
>>
>> In my case, the type information for the typedef is included within the
>> main executable's DWARF rather than an external .dwo file (snippet from
>> the DWARF included at the end of this message), and therefore the `case`
>> for `DW_TAG_typedef` falls through as `ParseTypeFromDWO` returns a NULL
>> value.
>>
>>
>> As this is code I'm not familiar with, I'd appreciate it if anyone on the
>> list was able to give some guidance as to the best way to resolve this
>> issue, so that `ClangExpressionDeclMap::FindExternalVisibleDecls` can
>> correctly resolve calls to functions taking typedef names to anonymous
>> structs. I'm happy to take a whack at implementing this feature, but
>> I'm a bit stuck as to how to resolve this type given the current DIE
>> object.
>>
>> Any help or guidance on where to start with this would be really
>> helpful.
>>
>> All the best
>>
>> Luke
>>
>>
>>
>>
>> --------
>> This is a snippet from the output of llvm-dwarfdump on the above code
>> example.
>>
>> `g++ -g main.cpp && llvm-dwarfdump a.out | grep DW_TAG_typedef -A 35`
>> --------
>>
>> 0x0000005c: DW_TAG_typedef [6]
>> DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000069] = "my_untagged_struct")
>> DW_AT_decl_file [DW_FORM_data1] ("/home/luke/main.cpp")
>> DW_AT_decl_line [DW_FORM_data1] (4)
>> DW_AT_type [DW_FORM_ref4] (cu + 0x002d => {0x0000002d})
>>
>> 0x00000067: DW_TAG_subprogram [7] *
>> DW_AT_external [DW_FORM_flag_present] (true)
>> DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000006] = "myfunc")
>> DW_AT_decl_file [DW_FORM_data1] ("/home/luke/main.cpp")
>> DW_AT_decl_line [DW_FORM_data1] (6)
>> DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x0000005d] = "_Z6myfuncP18my_untagged_struct")
>> DW_AT_low_pc [DW_FORM_addr] (0x0000000000400566)
>> DW_AT_high_pc [DW_FORM_data8] (0x0000000000000026)
>> DW_AT_frame_base [DW_FORM_exprloc] (<0x1> 9c )
>> DW_AT_Unknown_2117 [DW_FORM_flag_present] (true)
>> DW_AT_sibling [DW_FORM_ref4] (cu + 0x0095 => {0x00000095})
>>
>> 0x00000088: DW_TAG_formal_parameter [8]
>> DW_AT_name [DW_FORM_string] ("s")
>> DW_AT_decl_file [DW_FORM_data1] ("/home/luke/main.cpp")
>> DW_AT_decl_line [DW_FORM_data1] (6)
>> DW_AT_type [DW_FORM_ref4] (cu + 0x0095 => {0x00000095})
>> DW_AT_location [DW_FORM_exprloc] (<0x2> 91 68 )
>>
>> 0x00000094: NULL
>>
>> 0x00000095: DW_TAG_pointer_type [9]
>> DW_AT_byte_size [DW_FORM_data1] (0x08)
>> DW_AT_type [DW_FORM_ref4] (cu + 0x005c => {0x0000005c})
>>
>> _______________________________________________
>> lldb-dev mailing list
>> lldb-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mac-expr-anon-struct-example.tar.gz
Type: application/gzip
Size: 20992 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20160310/7e4ec715/attachment-0001.bin>