[PATCH] D150713: [llvm-debuginfo-analyzer] Support both Reference and Type attrs in single DIE
Carlos Alberto Enciso via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu May 18 01:05:09 PDT 2023
CarlosAlbertoEnciso added a comment.
In D150713#4350582 <https://reviews.llvm.org/D150713#4350582>, @scott.linder wrote:
> (Cross-posting from the other thread, to keep everything in one place)
>
> In D147270#4348913 <https://reviews.llvm.org/D147270#4348913>, @CarlosAlbertoEnciso wrote:
>
>> In D147270#4346528 <https://reviews.llvm.org/D147270#4346528>, @scott.linder wrote:
>>
>>> I did a little digging and it seems like there is a faulty assumption in `LVELFReader`:
>>>
>>> // We are assuming that DW_AT_specification, DW_AT_abstract_origin,
>>> // DW_AT_type and DW_AT_extension do not appear at the same time
>>> // in the same DIE.
>>
>> Thanks very much for your analysis and for creating a small reproducible test.
>>
>>> This seems true for GCC, but not for Clang, at least for the simplest reproducer I could create:
>>>
>>> $ cat a.cpp
>>> struct S {
>>> static const int Arr[];
>>> };
>>> const int S::Arr[] = {
>>> 0, 1, 2
>>> };
>>> $ gcc -g -c a.cpp -o a.o
>>> $ build/bin/llvm-dwarfdump --debug-info a.o | grep -B1 -A2 DW_AT_specification
>>> 0x00000053: DW_TAG_variable
>>> DW_AT_specification (0x00000028 "Arr")
>>> DW_AT_decl_line (4)
>>> DW_AT_decl_column (0x0b)
>>> $ clang++ -g -c a.cpp -o a.o
>>> $ build/bin/llvm-dwarfdump --debug-info a.o | grep -B1 -A2 DW_AT_specification
>>> 0x0000001e: DW_TAG_variable
>>> DW_AT_specification (0x0000003e "Arr")
>>> DW_AT_type (0x00000068 "const int[3]")
>>> DW_AT_location (DW_OP_addr 0x0)
>>>
>>> It seems like the code could be adapted to track these independently, rather than assume only one is present? I can prepare a patch if that sounds reasonable to you!
>>
>> I am happy with your proposal and a patch.
>>
>>> Edit: This may also be a Clang bug. I haven't read the relevant DWARF sections with enough detail to say whether the DWARF itself is valid or not, but either way it seems nice to support it in llvm-debuginfo-analyzer, especially considering how much flexibility the tool already affords in terms of input
>>
>> I compiled your test case with the latest Clang:
>>
>> 0x0000001e: DW_TAG_variable
>> DW_AT_specification (0x00000031 "Arr")
>> DW_AT_type (0x00000052 "const int[3]")
>> DW_AT_location (DW_OP_addrx 0x0)
>> DW_AT_linkage_name ("_ZN1S3ArrE")
>>
>> 0x00000031: DW_TAG_member
>> DW_AT_name ("Arr")
>> DW_AT_type (0x0000003a "const int[]")
>> DW_AT_decl_file ("/data/projects/sandbox/pr-62716.cpp")
>> DW_AT_decl_line (2)
>> DW_AT_external (true)
>> DW_AT_declaration (true)
>>
>> From the DWARF section for DW_AT_specification:
>>
>> A debugging information entry with a DW_AT_specification attribute does not need to duplicate information provided by the debugging information entry referenced by that specification attribute.
>>
>> Clang:
>>
>> - Added an extra entry for DW_AT_type which seems redundant in DIE (0x1e) as by following the DW_AT_specification we can get the type
>> - The referenced types are pointing to different DIEs (0x52 and 0x3a).
>
> I think in this case the duplication may be meaningful, or at least legal and intelligible, as the types at the declaration vs the definition are actually distinct, see https://en.cppreference.com/w/cpp/language/array#Arrays_of_unknown_bound:
>
> extern int x[]; // the type of x is "array of unknown bound of int"
> int a[] = {1, 2, 3}; // the type of a is "array of 3 int"
>
> Or from the C++ spec directly, i.e. from 6.8.1 [basic.types.general] in https://github.com/cplusplus/draft/releases/tag/n4917 (emphasis mine):
>
>> (6) A class type (such as “class X”) can be incomplete at one point in a translation unit and complete later on;
>> the type “class X” is the same type at both points. The declared type of an array object can be an array of
>> incomplete class type and therefore incomplete; if the class type is completed later on in the translation unit,
>> the array type becomes complete; the array type at those two points is the same type. **The declared type of
>> an array object can be an array of unknown bound and therefore be incomplete at one point in a translation
>> unit and complete later on; the array types at those two points (“array of unknown bound of T” and “array
>> of N T”) are different types.** The type of a pointer to array of unknown bound, or of a type defined by a
>> typedef declaration to be an array of unknown bound, cannot be completed.
>
> I also think so long as the DWARF spec does not prohibit it, it makes sense for tooling to support it. Admittedly there is a lot of wiggle-room and ambiguity in much of the DWARF spec, but in this case the phrasing "does not need to duplicate" seems to heavily imply that the duplication //is legal//, just not necessary when the attributes match.
Thanks very much for the great links and explanations. The generation of both types now makes sense.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D150713/new/
https://reviews.llvm.org/D150713
More information about the llvm-commits
mailing list