[PATCH] D150713: [llvm-debuginfo-analyzer] Support both Reference and Type attrs in single DIE
Scott Linder via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 17 11:03:36 PDT 2023
scott.linder added a comment.
(Cross-posting from the other thread, to keep everything in one place)
In D147270#4348913 <https://reviews.llvm.org/D147270#4348913>, @CarlosAlbertoEnciso wrote:
> In D147270#4346528 <https://reviews.llvm.org/D147270#4346528>, @scott.linder wrote:
>
>> I did a little digging and it seems like there is a faulty assumption in `LVELFReader`:
>>
>> // We are assuming that DW_AT_specification, DW_AT_abstract_origin,
>> // DW_AT_type and DW_AT_extension do not appear at the same time
>> // in the same DIE.
>
> Thanks very much for your analysis and for creating a small reproducible test.
>
>> This seems true for GCC, but not for Clang, at least for the simplest reproducer I could create:
>>
>> $ cat a.cpp
>> struct S {
>> static const int Arr[];
>> };
>> const int S::Arr[] = {
>> 0, 1, 2
>> };
>> $ gcc -g -c a.cpp -o a.o
>> $ build/bin/llvm-dwarfdump --debug-info a.o | grep -B1 -A2 DW_AT_specification
>> 0x00000053: DW_TAG_variable
>> DW_AT_specification (0x00000028 "Arr")
>> DW_AT_decl_line (4)
>> DW_AT_decl_column (0x0b)
>> $ clang++ -g -c a.cpp -o a.o
>> $ build/bin/llvm-dwarfdump --debug-info a.o | grep -B1 -A2 DW_AT_specification
>> 0x0000001e: DW_TAG_variable
>> DW_AT_specification (0x0000003e "Arr")
>> DW_AT_type (0x00000068 "const int[3]")
>> DW_AT_location (DW_OP_addr 0x0)
>>
>> It seems like the code could be adapted to track these independently, rather than assume only one is present? I can prepare a patch if that sounds reasonable to you!
>
> I am happy with your proposal and a patch.
>
>> Edit: This may also be a Clang bug. I haven't read the relevant DWARF sections with enough detail to say whether the DWARF itself is valid or not, but either way it seems nice to support it in llvm-debuginfo-analyzer, especially considering how much flexibility the tool already affords in terms of input
>
> I compiled your test case with the latest Clang:
>
> 0x0000001e: DW_TAG_variable
> DW_AT_specification (0x00000031 "Arr")
> DW_AT_type (0x00000052 "const int[3]")
> DW_AT_location (DW_OP_addrx 0x0)
> DW_AT_linkage_name ("_ZN1S3ArrE")
>
> 0x00000031: DW_TAG_member
> DW_AT_name ("Arr")
> DW_AT_type (0x0000003a "const int[]")
> DW_AT_decl_file ("/data/projects/sandbox/pr-62716.cpp")
> DW_AT_decl_line (2)
> DW_AT_external (true)
> DW_AT_declaration (true)
>
> From the DWARF section for DW_AT_specification:
>
> A debugging information entry with a DW_AT_specification attribute does not need to duplicate information provided by the debugging information entry referenced by that specification attribute.
>
> Clang:
>
> - Added an extra entry for DW_AT_type which seems redundant in DIE (0x1e) as by following the DW_AT_specification we can get the type
> - The referenced types are pointing to different DIEs (0x52 and 0x3a).
I think in this case the duplication may be meaningful, or at least legal and intelligible, as the types at the declaration vs the definition are actually distinct, see https://en.cppreference.com/w/cpp/language/array#Arrays_of_unknown_bound:
extern int x[]; // the type of x is "array of unknown bound of int"
int a[] = {1, 2, 3}; // the type of a is "array of 3 int"
Or from the C++ spec directly, i.e. from 6.8.1 [basic.types.general] in https://github.com/cplusplus/draft/releases/tag/n4917 (emphasis mine):
> (6) A class type (such as “class X”) can be incomplete at one point in a translation unit and complete later on;
> the type “class X” is the same type at both points. The declared type of an array object can be an array of
> incomplete class type and therefore incomplete; if the class type is completed later on in the translation unit,
> the array type becomes complete; the array type at those two points is the same type. **The declared type of
> an array object can be an array of unknown bound and therefore be incomplete at one point in a translation
> unit and complete later on; the array types at those two points (“array of unknown bound of T” and “array
> of N T”) are different types.** The type of a pointer to array of unknown bound, or of a type defined by a
> typedef declaration to be an array of unknown bound, cannot be completed.
I also think so long as the DWARF spec does not prohibit it, it makes sense for tooling to support it. Admittedly there is a lot of wiggle-room and ambiguity in much of the DWARF spec, but in this case the phrasing "does not need to duplicate" seems to heavily imply that the duplication //is legal//, just not necessary when the attributes match.
================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVELFReader.cpp:1131
+ bool IsType) {
+ auto Iter = ElementTable.try_emplace(Offset).first;
+ // Update the element and all the references pointing to this element.
----------------
CarlosAlbertoEnciso wrote:
> In the case of a new `offset` the original code created:
>
> ```
> [Offset, LVElementEntry(nullptr, {Element}) ]
> ```
> The new code:
>
> ```
> [Offset, LVElementEntry(nullptr, References{}, Types{}) ]
> ```
> The `Element` parameter is not recorded.
It is handled below now, as the change also removes the `if/else`: we use `try_emplace` to find the entry at `Offset`, creating an empty one if none exists. Then, we always add `Element` to the appropriate set.
So the previously explicitly special-cased `Iter == ElementTable.end()` case is all contained in this first line of the function.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D150713/new/
https://reviews.llvm.org/D150713
More information about the llvm-commits
mailing list