[Lldb-commits] [PATCH] D134849: [LLDB][NativePDB] Fix struct layout when it has anonymous unions.

Pavel Labath via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Thu Oct 6 07:30:19 PDT 2022

labath added inline comments.

Comment at: lldb/source/Plugins/SymbolFile/NativePDB/UdtRecordCompleter.cpp:331
+                             clang::DeclContext *parent_decl_ctx) {
+  static lldb::user_id_t anonymous_id = LLDB_INVALID_UID - 1;
+  clang::FieldDecl *field_decl = nullptr;
Now multiple symbol files can race when accessing this variable (and this also introduces a strange interaction between two supposedly-independent symbol files). Better make this member variable of the parent symbol file, or something like that.

Comment at: lldb/source/Plugins/SymbolFile/NativePDB/UdtRecordCompleter.cpp:392-443
+  // For anonymous unions in a struct, msvc generated pdb doesn't have the
+  // entity for that union. So, we need to construct anonymous union and struct
+  // based on field offsets. The final AST is likely not matching the exact
+  // original AST, but the memory layout is preseved.
+  // The end offset to a field/struct/union that ends at the offset.
+  std::map<uint64_t, Field*> end_offset_map;
This could use a high-level explanation of the algorithm. Like, I know it tries to stuff the fields into structs and unions, but I wasn't able to figure out how it does that, and what are the operating invariants.

The thing I like about this algorithm, is that the most complicated part (the thing I highlighted) is basically just playing with numbers and it is independent of any complex objects and state. If this part is separated out into a separate function, then it would be perfectly suitable for unit testing, and the unit tests could also serve as documentation/examples of the kinds of transformations that the algorithm does.

Comment at: lldb/source/Plugins/SymbolFile/NativePDB/UdtRecordCompleter.h:54
+  struct Field {
+    enum Kind { FIELD, STRUCT, UNION } kind;
+    // Following are only used for field.
according to <https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly>, these should be called `Field, Struct, Union`

Comment at: lldb/source/Plugins/SymbolFile/NativePDB/UdtRecordCompleter.h:64
+    uint64_t base_offset;
+    llvm::SmallVector<FieldSP, 1> fields;
Can we drop the `SP` part? I think that the owning references (I guess that's this field) could be just plain Field instances (unique_ptr<Field> at worst), and the rest could just be plain pointers and references.

Comment at: lldb/source/Plugins/SymbolFile/NativePDB/UdtRecordCompleter.h:92
+  // llvm::SmallVector<llvm::codeview::DataMemberRecord, 1> m_fields;
+  uint64_t start_offset = UINT64_MAX;
+  bool m_is_struct;
m_start_offset ?

Comment at: lldb/test/Shell/SymbolFile/NativePDB/class_layout.cpp:53-59
+// CHECK-NEXT: | |-DefinitionData pass_in_registers aggregate standard_layout trivially_copyable pod trivial literal
+// CHECK-NEXT: | | |-DefaultConstructor exists trivial needs_implicit
+// CHECK-NEXT: | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
+// CHECK-NEXT: | | |-MoveConstructor exists simple trivial needs_implicit
+// CHECK-NEXT: | | |-CopyAssignment simple trivial has_const_param needs_implicit implicit_has_const_param
+// CHECK-NEXT: | | |-MoveAssignment exists simple trivial needs_implicit
+// CHECK-NEXT: | | `-Destructor simple irrelevant trivial needs_implicit
I think we should exclude all of these DefinitionData fields from the test expectation, as they are largely irrelevant to the test (and they also obfuscate it very well). Otherwise, people will have to update these whenever a new field gets added.

I don't know whether it contains the thing you wanted to test (as I don't know what that is), but the `type lookup C` output will contain the general structure of the type, and it will be a lot more readable.

  rG LLVM Github Monorepo



More information about the lldb-commits mailing list