[PATCH] D125784: [llvm-dva] 09 - CodeView Reader

Pavel Samolysov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 10 00:43:30 PDT 2022


psamolysov added a comment.

Good job! Thank you for the patch series and the usable tool!

There are numerous patterns such as `Map.insert(std::make_pair(Key,ValuePair(Element1, Element2))`. I prefer using `emplace` and `std::picewise_construct` (example: https://reviews.llvm.org/D125783#inline-1215434) to eliminate unnecessary copying even though not so much information is actually copied in this case (the key and two elements, usually pointers, this is about 24 bytes but it can be more if an `Element` in the value is a struct.)

All the other comments are inlined.



================
Comment at: llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h:171
+  void addInlineeLines(LVScope *Scope, LVLines &Lines) {
+    LVLines *InlineeLines = new LVLines();
+    *InlineeLines = std::move(Lines);
----------------
An `InlineeLines` is a pointer to an in-heap allocated object. I've found that some lines are deleted in the `LVBinaryReader::includeInlineeLines` function but I'm not sure that all the allocated ones. Although, what if the function is never called after a call to the `addInlineeLines` one? Could you check that all the in-heap allocated and not deleted during the class' instance lifetime objects are deleted in the destructor?


================
Comment at: llvm/include/llvm/DebugInfo/LogicalView/Readers/LVCodeViewReader.h:154
+
+  Error collectInlineeInfo(DebugInlineeLinesSubsectionRef &Lines,
+                           const llvm::pdb::SymbolGroup *SG = nullptr);
----------------
I'm not sure, but the `DebugInlineeLinesSubsectionRef` type has the suffix `Ref`. Does it mean the same intent that the `StringRef` class, for example, has: lightweight objects of the class should be passed by value? 


================
Comment at: llvm/include/llvm/DebugInfo/LogicalView/Readers/LVCodeViewReader.h:184
+  Error createLines(const FixedStreamArray<LineNumberEntry> &LineNumbers,
+                    LVAddress addendum, uint32_t Segment, uint32_t Begin,
+                    uint32_t Size, uint32_t NameIndex,
----------------



================
Comment at: llvm/include/llvm/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.h:113
+  void printBinaryBlockWithRelocs(StringRef Label, ArrayRef<uint8_t> Block) {
+    StringRef SBlock(reinterpret_cast<const char *>(Block.data()),
+                     Block.size());
----------------
The code creates a temporary `StringRef` and does nothing more. Because this is the latest patch in the series, this code should be the final variant. What is this `StringRef` required for?


================
Comment at: llvm/include/llvm/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.h:244
+// Visitor for CodeView types and symbols to populate elements.
+class LVLogicalVisitor {
+  LVCodeViewReader *Reader = nullptr;
----------------
The class should be marked as `final` because there is no classes derived from. Or the destructor should be marked as `virtual` if some derived classes might be defined later.


================
Comment at: llvm/include/llvm/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.h:300
+
+  // Input source in the case of type server o precompiled header.
+  void setInput(std::shared_ptr<llvm::pdb::InputFile> TypeServer) {
----------------



================
Comment at: llvm/include/llvm/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.h:350
+  std::string getCompileUnitName() { return CompileUnitName; }
+  void setCompileUnitName(std::string Name) { CompileUnitName = Name; }
+
----------------



================
Comment at: llvm/lib/DebugInfo/CodeView/TypeRecordHelpers.cpp:153
+    return getUdtSize<ClassRecord>(std::move(CVT));
+    break;
+  case LF_UNION:
----------------
The `break` after `return` makes no sense.


================
Comment at: llvm/lib/DebugInfo/CodeView/TypeRecordHelpers.cpp:156
+    return getUdtSize<UnionRecord>(std::move(CVT));
+    break;
+  default:
----------------
The `break` after `return` makes no sense.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Core/LVElement.cpp:111
+    StringRef InnerComponent;
+    std::tie(OuterComponent, InnerComponent) = getInnerComponent(Name);
+    setName(InnerComponent);
----------------
`OuterComponent` has no usages here. The `std::ignore` placeholder can be used instead: https://en.cppreference.com/w/cpp/utility/tuple/ignore


================
Comment at: llvm/lib/DebugInfo/LogicalView/Core/LVSupport.cpp:61
+
+LexicalIndexes getAllLexicalIndices(StringRef Name) {
+  if (Name.empty())
----------------
Just a nit about the naming. The type has the name `...Indexes` while the function has the name `...Indices`. I believe the same word should be used for the plural form of `Index`.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Core/LVSupport.cpp:61
+
+LexicalIndexes getAllLexicalIndices(StringRef Name) {
+  if (Name.empty())
----------------
psamolysov wrote:
> Just a nit about the naming. The type has the name `...Indexes` while the function has the name `...Indices`. I believe the same word should be used for the plural form of `Index`.
The function is used in the `LVSupport.cpp` compilation unit, so it can be marked as `static`.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Core/LVSupport.cpp:112-117
+  LLVM_DEBUG({
+    LexicalEntry Entry = Indexes.back();
+    llvm::dbgs() << formatv(
+        "'{0}:{1}', '{2}'\n", Entry.first, Entry.second,
+        Name.substr(Entry.first, Entry.second - Entry.first + 1));
+  });
----------------
Just a nit, to eliminate the code duplication, a lambda with the body of these lines could be extracted and reused here and on line 96.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Core/LVSupport.cpp:154
+
+std::string llvm::logicalview::getScopedName(LVStringRefs Components,
+                                             StringRef BaseName) {
----------------
`LVStringRefs` is defined as `std::vector<StringRef>`, this object may be heavy and it could make sense to pass it by reference.


================
Comment at: llvm/lib/DebugInfo/LogicalView/LVReaderHandler.cpp:157
+    std::vector<StringRef> ExecutableExtensions = {"exe", "dll"};
+    for (const StringRef &Extension : ExecutableExtensions) {
+      std::string ExecutableImage = searchForExe(Filename, Extension);
----------------
It makes no sense to pass `StringRef` by reference.


================
Comment at: llvm/lib/DebugInfo/LogicalView/LVReaderHandler.cpp:170
+    std::vector<StringRef> ObjectExtensions = {"o", "obj", "lib"};
+    for (const StringRef &Extension : ObjectExtensions) {
+      std::string ObjectImage = searchForObj(Filename, Extension);
----------------



================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp:228
+    dbgs() << "\nSections Information:\n";
+    for (LVSections::value_type &Entry : Sections) {
+      LVSectionIndex SectionIndex = Entry.first;
----------------



================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp:237
+             << format_decimal(SectionIndex, 3)
+             //             << " Comdat: " << Section.isComdat()
+             << " Name: " << *SectionNameOrErr << "\n"
----------------
It could be better to remove the commented code from the patch.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp:248
+    dbgs() << "\nObject Section Information:\n";
+    for (const LVSectionAddresses::value_type &Entry : SectionAddresses)
+      dbgs() << "[" << hexValue(Entry.first) << ":"
----------------



================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp:228
+                                          StringRef &Name) {
+  assert(SectionContents.data() < RelocPtr &&
+         RelocPtr < SectionContents.data() + SectionContents.size() &&
----------------
What if `RelocPtr` is exactly equal to `SectionContents.data()`?


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp:242
+  Name = Name.empty() ? Element->getName() : Name;
+  auto find = [&](const char *String) -> bool {
+    return StringRef::npos != Name.find(String);
----------------
`StringRef` could be caught by value.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp:245
+  };
+  auto starts = [&](const char *Pattern) -> bool {
+    return Name.startswith(Pattern);
----------------
`StringRef` could be caught by value.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp:310
+Error LVCodeViewReader::createLines(
+    const FixedStreamArray<LineNumberEntry> &LineNumbers, LVAddress addendum,
+    uint32_t Segment, uint32_t Begin, uint32_t Size, uint32_t NameIndex,
----------------



================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp:493
+  // ".debug$P" section.
+  COFFObjectFile &Obj = *dyn_cast<COFFObjectFile>(&BinaryObj);
+  for (const SectionRef &Section : Obj.sections()) {
----------------
If we are sure that `BinaryObj` is an instance of the `COFFObjectFile` here, could we just use `cast<>` instead of `dyn_cast<>`?


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp:777
+  // know all the required information.
+  for (StringRef &SymbolName : SymbolNames) {
+    LLVM_DEBUG({
----------------
`StringRef` is already a "reference".


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp:790
+    // Find the associate symbol table information.
+    LVSymbolTableEntry SymbolTableEntry = getSymbolTableEntry(SymbolName);
+    LVScope *Function = SymbolTableEntry.Scope;
----------------
As I get, `LVSymbolTableEntry`'s cost is about 18-24 bytes, so it makes sense to not copy the entire entry but has a constant reference to it. A constant reference makes the lifecycle of the entry as long as the current scope, so no problem should be here.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp:840
+
+  const COFFObjectFile *Object = dyn_cast<COFFObjectFile>(&Obj);
+
----------------
I believe a `cast<>` is enough if you are sure the `Obj` is an instance of the `COFFObjectFile` here.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewReader.cpp:1127
+    for (const LineColumnEntry &Block : Lines)
+      if (Error Err = createLines(Block.LineNumbers, /*addendum=*/0, Segment,
+                                  Begin, Size, Block.NameIndex, &SG))
----------------
If the corresponding change for the `createLines` function's signature is applied.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:118
+namespace logicalview {
+
+// Keeps the type indexes with line information.
----------------
The definitions below are used in the `LVCodeViewVisitor.cpp` file only. What if to put them in the anonymous namespace?


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:221
+
+  using LookupSet = std::set<StringRef>;
+  LookupSet DeductedScopes;
----------------
There is no based on comparison operations with `DeducedScopes`, `UnresolvedScopes` and `IdentifiedNamespaces`, the sets are used for searching only. It looks like the `std::unordered_set` with O(1) inserting/searching is enough (hm, maybe except the `UnresolvedScopes` variable because there is an iteration over the set and the order might be important).


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:287
+    if (Strings.find(TI) != Strings.end())
+      String = std::get<1>(Strings[TI]);
+    return String;
----------------
Here we are looking up in the map again. `Strings.find(TI)` has already returned an iterator, could the iterator be reused?


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:294
+    if (Strings.find(TI) != Strings.end())
+      Index = std::get<0>(Strings[TI]);
+    return Index;
----------------
Here we are looking up in the map again. `Strings.find(TI)` has already returned an iterator, could the iterator be reused?


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:352
+  if (Target.find(TI) != Target.end()) {
+    Element = Target[TI].second;
+    if (Element)
----------------
Here and few times below we are looking up in the map again and again. `Target.find(TI)` has already returned an iterator, could the iterator be reused?


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:373
+void LVStringRecords::addFilenames() {
+  for (const StringIds::value_type &Entry : Strings) {
+    StringRef Name = std::get<1>(Entry.second);
----------------



================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:382
+void LVStringRecords::addFilenames(LVScopeCompileUnit *Scope) {
+  for (StringIds::value_type &Entry : Strings)
+    if (!std::get<2>(Entry.second))
----------------



================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:1578-1580
+      StringRef InnerComponent;
+      StringRef OuterComponent;
+      std::tie(OuterComponent, InnerComponent) = getInnerComponent(Proc.Name);
----------------



================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:1599
+      // We can have a LF_FUNC_ID, LF_PROCEDURE or LF_MFUNCTION.
+      CVFunctionType = Ids.tryGetType(TIFunctionType);
+      if (!getRecordType()) {
----------------
This line is not needed, `CVFunctionType` will be set in exactly this value in the `getRecordType` lambda which is called on the next line.


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:2250
+  // At this point the types recording the qualifiers do not have a
+  // scope parent. They must assigned to the current compile unit.
+  LVScopeCompileUnit *CompileUnit = Reader->getCompileUnit();
----------------



================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:2329
+  // At this point the types recording the qualifiers do not have a
+  // scope parent. They must assigned to the current compile unit.
+  LVScopeCompileUnit *CompileUnit = Reader->getCompileUnit();
----------------



================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:2332
+
+  // Orden for the different modifiers:
+  // <restrict> <pointer, Reference, ValueReference> <const, volatile>
----------------
Order?


================
Comment at: llvm/lib/DebugInfo/LogicalView/Readers/LVCodeViewVisitor.cpp:2703-2705
+        StringRef InnerComponent;
+        StringRef OuterComponent;
+        std::tie(OuterComponent, InnerComponent) =
----------------



================
Comment at: llvm/test/tools/llvm-dva/COFF/01-coff-compare-logical-elements.test:41
+; ONE-NEXT:  [004]                   {Variable} 'CONSTANT' -> 'const int'
+; ONE-NEXT: +[004]                   {TypeAlias} 'INTEGER' -> 'int'
+
----------------
Just a question:  does `+` mean the element was added in the Target and omitted in the Reference?


================
Comment at: llvm/unittests/DebugInfo/LogicalView/CodeViewReaderTest.cpp:74
+  EXPECT_EQ(CompileUnit->getBaseAddress(), 0);
+  EXPECT_EQ(CompileUnit->getProducer().startswith("clang"), 1);
+  EXPECT_STREQ(CompileUnit->getName().data(), "test.cpp");
----------------



================
Comment at: llvm/unittests/DebugInfo/LogicalView/CodeViewReaderTest.cpp:140
+  EXPECT_EQ(CompileUnit->getBaseAddress(), 0);
+  EXPECT_EQ(CompileUnit->getProducer().startswith("Microsoft"), 1);
+  EXPECT_STREQ(CompileUnit->getName().data(), "test.cpp");
----------------



================
Comment at: llvm/unittests/DebugInfo/LogicalView/CodeViewReaderTest.cpp:206
+  EXPECT_EQ(CompileUnit->getBaseAddress(), 0);
+  EXPECT_EQ(CompileUnit->getProducer().startswith("Microsoft"), 1);
+  EXPECT_STREQ(CompileUnit->getName().data(), "test.cpp");
----------------



================
Comment at: llvm/unittests/DebugInfo/LogicalView/CodeViewReaderTest.cpp:411
+      {"movl", &LVElement::getIsLine},
+      {"movl", &LVElement::getIsLine}};
+  LVReader *Reader = createReader(ReaderHandler, InputsDir, CodeViewClang);
----------------
Line duplication?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125784/new/

https://reviews.llvm.org/D125784



More information about the llvm-commits mailing list