[llvm] [DWARFYAML] Implement debug_names support (PR #79666)

Felipe de Azevedo Piovezan via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 8 11:32:16 PST 2024


================
@@ -691,6 +691,191 @@ Error DWARFYAML::emitDebugStrOffsets(raw_ostream &OS, const Data &DI) {
   return Error::success();
 }
 
+namespace {
+/// Emits the header for a DebugNames section.
+void emitDebugNamesHeader(raw_ostream &OS, bool IsLittleEndian,
+                          uint32_t NameCount, uint32_t AbbrevSize,
+                          uint32_t CombinedSizeOtherParts) {
+  StringRef AugmentationString = "LLVM0700";
+  auto TotalSize = CombinedSizeOtherParts + 5 * sizeof(uint32_t) +
+                   2 * sizeof(uint16_t) + sizeof(NameCount) +
+                   sizeof(AbbrevSize) + AugmentationString.size();
+  writeInteger(uint32_t(TotalSize), OS, IsLittleEndian); // Unit length
+
+  // Everything below is included in total size.
+  writeInteger(uint16_t(5), OS, IsLittleEndian); // Version
+  writeInteger(uint16_t(0), OS, IsLittleEndian); // Padding
+  writeInteger(uint32_t(1), OS, IsLittleEndian); // Compilation Unit count
+  writeInteger(uint32_t(0), OS, IsLittleEndian); // Local Type Unit count
+  writeInteger(uint32_t(0), OS, IsLittleEndian); // Foreign Type Unit count
+  writeInteger(uint32_t(0), OS, IsLittleEndian); // Bucket count
+  writeInteger(NameCount, OS, IsLittleEndian);
+  writeInteger(AbbrevSize, OS, IsLittleEndian);
+  writeInteger(uint32_t(AugmentationString.size()), OS, IsLittleEndian);
+  OS.write(AugmentationString.data(), AugmentationString.size());
+  return;
+}
+
+/// Emits the abbreviations for a DebugNames section.
+std::string
+emitDebugNamesAbbrev(ArrayRef<DWARFYAML::DebugNameAbbreviation> Abbrevs) {
+  std::string Data;
+  llvm::raw_string_ostream OS(Data);
+  for (const auto &Abbrev : Abbrevs) {
+    encodeULEB128(Abbrev.Code, OS);
+    encodeULEB128(Abbrev.Tag, OS);
+    for (auto [Idx, Form] : Abbrev.Indices) {
+      encodeULEB128(Idx, OS);
+      encodeULEB128(Form, OS);
+    }
+    encodeULEB128(0, OS);
+    encodeULEB128(0, OS);
+  }
+  encodeULEB128(0, OS);
+  return Data;
+}
+
+/// Emits a simple CU offsets list for a DebugNames section containing a single
+/// CU at offset 0.
+std::string emitDebugNamesCUOffsets(bool IsLittleEndian) {
+  std::string Data;
+  llvm::raw_string_ostream OS(Data);
+  writeInteger(uint32_t(0), OS, IsLittleEndian);
+  return Data;
+}
+
+/// Emits the "NameTable" for a DebugNames section; according to the spec, it
+/// consists of two arrays: an array of string offsets, followed immediately by
+/// an array of entry offsets. The string offsets are emitted in the order
+/// provided in `Entries`.
+std::string emitDebugNamesNameTable(
+    bool IsLittleEndian,
+    const std::map<uint32_t, std::vector<DWARFYAML::DebugNameEntry>> &Entries,
+    ArrayRef<uint32_t> EntryPoolOffsets) {
+  assert(Entries.size() == EntryPoolOffsets.size());
+
+  std::string Data;
+  llvm::raw_string_ostream OS(Data);
+
+  for (auto Strp : make_first_range(Entries))
+    writeInteger(Strp, OS, IsLittleEndian);
+  for (auto PoolOffset : EntryPoolOffsets)
+    writeInteger(uint32_t(PoolOffset), OS, IsLittleEndian);
+  return Data;
+}
+
+/// Groups entries based on their name (strp) code and returns a sorted map.
+std::map<uint32_t, std::vector<DWARFYAML::DebugNameEntry>>
+groupEntries(ArrayRef<DWARFYAML::DebugNameEntry> Entries) {
+  std::map<uint32_t, std::vector<DWARFYAML::DebugNameEntry>> Ans;
+  for (const auto &Entry : Entries)
+    Ans[Entry.NameStrp].push_back(Entry);
+  return Ans;
+}
+
+/// Finds the abbreviation whose code is AbbrevCode and returns a list
+/// containing the expected size of all non-zero-length forms.
+Expected<SmallVector<uint8_t>>
+getNonZeroDataSizesFor(uint32_t AbbrevCode,
+                       ArrayRef<DWARFYAML::DebugNameAbbreviation> Abbrevs) {
+  const auto *AbbrevIt = find_if(Abbrevs, [&](const auto &Abbrev) {
+    return Abbrev.Code.value == AbbrevCode;
+  });
+  if (AbbrevIt == Abbrevs.end())
+    return createStringError(inconvertibleErrorCode(),
+                             "Did not find an Abbreviation for this code");
----------------
felipepiovezan wrote:

> (possibly even easier)

By definition it can't be easier because the alternative is already one ;) 

> (as a slight aside, I could be persuaded that in this case no alternative value makes sense).

On a more serious note though, this is pretty much the key observation here.
Once you start adding certain types of errors, it has ripple effects on the emitter. In this instance in particular, we simply cannot return anything meaningful because we need the guidance of the Forms in order to write the data. If we return "0" or an empty vector here, we have to skip emitting a whole chunk of the debug information.
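
To make the dependency concrete, here is a minimal, self-contained sketch of why the emitter has nothing sensible to write when the abbreviation lookup fails. It is not the code from this PR: `SketchAbbrev`, `SketchEntry`, `lookupFormSizes`, `writeLE` and `emitEntry` are hypothetical stand-ins that use only the standard library, and the sketch assumes fixed-width forms and leaves out the abbreviation code that precedes each entry.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not the DWARFYAML types.
struct SketchAbbrev {
  uint32_t Code;
  std::vector<uint8_t> FormSizes; // byte width of each non-zero-length form
};

struct SketchEntry {
  uint32_t Code;                // abbreviation code this entry refers to
  std::vector<uint64_t> Values; // one value per non-zero-length form
};

// Returns the form sizes for Code, or nullopt if no abbreviation matches.
std::optional<std::vector<uint8_t>>
lookupFormSizes(uint32_t Code, const std::vector<SketchAbbrev> &Abbrevs) {
  for (const SketchAbbrev &A : Abbrevs)
    if (A.Code == Code)
      return A.FormSizes;
  return std::nullopt;
}

// Appends the low Size bytes of Value to Out in little-endian order.
void writeLE(uint64_t Value, uint8_t Size, std::string &Out) {
  for (uint8_t I = 0; I < Size; ++I)
    Out.push_back(static_cast<char>((Value >> (8 * I)) & 0xff));
}

// Serializes the values of one entry. Without the form sizes there is no way
// to know how many bytes each value occupies, so nothing is written and the
// caller is told the entry could not be emitted.
bool emitEntry(const SketchEntry &E, const std::vector<SketchAbbrev> &Abbrevs,
               std::string &Out) {
  std::optional<std::vector<uint8_t>> Sizes = lookupFormSizes(E.Code, Abbrevs);
  if (!Sizes || Sizes->size() != E.Values.size())
    return false;
  // Validate every width before writing so a failure leaves Out untouched.
  for (uint8_t Size : *Sizes)
    if (Size > 8)
      return false;
  for (size_t I = 0; I < E.Values.size(); ++I)
    writeLE(E.Values[I], (*Sizes)[I], Out);
  return true;
}
```

With an abbreviation table containing only code 1, an entry using code 2 yields no bytes at all. Substituting an empty size list instead of reporting the failure would silently drop that entry's data, which is the ripple effect described above.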

> one of the purposes of yaml2obj is to be able to emit invalid objects (including DWARF) 

I'm not too sure about the parenthetical there: a lot of DWARF is very tightly coupled together, and a lot of the "invalid data tests" are written with binary DWARF.

As much as I appreciate the desire to test invalid data, I would ask that we land this in its current state because it is the only thing blocking other important work ([accelerator table enhancements](https://github.com/llvm/llvm-project/pull/79932)) and it also improves the status quo from no testing to a good amount of unit tests.

https://github.com/llvm/llvm-project/pull/79666

