[PATCH] D49678: [llvm-objcopy] Adding support for compressed DWARF debug sections.

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 8 09:10:45 PDT 2018


jhenderson added inline comments.


================
Comment at: include/llvm/Object/Compressor.h:43
+
+/// Return the gnu style compressed section name.
+Expected<std::string> getDebugSectionName(const StringRef Name,
----------------
This comment implies that the style will always be GNU, but the function takes an "IsGnuStyle" which doesn't make sense. Also, this function should be getDecompressedDebugSectionName, since it's for compressed sections only.


================
Comment at: lib/Object/Compressor.cpp:28
+  if (!Name.startswith(".debug") && !Name.startswith(".zdebug"))
+    return make_error<StringError>(
+        "Invalid Debug Section Name.",
----------------
Please include the passed-in section name in the error message. There's now a `createStringError` function that allows you to format error messages, that might help. You also don't need to use `std::make_error_code` here (and it seems like there's a preference to use llvm::errc rather than std::errc, although I don't really know why).


================
Comment at: lib/Object/Compressor.cpp:38
+bool isCompressable(StringRef Name) {
+  Expected<std::string> NewName = getDebugSectionName(Name, false);
+  if (errorToBool(NewName.takeError()))
----------------
Hmm... looking at this a bit more, I feel like this would be better done by pulling the check at the start of `getDebugSectionName` into a separate `isCompressableSectionName` function, which could be used by both `getDebugSectionName` and instead of this function.


================
Comment at: lib/Object/Compressor.cpp:44
+
+bool isCompressed(StringRef Name, uint64_t Flags) {
+  return Name.startswith(".zdebug") || (Flags & ELF::SHF_COMPRESSED);
----------------
This should probably be a member of `SectionBase`, since it is a property of the section.


================
Comment at: test/tools/llvm-objcopy/zlib-dwarf-gnu-roundtrip.test:1
+# REQUIRES: shell
+# REQUIRES: zlib
----------------
This test doesn't require shell.


================
Comment at: test/tools/llvm-objcopy/zlib-dwarf-gnu-roundtrip.test:1
+# REQUIRES: shell
+# REQUIRES: zlib
----------------
jhenderson wrote:
> This test doesn't require shell.
Tests are usually named after the switch they are testing, so in this case, compress-debug-sections-zlib-gnu.test would probably be the right name. The roundtrip part is unnecessary.

The same goes for the other test.


================
Comment at: test/tools/llvm-objcopy/zlib-dwarf-gnu-roundtrip.test:10
+# RUN: llvm-objcopy --compress-debug-sections=zlib-gnu %t-clean.o %t-zlib-gnu.o
+# RUN: llvm-objcopy --decompress-debug-sections %t-zlib-gnu.o %t-decompressed.o
+
----------------
Use diff to diff %t-decompressed.o with %t-clean.o, since they should be the same.


================
Comment at: test/tools/llvm-objcopy/zlib-dwarf-gnu-roundtrip.test:16-17
+
+# CHECK-ORIGINAL: .debug_str
+# CHECK-ORIGINAL: clang
+#
----------------
There's no need for distinct CHECK-ORIGINAL and CHECK-DECOMPRESSED. Just use the same check-prefix in both.


================
Comment at: test/tools/llvm-objcopy/zlib-dwarf.test:8-12
+# This produces a small shell script that takes an object file, reads out and
+# sorts its debug section names, and then in sorted order it runs llvm-objdump
+# (via xargs) on each debug section while redirecting that output to a text
+# file. We compare the txt files of the original vs zlib vs zlib-gnu
+# round-tripping.
----------------
Why is this sorting necessary? The section order will be the same, won't it? Why do you need more than one debug section for this test? Why is this test so much more complex than zlib-dwarf-gnu-roundtrip.test?


================
Comment at: test/tools/llvm-objcopy/zlib-dwarf.test:26
+# RUN: sh generateObjTxtOutput.sh %t-zlib.o
+# RUN: cmp %t-clean2.o.txt %t-zlib.o.txt
+# RUN: rm -f %t-zlib.o.txt %t-zlib.o
----------------
You can use diff rather than cmp, since diff is built into lit.


================
Comment at: tools/llvm-objcopy/Object.h:269-272
+  virtual const StringRef getContents() const {
+    assert(false && "Section has no Contents.");
+    return "";
+  }
----------------
plotfi wrote:
> jhenderson wrote:
> > plotfi wrote:
> > > jhenderson wrote:
> > > > Don't mark return types as const (except for pointers to const and reference returns): the caller can just ignore the const and copy the value, so it has zero use (indeed some compilers actually warn about this). It's also completely redundant for a StringRef, since StringRefs are implicitly const.
> > > > 
> > > > Should this function just be a pure abstract function if it can't be called?
> > > > 
> > > > I also don't know whether "getContents" means get original contents or get contents at the end. Please rename more appropriately.
> > > What do you mean? I guess the tricky thing here is, it depends on the type of object. I don't see this being useful for anything other than a Section or a OwnedDataSection or a Compressed/Decompressed section. 
> > Right, so should this be a part of the base section interface? What you are saying suggests that it shouldn't be.
> > 
> > Is the purpose of this function to get the original, unmodified section contents, or the contents after they get modified (when the latter is possible)?
> Well so, since it seems we are treating these section objects as immutable then it is the unmodified section contents. At the end of the day I need some kind of interface to get the original section content to compress or decompress. 
Right, so call the function "getOriginalContents" to show that it's the original section contents.


================
Comment at: tools/llvm-objcopy/Object.h:332
+  const StringRef getContents() const override {
+    return StringRef((const char *)Contents.data(), Contents.size());
+  }
----------------
plotfi wrote:
> jhenderson wrote:
> > Why not just return Contents directly, and make this an ArrayRef<uint8_t> return type?
> Decided to return a Expected<ArrayRef<uint8_t>>, if the object does not have any relevant contents then it just returns an error. I think this works out for cases like StringTableSection and SymbolTableSection where im not sure what I would implement for their getSectionContents method. 
There was talk elsewhere of adding an OriginalData ArrayRef member to the SectionBase class. This would resolve these issues nicely, so maybe you should add it.


================
Comment at: tools/llvm-objcopy/Object.h:359
+  bool isGnuStyle() const {
+    return StringRef((const char *)Data.data(), 4).startswith("ZLIB");
+  }
----------------
Rather than force this into a StringRef, just so you can compare the string, why not just do something like:

```
ArrayRef<uint8_t> GnuPrefix = {'Z', 'L', 'I', 'B'};
return std::equals (GnuPrefix.begin(), GnuPrefix.end(), Data.data());
```


================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:349
+
+  auto D = object::Decompressor::create(Name, Contents, IsLittle, Is64Bit);
+  if (!D) {
----------------
Don't use auto here. Please refer to the LLVM coding guidelines on when to use auto.


================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:351-352
+  if (!D) {
+    auto Err = D.takeError();
+    return std::move(Err);
+  }
----------------
Do this in one line.


================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:356
+  auto DecompressedContents = make_unique<SmallVector<char, 128>>();
+  if (auto Err = D->resizeAndDecompress(*DecompressedContents.get())) {
+    return std::move(Err);
----------------
Don't use auto here.


================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:410
+      reportError(Config.InputFilename, NewNameOrError.takeError());
+    StringRef NewName = IsGnuStyle ? StringRef(*NewNameOrError) : Section.Name;
+
----------------
Doesn't getDebugSectionName already return the right name for the GnuStyle? You don't need this ternary here, and can just assign `*NewNameOrError` to `NewName` directly.


================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:413
+    auto BufPtr = reinterpret_cast<const uint8_t *>(CompressedContents->data());
+    auto BufSize = CompressedContents->size();
+    Obj.addSection<CompressedSection>(
----------------
Don't use auto here.


================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:433
+      reportError(Config.InputFilename, ContentsOrError.takeError());
+    StringRef Contents((const char *)ContentsOrError->data(),
+                       ContentsOrError->size());
----------------
I still think you should do this conversion as late as possible (i.e. at the call site to the library). You should also avoid using C-style casts.


Repository:
  rL LLVM

https://reviews.llvm.org/D49678





More information about the llvm-commits mailing list