[PATCH] D130724: [MC] Support writing ELFCOMPRESS_ZSTD compressed debug info sections

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 6 14:08:52 PDT 2022


MaskRay marked 3 inline comments as done.
MaskRay added inline comments.


================
Comment at: llvm/lib/MC/ELFObjectWriter.cpp:861-879
+  ArrayRef<uint8_t> Uncompressed =
       makeArrayRef(reinterpret_cast<uint8_t *>(UncompressedData.data()),
-                   UncompressedData.size()),
-      Compressed);
+                   UncompressedData.size());
+
+  SmallVector<uint8_t, 128> Compressed;
+  uint32_t ChType;
+  switch (MAI->compressDebugSections()) {
----------------
MaskRay wrote:
> dblaikie wrote:
> > MaskRay wrote:
> > > dblaikie wrote:
> > > > MaskRay wrote:
> > > > > dblaikie wrote:
> > > > > > Proliferation of these sort of switch statements is why I'd like to resolve the design discussions before further patches are committed. They're not hugely expensive to fix/change later, but neither are we in a rush to add this functionality, nor is the design discussion especially hard to resolve.
> > > > > > Code something like:
> > > > > > ```
> > > > > > if (CompressionAlgorithm *C = getCompressionAlgorithm(MAI->compressDebugSections()) {
> > > > > >   ChType = C->getELFType(); // alternatively have an array to lookup ELF type from the enum type
> > > > > >   C->compress(Uncompressed, Compressed);
> > > > > > }
> > > > > > ```
> > > > > > Seems like it'd be quite achievable without great complexity & simplify consumers.
> > > > > > 
> > > > > > Possibly even having 'compressDebugSections' carry the algorithm pointer directly? (since the checking's already done up-front, and the pointer would be to an immutable/thread safe/singleton/eternal object anyway)
> > > > > I wish that 15.0.0 would have the llvm-objcopy change. This helps distributions which install llvm binary utilities in /usr/bin which may use newer userland llvm utilities.
> > > > > MC/clang -gz support certainly can go into 16.0.0.
> > > > > 
> > > > > After this MC change, the only remaining place using `DebugCompressionType` is clang -gz. We don't have many ELF object producers, and the object-oriented `CompressionAlgorithm` design kinda makes me feel some over-engineering.
> > > > > 
> > > > > Note: in many (among the few `DebugCompressionType` uses) use cases `isAvailable` and `compress` are called apart.
> > > > I don't think there's any need to rush to get this into any particular LLVM release (compression isn't enabled by default in any case & I don't think this is a pressing issue, is it?).
> > > > 
> > > > I don't think the complexity needed to implement the sort of framing above should be too great (the current patches add a bunch of other things that I think are overly complicating it & making the comparison weight differently - which is why I'm trying to start with the syntax and work backwards to the implementation necessary to provide that interface and not more than that).
> > > > 
> > > > It'd be nice if they didn't have to be so separate (between isAvailable and compress) - so the invariants were expressed at the API level rather than by assertions that something was checked somewhere else.
> > > Say, someone maintains XXXBSD and has /usr/bin llvm binary utilities (base image) from an older release. If I get a binary built from a newer machine with newer clang, I'd hope that I have a way to decompress the debug sections with /usr/bin/llvm-objcopy . Having zstd decompression support at 15.0.0 will help such a use case. Whether LLVM_ENABLE_ZSTD is enabled or not by default in the upstream doesn't really matter. A distribution typically enables many LLVM/CLANG/etc CMake variables to build a toolchain customized to their needs.
> > > 
> > > As mentioned on another thread, the proposed object-oriented design seems to have too much boilerplate. Free functions seem to work quite well. (This is just the classical expression problem and I think different people may have different opinions.)
> > Oh, sorry, by not enabled by default, I was thinking about clang `-gz` not being the default. So I guess there's 3 (maybe 4) levels of "this is not the default" - `LLVM_ENABLE_ZSTD` isn't enabled by default, `-gz` isn't clang's default, and `-gz=zstd` isn't the default even when compression is requested (& the 4th - `-g` isn't the default either, but probably has enough users that we don't count that case).
> > 
> > If someone gets a new LLVM-based tool that isn't a full LLVM release, it won't compress by default and it won't use zstd by default - I think it's OK if their system-installed tools don't handle that novel case (similarly we implemented `-gdwarf-5` long before system debuggers could handle it, we even changed to `-gdwarf-5 by default before system gdb's could handle it, I think).
> > 
> > I don't think there's any need to rush this.
> -gz: There are Linux distributions defaulting to -gz (probably by patching gcc driver).
> 
> -g: Many Linux distributions build packages with -g and ship the non-debug part and debug symbols in separate packages. -g is likely never a default but this argument does not affect distribution usage.
> 
Resolving all comments since this does not catch 15.0.0 and the latest API design discussion happened in other places (D130506 and alternative patches).

I think the switch is useful as it makes clear the code intentionally supports the related compression formats.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130724/new/

https://reviews.llvm.org/D130724



More information about the llvm-commits mailing list