[PATCH] D20211: [ELF] - Support of compressed debug sections creation(draft).

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Thu May 12 11:27:56 PDT 2016


On Thu, May 12, 2016 at 10:07 AM, Rui Ueyama via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> ruiu added a comment.
>
> Please add tests to verify that you are producing compressed sections that
> can be read by other tools.
>
>
> ================
> Comment at: ELF/Options.td:33
> @@ +32,3 @@
> +def compress_debug_sections : Joined<["--"], "compress-debug-sections=">,
> +  HelpText<"Force undefined symbol during linking">;
> +
> ----------------
> This help text is out of context.
>
> ================
> Comment at: ELF/OutputSections.cpp:833-837
> @@ +832,7 @@
> +  // compression.
> +  if (Config->CompressDebugSections == "zlib-gabi" ||
> +    Config->CompressDebugSections == "zlib") {
> +    Header.sh_flags |= SHF_COMPRESSED;
> +  }
> +  else {
> +    // Compress all candidate sections using ZLIB compression, using the GNU
> ----------------
> Clang-format.
>
> ================
> Comment at: ELF/OutputSections.cpp:842
> @@ +841,3 @@
> +    // renamed to start with .zdebug to identify the use of compression.
> +    StringSaver Saver(ScriptConfig->Alloc);
> +    this->Name = Saver.save(Twine(".z") + Name.drop_front(1));
> ----------------
> Please don't use an unrelated object's members.
>
> ================
> Comment at: ELF/OutputSections.cpp:859-862
> @@ +858,6 @@
> +template <class ELFT> void OutputSection<ELFT>::finalizeCompressed() {
> +  std::vector<uint8_t> Uncompressed(getSize());
> +  uint8_t *Data = Uncompressed.data();
> +  for (InputSection<ELFT> *C : Sections)
> +    C->writeTo(Data);
> +
> ----------------
> This code creates an uncompressed debug section in memory and then
> compresses it all at once. It consumes as much memory as the total size of
> the debug section. That is probably unacceptable.
>
> zlib supports "streaming" compression, right? I think you can pass pieces
> of data to the zlib object and let it construct compressed data
> incrementally. Please do it that way so that the maximum memory use is
> bounded by the largest input section size instead of the largest output
> section size.
>

Would the input be decompressed in a streaming manner as well? So even a
large input section wouldn't be entirely in-memory at once?

(presumably you need to compute the size of the section before writing it?
So you can write other sections in parallel, etc - in which case you might
do a null compression (compressing into /dev/null, of sorts) to compute the
size of the compressed data, then repeating the work to actually write it
out when needed - not sure whether that'll be faster overall... )

FWIW, I think Google (using gold) mostly just compresses the /input/, but
doesn't compress the output on the link. So that's a scenario we care about
- I don't know if that particular combination affects the design in any
particular way, though.
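
The streaming approach suggested above maps onto zlib's incremental
(deflate) API. A minimal sketch in Python, standing in for the C calls lld
would make, showing how memory stays bounded by the largest chunk rather
than the whole output section (the function names here are illustrative,
not lld's):

```python
import zlib

def compress_sections_streaming(sections, write):
    """Compress an iterable of byte chunks (input sections) incrementally,
    handing compressed output to write() as it becomes available, so only
    one input chunk is in memory at a time."""
    compressor = zlib.compressobj()
    total_out = 0
    for chunk in sections:
        out = compressor.compress(chunk)  # may buffer data internally
        total_out += len(out)
        write(out)
    out = compressor.flush()  # emit whatever is still buffered
    total_out += len(out)
    write(out)
    return total_out

# Memory use is bounded by the largest piece, not the sum of all pieces.
pieces = [b"debug-info " * 1000, b"line-table " * 500]
compressed = bytearray()
size = compress_sections_streaming(pieces, compressed.extend)
assert zlib.decompress(bytes(compressed)) == b"".join(pieces)
```

The "null compression" idea mentioned above would run the same loop with a
`write` that only counts bytes, to learn the compressed size before laying
out the output file.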


>
> ================
> Comment at: ELF/OutputSections.cpp:869-874
> @@ +868,8 @@
> +
> +  // If not successfully compressed or total size is greater than
> +  // size before applying compression, then cancel compressed output.
> +  size_t TotalSize = Compressed.size() + getCompressionHeaderSize<ELFT>();
> +  Compress = Success == zlib::Status::StatusOK && TotalSize < getSize();
> +  if (!Compress)
> +    return;
> +
> ----------------
> I think we don't have to care about this edge case. Unless you are
> compressing random bits, it usually shrinks.
>

Because the compression is per section & some parts of DWARF are pretty
compact, not everything is improved by compression - but it takes a pretty
small sample of debug info for compression not to pay off, it's true. I did
implement behavior like this (skip compression when not beneficial) in
LLVM's integrated assembler support for compressed debug info.
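
The skip-when-not-beneficial check being discussed amounts to comparing the
compressed size plus header overhead against the original size. A hedged
sketch (the 24-byte header constant matches `sizeof(Elf64_Chdr)` for the
generic-ABI format, but is illustrative here, not taken from lld's code):

```python
import os
import zlib

CHDR_SIZE = 24  # sizeof(Elf64_Chdr); illustrative header overhead

def maybe_compress(data: bytes):
    """Return (compressed?, payload). Keep the original bytes when
    compression plus header overhead would not shrink the section."""
    compressed = zlib.compress(data)
    if len(compressed) + CHDR_SIZE < len(data):
        return True, compressed
    return False, data

# Repetitive data compresses well; short random data does not.
assert maybe_compress(b"abbrev " * 4096)[0] is True
assert maybe_compress(os.urandom(64))[0] is False
```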


>
> ================
> Comment at: ELF/OutputSections.cpp:1014-1025
> @@ +1013,14 @@
> +  const endianness E = ELFT::TargetEndianness;
> +  if (Config->CompressDebugSections == "zlib-gnu") {
> +    // 4 bytes magic + size of uncompressed data in big endian.
> +    memcpy(Buf, "ZLIB", 4);
> +    write64be(Buf + 4, UncompressedSize);
> +  } else {
> +    // "zlib-gabi" or "zlib".
> +    write32<E>(Buf, ELFCOMPRESS_ZLIB);
> +    if (ELFT::Is64Bits)
> +      write64<E>(Buf + 2 * sizeof(Elf64_Word), UncompressedSize);
> +    else
> +      write32<E>(Buf + sizeof(Elf32_Word), UncompressedSize);
> +  }
> +
> ----------------
> Why do we have two types of headers? If one is obsolete, we probably
> don't want to implement it.
>
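
For reference, the two header layouts in question: the legacy zlib-gnu
format (section renamed to `.zdebug*`, a "ZLIB" magic followed by the
uncompressed size in big endian) versus the generic-ABI format
(`SHF_COMPRESSED` flag plus an `Elf_Chdr` at the start of the section
data). A sketch of both for a little-endian 64-bit target; the field
layout follows the ELF gABI and `ELFCOMPRESS_ZLIB = 1` comes from `elf.h`:

```python
import struct

ELFCOMPRESS_ZLIB = 1  # ch_type value defined by the ELF gABI

def zlib_gnu_header(uncompressed_size: int) -> bytes:
    # Legacy .zdebug format: 4-byte magic + 64-bit big-endian size.
    return b"ZLIB" + struct.pack(">Q", uncompressed_size)

def elf64_chdr(uncompressed_size: int, addralign: int = 1) -> bytes:
    # Elf64_Chdr: ch_type, ch_reserved, ch_size, ch_addralign,
    # all in the target's byte order (little-endian here).
    return struct.pack("<IIQQ", ELFCOMPRESS_ZLIB, 0,
                       uncompressed_size, addralign)

assert zlib_gnu_header(0x1234) == b"ZLIB\x00\x00\x00\x00\x00\x00\x12\x34"
assert len(elf64_chdr(100)) == 24
```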
> ================
> Comment at: ELF/Writer.cpp:1161-1163
> @@ -1158,2 +1160,5 @@
>
>  template <class ELFT>
> +static bool isCompressionCandidate(InputSectionBase<ELFT> *C,
> +                                   StringRef OutSecName) {
> +  if (Config->CompressDebugSections == "none")
> ----------------
> You can move this to OutputSection class, no?
>
> ================
> Comment at: ELF/Writer.cpp:1172
> @@ +1171,3 @@
> +    return false;
> +  return zlib::isAvailable();
> +}
> ----------------
> This condition should be at the beginning of this function.
>
>
> http://reviews.llvm.org/D20211
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>