[PATCH] D20211: [ELF] - Support of compressed debug sections creation(draft).

Thu May 12 14:22:33 PDT 2016

By the way, do you happen to know why there are two header types for
compressed sections?

On Thu, May 12, 2016 at 2:10 PM, Rui Ueyama <ruiu at google.com> wrote:

> On Thu, May 12, 2016 at 2:02 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>> On Thu, May 12, 2016 at 1:02 PM, Rui Ueyama <ruiu at google.com> wrote:
>>
>>> On Thu, May 12, 2016 at 11:27 AM, David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, May 12, 2016 at 10:07 AM, Rui Ueyama via llvm-commits <
>>>> llvm-commits at lists.llvm.org> wrote:
>>>>
>>>>> ruiu added a comment.
>>>>>
>>>>> Please add tests to verify that you are producing compressed sections
>>>>> that can be read by other tools.
>>>>>
>>>>>
>>>>> ================
>>>>> Comment at: ELF/Options.td:33
>>>>> @@ +32,3 @@
>>>>> +def compress_debug_sections : Joined<["--"],
>>>>> "compress-debug-sections=">,
>>>>> +  HelpText<"Force undefined symbol during linking">;
>>>>> +
>>>>> ----------------
>>>>> This help text is out of context.
>>>>>
>>>>> ================
>>>>> Comment at: ELF/OutputSections.cpp:833-837
>>>>> @@ +832,7 @@
>>>>> +  // compression.
>>>>> +  if (Config->CompressDebugSections == "zlib-gabi" ||
>>>>> +    Config->CompressDebugSections == "zlib") {
>>>>> +    Header.sh_flags |= SHF_COMPRESSED;
>>>>> +  }
>>>>> +  else {
>>>>> +    // Compress all candidate sections using ZLIB compression, using
>>>>> the GNU
>>>>> ----------------
>>>>> Clang-format.
>>>>>
>>>>> ================
>>>>> Comment at: ELF/OutputSections.cpp:842
>>>>> @@ +841,3 @@
>>>>> +    // renamed to start with.zdebug to identify the use of
>>>>> compression.
>>>>> +    StringSaver Saver(ScriptConfig->Alloc);
>>>>> +    this->Name = Saver.save(Twine(".z") + Name.drop_front(1));
>>>>> ----------------
>>>>> Please don't use an unrelated object's members.
>>>>>
>>>>> ================
>>>>> Comment at: ELF/OutputSections.cpp:859-862
>>>>> @@ +858,6 @@
>>>>> +template <class ELFT> void OutputSection<ELFT>::finalizeCompressed() {
>>>>> +  std::vector<uint8_t> Uncompressed(getSize());
>>>>> +  uint8_t *Data = Uncompressed.data();
>>>>> +  for (InputSection<ELFT> *C : Sections)
>>>>> +    C->writeTo(Data);
>>>>> +
>>>>> ----------------
>>>>> This code creates an uncompressed a debug section in memory and then
>>>>> compress it at once. It consumes as much memory as the total size of the
>>>>> debug section. It is probably unacceptable.
>>>>>
>>>>> zlib supports "streaming" compression, right? I think you can pass
>>>>> pieces of data to zlib object and let it construct compressed data
>>>>> incrementally. Please do that way so that the maximum memory is bounded to
>>>>> the largest input section size instead of the largest output section size.
>>>>>
>>>>
>>>> Would the input be decompressed in a streaming manner as well? So even
>>>> a large input section wouldn't be entirely in-memory at once?
>>>>
>>>> (presumably you need to compute the size of the section before writing
>>>> it? So you can write other sections in parallel, etc - in which case you
>>>> might do a null compression (compressing into /dev/null, of sorts) to
>>>> compute the size of the compressed data, then repeating the work to
>>>> actually write it out when needed - not sure whether that'll be faster
>>>> overall... )
>>>>
>>>
>>> Yes, if we want to write to the file while compressing sections, we need
>>> to know each section size in compressed format beforehand.
>>>
>>
>> Ah, I understand your comments earlier better then - you just want to
>> stream in from the various inputs but still expect to end up with the whole
>> compressed contents in memory (so the size can be determined before writing
>> it out to the file).
>>
>
> Correct. I just want to avoid creating a whole uncompressed contents and
> compress it, if possible.
>
>
>>
>>
>>> Compressing sections twice is probably too costly.
>>>
>>
>> I wonder if it's possible to more cheaply compute the compressed size
>> than the compressed contents - probably not much I would guess (without
>> much basis for that guess).
>>
>>
>>> We at least need to have an input section in memory.
>>>
>>
>> What do you mean by 'in memory' - memory mapping is potentially different
>> from actually allocating memory for it, right? (one might just look like
>> memory, the other is going to be actual memory - at least that appears to
>> be how our internal memory accounting works - the input files are put on
>> tmpfs, memory mapped in but that doesn't "cost" any of the process's memory
>> because it's allocated before/not by the process)
>>
>
> By in memory, I mean the input section and output section contents need to
> be represented as a byte array whether they are mmap'ed or not.
>
>
>>
>>
>>> We are using mmap'ed IO, and in the first step we copy input sections to
>>> output sections and then apply relocations to the copied data. That process
>>> treats section contents as a byte array.
>>>
>>> FWIW, I think Google (using gold) mostly just compresses the /input/,
>>>> but doesn't compress the output on the link. So that's a scenario we care
>>>> about - I don't know if that particular combination affects the design in
>>>> any particular way, though.
>>>>
>>>
>>> Interesting. I think we don't want to support a feature if it has no
>>> user. Maybe we should support decompressing debug sections, at least as the
>>> first step?
>>>
>>
>> Seems like a good first step to me, but I've no idea what users the
>> author of this patch is trying to support, or what kind of users are out
>> there, etc.
>>
>>
>>>
>>>
>>>> ================
>>>>> Comment at: ELF/OutputSections.cpp:869-874
>>>>> @@ +868,8 @@
>>>>> +
>>>>> +  // If not successfully compressed or total size is greater than
>>>>> +  // size before applying compression, then cancel compressed output.
>>>>> +  size_t TotalSize = Compressed.size() +
>>>>> getCompressionHeaderSize<ELFT>();
>>>>> +  Compress = Success == zlib::Status::StatusOK && TotalSize <
>>>>> getSize();
>>>>> +  if (!Compress)
>>>>> +    return;
>>>>> +
>>>>> ----------------
>>>>> I think we don't have to care about this edge case. Unless you are
>>>>> compressing random bits, it usually shrink.
>>>>>
>>>>
>>>> Because the compression is per section & some parts of DWARF are pretty
>>>> compact, not everything is improved by compression - but it takes pretty
>>>> small samples of debug info to not compress, it's true. I did implement
>>>> behavior like this (skip compression when not beneficial) in LLVM's
>>>> integrated assembler support for compressed debug info.
>>>>
>>>>
>>>>>
>>>>> ================
>>>>> Comment at: ELF/OutputSections.cpp:1014-1025
>>>>> @@ +1013,14 @@
>>>>> +  const endianness E = ELFT::TargetEndianness;
>>>>> +  if (Config->CompressDebugSections == "zlib-gnu") {
>>>>> +    // 4 bytes magic + size of uncompressed data in big endian.
>>>>> +    memcpy(Buf, "ZLIB", 4);
>>>>> +    write64be(Buf + 4, UncompressedSize);
>>>>> +  } else {
>>>>> +    // "zlib-gabi" or "zlib".
>>>>> +    write32<E>(Buf, ELFCOMPRESS_ZLIB);
>>>>> +    if (ELFT::Is64Bits)
>>>>> +      write64<E>(Buf + 2 * sizeof(Elf64_Word), UncompressedSize);
>>>>> +    else
>>>>> +      write32<E>(Buf + sizeof(Elf32_Word), UncompressedSize);
>>>>> +  }
>>>>> +
>>>>> ----------------
>>>>> Why we have two types of headers? If one is obsolete, we probably
>>>>> don't want to implement it.
>>>>>
>>>>> ================
>>>>> Comment at: ELF/Writer.cpp:1161-1163
>>>>> @@ -1158,2 +1160,5 @@
>>>>>
>>>>>  template <class ELFT>
>>>>> +static bool isCompressionCandidate(InputSectionBase<ELFT> *C,
>>>>> +                                   StringRef OutSecName) {
>>>>> +  if (Config->CompressDebugSections == "none")
>>>>> ----------------
>>>>> You can move this to OutputSection class, no?
>>>>>
>>>>> ================
>>>>> Comment at: ELF/Writer.cpp:1172
>>>>> @@ +1171,3 @@
>>>>> +    return false;
>>>>> +  return zlib::isAvailable();
>>>>> +}
>>>>> ----------------
>>>>> This condition should be at the beginning of this function.
>>>>>
>>>>>
>>>>> http://reviews.llvm.org/D20211
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160512/c4e65037/attachment.html>