[PATCH] D49678: [llvm-objcopy] Adding support for compressed DWARF debug sections.

Wed Aug 22 12:45:01 PDT 2018

plotfi marked an inline comment as done.
plotfi added a comment.

In https://reviews.llvm.org/D49678#1209775, @jakehehrlich wrote:

> Ok so a few higher level comments
>
> 1. I think I agree with Alex that we shouldn't worry about compressing only if an improvement is shown. It's just far too difficult to do that at the moment. We can come back to that later. This seems to be a new case of "replace section" that I had been aware of the need for before but hand't properly implemented.
> 2. We should start simple and build up to parity and only build up to parity based on demand. In this case we shouldn't aim for full parity. Just start simple and if that isn't enough we can improve more later. If something isn't ideal that's ok for now.
> 3. Along those lines, I think we should handle compression and decompression in different diffs. We really need to rain in the complexity here.
> 4. The whole compressor decompressor thing in libObject feels misguided to me personally. Namely it isn't handling the subtle issue of size and endianess correctly (in fact, I think llvm-objcopy's whole handling of size is currently incorrect but this isn't helping). Luckily for you I have little to no say in the matter because this is outside of llvm-objcopy. Now that I realize this is in libObject (whooooops, my bad guys) I'd like that part to be in a separate change and reviewed by other people. You can cc me or even add me as a reviewer. If I were a reviewer for it I would like to see evidence for its need outside of llvm-objcopy since I don't think it's needed here.
> 5. This change is simply far more complex that it needs to be to accomplish the task. It's getting better and better though. We've been trying to make micro-pushes in the right direction, and we're getting there, but this really needs to be restructured somewhat instead. For instance compressSections and decompressSections are starting to look like something that should go in the final code, all the argument handling seems good to me as well. The means by which compressed sections are being handled is not in that shape. In order to express the massive change I think I'd like to see I've written the following code:
>
>   ``` class CompressedSection { private: uint64_t UncompressedSize; uint64_t UncompressedAlign; uint32_t Type; SmallVector<char, 0> CompressedData; bool GNU; public: CompressedSection(Twine NewName, const SectionBase& ToCompress, bool GNU) : GNU(GNU) { Name = NewName; zlib::compress(toStringRef(ToCompress.OriginalData),  CompressedData); Size = CompressedData.size() + <some constant to account for size of header> + (GNU ? 4 : 0); // Needed mostly because I didn't handle Size correctly in llvm-objcopy. Sorry. I plan on getting back to llvm-objcopy to fix many things this included in a few weeks. Flags = ToCompress.Flags | SHF_COMPRESS; Align = 8; // Also I think alignment also changes depending on size here and an alignment of 4 would be better for the 32-bit case. // Assign all the other things from ToCompress except size and align UncompressedSize = ToCompress.Size; UncompressedAlign = ToCompress.Align; } }
>
>   ...
>
>   template <class ELFT> void ELFSectionWriter<ELFT>::visit(const CompressedSection &Sec) { uint8_t *Buf = Out.getBufferStart(); Buf += Sec.Offset; auto Chdr = reinterpret_cast<typename ELFT::Chdr*>(Buf); Chdr->ch_type = ELFCOMPRESS_ZLIB; Chdr->ch_size = Sec.UncompressedSize; Chdr->ch_align = Sec.UncompressedAlign; Buf += sizeof(*Chdr) if (Sec.GNU) { StringRef Magic = "ZLIB"; Buf = std::copy(Magic.begin(), Magic.end(), Buf); } std::copy(Sec.CompressedData.begin(), Sec.CompressedData.end(), Buf); }
>
>   ```
>
>   This means you'll not need to mess with ELFT at all beyond that one function, it follows what we've been doing with these sorts of issues, and it eliminates the need for Compressor. Decompression is a different issue mind you. There is currently an issue with how reading works. I want to improve that. In a nutshell though it's going to need to be done in the ELFBuilder.
>
>   The major pain point here is just that size has to be overestimated due to poor design on my part. Align also seems to suffer from this issue which isn't something I was previously aware of. This should avoid the need for 90% of the code written.

So your preference here is to taking code from the Compressor and sticking it in the visitor? I'm fine with altering the structure to always compress in the visitor rather than making a determination based on size improvements in HandleArgs but I would prefer if the Compressor library is still separate (which is templated by ELFT) so that I can use it to refactor redundant code in other parts of llvm.

================
Comment at: lib/Object/Compressor.cpp:17
+namespace llvm {
+namespace object {
+
----------------
plotfi wrote:
> jakehehrlich wrote:
> > jhenderson wrote:
> > > jakehehrlich wrote:
> > > > This probably shouldn't be defined in object since that's a library namespace, maybe objcopy instead?
> > > I nearly made the same mistake myself, but this is actually IN the Object library, so is in the correct namespace!
> > Ohhhh. Yeah I think I want to put the kibosh on doing this in this change. For the purposes of llvm-objcopy and this change I don't think we want or need this. If this is truly needed at large then someone else should review it.
> Yeah, this is in the object library which is the same place Decompressor is. 
Alright, I will put the compressor library in a separate review diff. I do not want to write compression in llvm-objcopy because I see a lot of potential for refactoring redundant code all over llvm for this. 

So what exactly is your preference specifically?

Repository:
  rL LLVM

https://reviews.llvm.org/D49678