<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 12, 2016 at 2:57 PM, Rui Ueyama <span dir="ltr"><<a href="mailto:ruiu@google.com" target="_blank">ruiu@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div class="h5">On Thu, May 12, 2016 at 2:40 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div>On Thu, May 12, 2016 at 2:29 PM, Rui Ueyama <span dir="ltr"><<a href="mailto:ruiu@google.com" target="_blank">ruiu@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div>On Thu, May 12, 2016 at 2:22 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div>On Thu, May 12, 2016 at 2:10 PM, Rui Ueyama <span dir="ltr"><<a href="mailto:ruiu@google.com" target="_blank">ruiu@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div>On Thu, May 12, 2016 at 2:02 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div>On Thu, May 12, 2016 at 1:02 PM, Rui Ueyama <span dir="ltr"><<a href="mailto:ruiu@google.com" target="_blank">ruiu@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div>On Thu, May 12, 2016 at 11:27 AM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div>On Thu, May 12, 2016 at 10:07 AM, Rui Ueyama via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">ruiu added a comment.<br>
<br>
Please add tests to verify that you are producing compressed sections that can be read by other tools.<br>
<br>
<br>
================<br>
Comment at: ELF/Options.td:33<br>
@@ +32,3 @@<br>
+def compress_debug_sections : Joined<["--"], "compress-debug-sections=">,<br>
+ HelpText<"Force undefined symbol during linking">;<br>
+<br>
----------------<br>
This help text is out of context.<br>
<br>
================<br>
Comment at: ELF/OutputSections.cpp:833-837<br>
@@ +832,7 @@<br>
+ // compression.<br>
+ if (Config->CompressDebugSections == "zlib-gabi" ||<br>
+ Config->CompressDebugSections == "zlib") {<br>
+ Header.sh_flags |= SHF_COMPRESSED;<br>
+ }<br>
+ else {<br>
+ // Compress all candidate sections using ZLIB compression, using the GNU<br>
----------------<br>
Clang-format.<br>
<br>
================<br>
Comment at: ELF/OutputSections.cpp:842<br>
@@ +841,3 @@<br>
+ // renamed to start with.zdebug to identify the use of compression.<br>
+ StringSaver Saver(ScriptConfig->Alloc);<br>
+ this->Name = Saver.save(Twine(".z") + Name.drop_front(1));<br>
----------------<br>
Please don't use an unrelated object's members.<br>
<br>
================<br>
Comment at: ELF/OutputSections.cpp:859-862<br>
@@ +858,6 @@<br>
+template <class ELFT> void OutputSection<ELFT>::finalizeCompressed() {<br>
+ std::vector<uint8_t> Uncompressed(getSize());<br>
+ uint8_t *Data = Uncompressed.data();<br>
+ for (InputSection<ELFT> *C : Sections)<br>
+ C->writeTo(Data);<br>
+<br>
----------------<br>
This code creates an uncompressed a debug section in memory and then compress it at once. It consumes as much memory as the total size of the debug section. It is probably unacceptable.<br>
<br>
zlib supports "streaming" compression, right? I think you can pass pieces of data to zlib object and let it construct compressed data incrementally. Please do that way so that the maximum memory is bounded to the largest input section size instead of the largest output section size.<br></blockquote><div><br></div></div></div><div>Would the input be decompressed in a streaming manner as well? So even a large input section wouldn't be entirely in-memory at once?<br><br>(presumably you need to compute the size of the section before writing it? So you can write other sections in parallel, etc - in which case you might do a null compression (compressing into /dev/null, of sorts) to compute the size of the compressed data, then repeating the work to actually write it out when needed - not sure whether that'll be faster overall... )<br></div></div></div></div></blockquote><div><br></div></div></div><div>Yes, if we want to write to the file while compressing sections, we need to know each section size in compressed format beforehand. </div></div></div></div></blockquote><div><br></div></div></div><div><div>Ah, I understand your comments earlier better then - you just want to stream in from the various inputs but still expect to end up with the whole compressed contents in memory (so the size can be determined before writing it out to the file).</div></div></div></div></div></blockquote><div><br></div></div></div><div>Correct. I just want to avoid creating a whole uncompressed contents and compress it, if possible.</div></div></div></div></blockquote><div><br></div></div></div><div>*nod* agreed, if possible</div><span><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div> </div></div><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Compressing sections twice is probably too costly.</div></div></div></div></blockquote><div><br></div></span><div>I wonder if it's possible to more cheaply compute the compressed size than the compressed contents - probably not much I would guess (without much basis for that guess).<br></div><span><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>We at least need to have an input section in memory.</div></div></div></div></blockquote><div><br></div></span><div>What do you mean by 'in memory' - memory mapping is potentially different from actually allocating memory for it, right? (one might just look like memory, the other is going to be actual memory - at least that appears to be how our internal memory accounting works - the input files are put on tmpfs, memory mapped in but that doesn't "cost" any of the process's memory because it's allocated before/not by the process)</div><span><div></div></span></div></div></div></blockquote><div><br></div></span><div>By in memory, I mean the input section and output section contents need to be represented as a byte array whether they are mmap'ed or not.</div></div></div></div></blockquote><div><br></div></span><div>The output section does? That seems like it'd add some memory overhead compared to just writing out the parts from the memory mapped inputs into the right offsets in the output. (at least based on my limited experience implementing the DWP tool - which mostly just wants to concatenate sections together) </div></div></div></div></blockquote><div><br></div></div></div><div>If all we have to do is to just copy data from files to an output file, then non-mmap'ed IO may be faster because I guess it involves fewer context switches between the user and the kernel. But we need to apply relocations after copying data, and relocation application is basically a random accesse to an output section.</div></div></div></div></blockquote><div><br></div></div></div><div>Sorry, I'm probably lacking enough context to contribute correctly to this conversation - trying to piece it together.<br><br>It sounded like what you were saying is that you form the output section, complete with applied relocations, in memory (in a new or malloc'd buffer) first, then write that out. I found this surprising because of the memory overhead it would incur.<br><br>Instead you could (perhaps you do - and I just misunderstood above) walk the inputs, figure out how big the resulting section would be (& keep a list of where the chunks of it came from/what offsets they should be written to) then just read/write from input section to the right part of the output section. For relocations, you could then walk into the memory mapped output & rewrite the desired bytes (or you could copy the contiguous fragments from the inputs that appear between the relocations - then the relocations themselves)<br></div></div></div></div></blockquote><div><br></div></div></div><div>That is exactly what we are currently doing. Sorry if my description was not clear or misleading. We copy section contents from mmap'ed input files to a mmap'ed output file, and after that we fix up desired bytes in the mmap'ed output section.</div><div><br></div><div>What I was suggesting in this review comment is this. In order to create a compressed section, this patch creates an in-memory, non-mmap'ed buffer whose size is the same as the entire output section. It then writes all debug section into it and compress it. That seems a bit costly as we don't actually have to create that large buffer. Each input section knows how to copy itself to a specified memory address (an input section is agnostic about whether the destination is mmap'ed region or not) and how to apply relocations to the copied contents. The two are actually just one operation in LLD -- which is InputSection::writeTo(uint8_t *Dest) function. Therefore, we can call writeTo() for each input section with a smaller buffer, append the buffer to a zlib stream, and then reuse the buffer for the next section.</div></div></div></div></blockquote><div><br></div><div>Ah, OK, thanks for the details, that all makes sense.<br><br>(& at some point, if we got fancy, InputSection::writeTo(uint8_t *Dest) could be changed to be optionally stream-like so it could be used to stream out to a compressor and avoid the need to manifest the entire input section in a new buffer - but that'd be an optimization to try later when someone decided to get around to it)</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div class="h5"><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>To bring this back around to the original issue - this would make linking compressed debug info better/good. If you have to concatenate all the debug info together into one big buffer before writing it out that seems unfortunate compared to having the separate decompressed fragments (I suppose you could decompress them all into the same buffer - so there might not be any memory overhead, but you'd need a larger contiguous chunk of memory for it). This becomes even clearer in a non-compressed scenario - where the output is just the concatenation of bits of the input without relocations in much of the data. Having to form the entire output content in a memory buffer first would seem like a lot of overhead to me. (again, this is based on my experience trying to improve the memory overhead of the DWP tool which may not be entirely applicable to general purpose linking for a bunch of reasons)<br></div><div><div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><span style="color:rgb(80,0,80)"> </span></div><div><div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> We are using mmap'ed IO, and in the first step we copy input sections to output sections and then apply relocations to the copied data. That process treats section contents as a byte array.<br></div><span><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>FWIW, I think Google (using gold) mostly just compresses the /input/, but doesn't compress the output on the link. So that's a scenario we care about - I don't know if that particular combination affects the design in any particular way, though.</div></div></div></div></blockquote><div><br></div></span><div>Interesting. I think we don't want to support a feature if it has no user. Maybe we should support decompressing debug sections, at least as the first step?</div></div></div></div></blockquote><div><br></div></span><div>Seems like a good first step to me, but I've no idea what users the author of this patch is trying to support, or what kind of users are out there, etc.</div><div><div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
================<br>
Comment at: ELF/OutputSections.cpp:869-874<br>
@@ +868,8 @@<br>
+<br>
+ // If not successfully compressed or total size is greater than<br>
+ // size before applying compression, then cancel compressed output.<br>
+ size_t TotalSize = Compressed.size() + getCompressionHeaderSize<ELFT>();<br>
+ Compress = Success == zlib::Status::StatusOK && TotalSize < getSize();<br>
+ if (!Compress)<br>
+ return;<br>
+<br>
----------------<br>
I think we don't have to care about this edge case. Unless you are compressing random bits, it usually shrink.<br></blockquote><div><br></div></span><div>Because the compression is per section & some parts of DWARF are pretty compact, not everything is improved by compression - but it takes pretty small samples of debug info to not compress, it's true. I did implement behavior like this (skip compression when not beneficial) in LLVM's integrated assembler support for compressed debug info.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div><div>
<br>
================<br>
Comment at: ELF/OutputSections.cpp:1014-1025<br>
@@ +1013,14 @@<br>
+ const endianness E = ELFT::TargetEndianness;<br>
+ if (Config->CompressDebugSections == "zlib-gnu") {<br>
+ // 4 bytes magic + size of uncompressed data in big endian.<br>
+ memcpy(Buf, "ZLIB", 4);<br>
+ write64be(Buf + 4, UncompressedSize);<br>
+ } else {<br>
+ // "zlib-gabi" or "zlib".<br>
+ write32<E>(Buf, ELFCOMPRESS_ZLIB);<br>
+ if (ELFT::Is64Bits)<br>
+ write64<E>(Buf + 2 * sizeof(Elf64_Word), UncompressedSize);<br>
+ else<br>
+ write32<E>(Buf + sizeof(Elf32_Word), UncompressedSize);<br>
+ }<br>
+<br>
----------------<br>
Why we have two types of headers? If one is obsolete, we probably don't want to implement it.<br>
<br>
================<br>
Comment at: ELF/Writer.cpp:1161-1163<br>
@@ -1158,2 +1160,5 @@<br>
<br>
template <class ELFT><br>
+static bool isCompressionCandidate(InputSectionBase<ELFT> *C,<br>
+ StringRef OutSecName) {<br>
+ if (Config->CompressDebugSections == "none")<br>
----------------<br>
You can move this to OutputSection class, no?<br>
<br>
================<br>
Comment at: ELF/Writer.cpp:1172<br>
@@ +1171,3 @@<br>
+ return false;<br>
+ return zlib::isAvailable();<br>
+}<br>
----------------<br>
This condition should be at the beginning of this function.<br>
<br>
<br>
<a href="http://reviews.llvm.org/D20211" rel="noreferrer" target="_blank">http://reviews.llvm.org/D20211</a><br>
<br>
<br>
<br></div></div>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>
</blockquote></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
</blockquote></div><br></div></div>