[PATCH] D49678: [llvm-objcopy] Adding support for compressed DWARF debug sections.

Mon Aug 27 15:13:43 PDT 2018

jakehehrlich added inline comments.

================
Comment at: tools/llvm-objcopy/Object.h:336
+protected:
+  std::string OwnedName;
   std::vector<uint8_t> Data;
----------------
plotfi wrote:
> jakehehrlich wrote:
> > plotfi wrote:
> > > jakehehrlich wrote:
> > > > plotfi wrote:
> > > > > jakehehrlich wrote:
> > > > > > So it turns out that we want to make names std::string instead of StringRefs anyway so this is no longer needed. (two other changes also needed this). That will likely be landing soon.
> > > > > ETA?
> > > > I think that's happening in this change: https://reviews.llvm.org/D50463
> > > > 
> > > > I think Paul is finished with GSoC so it's up to Paul. You can go ahead and use code from that change in this change so that the rebase will be minimal.
> > > As far as I can tell, this is not comitted yet. Should I wait for this to land or take it from the diferential you sent?
> > Given that Paul hasn't worked out that and how close this patch is to getting an LGTM from me I think you should be the one to make that change in this patch.
> I took a look at this patch (D50463). As far as I can tell the one addition is the StartsWith method. I am not sure we need it here unless you have a big preference to add it.
Sound fine to me. I think we use it enough places and will continue to use it that it might be helpful to have a function that does something like that for us. If you want to use StringRef(...).startswith(...) I'm also fine with that for now.

================
Comment at: tools/llvm-objcopy/Object.h:369
+public:
+  CompressedSection(const SectionBase &Sec,
+                    DebugCompressionType CompressionType)
----------------
plotfi wrote:
> jakehehrlich wrote:
> > plotfi wrote:
> > > jakehehrlich wrote:
> > > > The Size needs to be set in the constructor. Prior to all the fancy changes that people are making now. Most Sizes could be quite trivially determined. Size is used in layout. Because I didn't foresee Size needing to be determined dynamically for different systems, it's just a single variable when it really needs to be a virtual function that takes some magical argument that will allow it to know the proper size for a given output format. The issues here haven't been fully fleshed out but I think a solution that Alex and James proposed to solve other issues and not just this one)could be made to solve this. For the time being, the best option is to use an upper bound. It sucks; I dislike it, but I don't have a better recommendation.
> > > > 
> > > > I should finally get the time I need to refactor this in a couple weeks. I should be able to solve a lot of these overarching issues then.
> > > > 
> > > > In the meantime, I think this needs to compress the data within the section so that it can get a reasonable upper bound on the size.
> > > Oh you're saying that CompressedSection::Size should be the compressed size, not the decompressed size??? At the moment I am doing compression in the visitor. 
> > > 
> > > I still dont understand why size needs to be passed as a constructor parameter. Isn't that going to come from the Section I am passing in anyways?
> > Yep. To be precise. It's the field that becomes st_size in the section header so it has to be correct for that. Unfortunately what st_size is is not simply dependent on the size of compressed data but also on the size of of Chdr which you can't actually know the way things are. The current contract states that Size is the one field which must be correct at all times. This is because layout has to happen prior to finalization but layout needs to know the size. Thinking about it now, alignment actually must obey the same contract but that was less obvious since alignment is normally just unmodified.
> > 
> > There are two sizes that matter here, the size of the original data and the size after compression plus the header and magic string. Overestimating is acceptable, but not ideal.
> > 
> > You don't need to pass in the size of the original data since you already pass in the section and the OriginalData field contains that information. You *can't* pass in the compressed size because you don't yet know that information. Since Size must be set correctly on the exit of each method that also means that size must be correct after construction. So I don't really see how to get around compressing the data in the constructor. Plus I think that helps solve the problem of minimizing the number of copies that have to be made (namely, you can avoid ever copying the data)
> Problem is, I cant exactly get the new size without the ELFT template parameter. I could always over estimate based on the upper-bound of the compressed size. 
Right. I've been trying to call attention to that as a problem of how Size currently works. This is a fundamental design issue I overlooked that I plan to fix (and should be able to in a couple weeks). Overestimating by the difference between 32-bit and 64-bit Chdr isn't a huge loss. Overestimating by the difference between compressed and uncompressed is something I'd like to avoid.

Repository:
  rL LLVM

https://reviews.llvm.org/D49678