[PATCH] D42222: [llvm-objcopy] Refactor llvm-objcopy to use reader and writer objects

Mon Jan 22 11:56:23 PST 2018

jakehehrlich added inline comments.

================
Comment at: tools/llvm-objcopy/Object.cpp:1060
+  BufPtr = std::move(*BufferOrErr);
+  SecWriter = llvm::make_unique<SectionWriter<ELF64LE>>(*BufPtr);
 }
----------------
jhenderson wrote:
> jakehehrlich wrote:
> > jhenderson wrote:
> > > This line looks a little dodgy. Doesn't that mean we'll write ELF64LE sections to binary output, no matter what the input was?
> > Ah yes, I remember having this dilemma before pushing it to the side in favor of progress. I was thinking about splitting up SectionWriter into a non-ELFT-templated base class and a separate ELFT-template base class and then *also* a binary section writer (that is non-ELFT-templated) that throws errors for the methods that need ELFT to output correctly. How does that sound?
> I think that would make things a little clearer, yes. I'd like to see which methods you're referring to that should throw errors, and why we don't need them for a binary section writer though.
Non-allocated relocations, SHT_SYMTAB, and GnuDebugLinkSections (which are non-allocated as well)  are the only ones that need to be written as such. There are many section types that should never make into the binary output that these methods would not throw errors on (string tables for instance) but there are no cases in which the above 3 sections should be output for -O binary. That way if this ever comes up and we get an error it either means a) the code is messed up or b) someone marked .symtab or .gnu-debuglink as allocated.

================
Comment at: tools/llvm-objcopy/Object.h:503-505
+  // We keep this around so that Objects can be constructed from objects owned
+  // by the binary object and they won't be deallocated.
+  OwningBinary<Binary> Binary;
----------------
jhenderson wrote:
> jakehehrlich wrote:
> > jhenderson wrote:
> > > This will require us to always have the ELFReader allocated until we write, which seems a little less than ideal?
> > I want to keep it so that the data between the Object and the Binary is not duplicated. I do agree that it doesn't seem like the Reader shouldn't *have* to stay open the whole time but I'd like the Binary to stay open the whole time (until all output files have been written at least). I think this can be remedied in two opposite ways. One way is for the Reader to maintain ownership of the Objects so that the Binary and the Objects that reference it's data have a shared lifetime. The other idea is that this could perhaps be decoupled and the Reader could store a reference to the Binary. It then becomes the user's job to make sure that the Bianry stays open as long as any object created by the Reader. Right now it's (oddly) the user's job to make sure that the Reader stays open as like as any object created by it. I agree that's wrong.
> > 
> > What are your thoughts?
> I'm actually wondering whether we could transfer ownership of either the Reader, or just the Binary associated with it to the Object. That way the original Reader can be disposed of at any point after creating the Object. Unfortunately, this doesn't play particularly well with multiple input types. Of the two ideas you suggested I'd prefer that Readers have ownership of their Objects.
I wonder if there's a way to use reference counting here? I can't find a way to make a binary shared_ptr but that would be the time to use it if there ever was one. I played around with Readers owning Objects though and it can be enforced really well using references. Basically objects are stored in a vector and references to them are returned on creation (I haven't considered how collection of these objects should occur). If the ELFBuilder (which is owned by the ELFReader) owns the binary then we don't even need to use unique pointers to allocate objects (they can just be allocated in the vector).

Repository:
  rL LLVM

https://reviews.llvm.org/D42222