[PATCH] D50117: [llvm-objcopy] NFC: Refactor main objcopy method that takes an ELFReader to a generic Reader.

Jake Ehrlich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 2 11:59:52 PDT 2018


jakehehrlich added a comment.

Speaking of that conflict I mentioned, it won't happen! So we're good to go. This change more or less LGTM. Please resolve Alex's comments however.



================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:206
 static std::unique_ptr<Writer> CreateWriter(const CopyConfig &Config,
-                                            Object &Obj, Buffer &Buf,
-                                            ElfType OutputElfType) {
+                                            const Reader &Reader, Object &Obj,
+                                            Buffer &Buf) {
----------------
jhenderson wrote:
> jakehehrlich wrote:
> > rupprecht wrote:
> > > jhenderson wrote:
> > > > I feel like @jakehehrlich, @alexshap and I had a discussion about this a number of months ago, but I don't really remember what we agreed etc. My current instinct is that a Writer (and the functions used to create it) shouldn't need to know about anything to do with the Reader. This change couples the two together, which I don't really like. I'd rather it be decided outside the CreateWriter function and passed in (as was the case before).
> > > > 
> > > > If the problem is the assumption of needing an "ElfType", maybe we should generalise that enum to not be specific to do with ELFs?
> > > > I feel like @jakehehrlich, @alexshap and I had a discussion about this a number of months ago, but I don't really remember what we agreed etc. 
> > > I've read through the comments at D41687 (Jake pointed me to that one as a starter for this change), but let me know if there's some other discussion thread you had that I should read first.
> > > 
> > > > My current instinct is that a Writer (and the functions used to create it) shouldn't need to know about anything to do with the Reader. This change couples the two together, which I don't really like. I'd rather it be decided outside the CreateWriter function and passed in (as was the case before).
> > > That's true for CreateWriter itself, but currently ExecuteElfObjcopyOnBinary (which is constructs a writer via CreateWriter) has to look at the ELFReader it has created so it knows what elf type to pass.
> > > 
> > > Also -- this problem could be pushed back up another layer, i.e. Writer could be passed into ExecuteElfObjcopyOnBinary instead of ExecuteElfObjcopyOnBinary calling CreateWriter itself... although there is currently another chain of ExecuteElfObjcopyOnBinary -> HandleArgs -> SplitDWOToFile -> CreateWriter that would also need to be plumbed.
> > > 
> > > > 
> > > > If the problem is the assumption of needing an "ElfType", maybe we should generalise that enum to not be specific to do with ELFs?
> > > > 
> > > 
> > > Reading from a binary input should still have an elf type, it will just come from the -B command line flag, instead of being inferred from the object file magic header.
> > > 
> > > However -- refactoring to your suggestion so that CreateWriter doesn't need to know about Reader, I've chosen to change it to Optional<ElfType>. If the input and output are both "binary" (I'm not sure if that's a valid use of objcopy, but it's at least allowed by command line flags), then there is no elf type.
> > My feeling is that the reader and writer should be wholly independent but that you might use information from the reader (or other sources) in order to select which writer you use by default. For instance it should be the case that the same output writer type is chosen for the same input reader by default. If `-I binary` is used  however that goes out the window and we force the user to specify. Indeed if the user specifies anything we should listen to the user.
> > 
> > For instance I thought of a use case for converting from ELF64 to ELF32 (which is something I previously said would never happen). The multi boot specification specifies how to load ELF32 but not ELF64 but all existing tools only support producing ELF64 binaries for 64-bit architectures. Generally this is solved by creating a small 32-bit binary that contains the 64-bit binary as a blob (a la `-O binary` in most cases) but I *think* there's no real reason you can't just convert a 64-bit binary (containing only data and program headers and some meaningless sections like .text) and directly place it in a 32-bit binary. This would work by having the user specify `-O elf32-x86_64` or something like that.
> > 
> > I like passing `Optional<ElfType>` as a means of accomplishing this.
> > I've read through the comments at D41687 (Jake pointed me to that one as a starter for this change), but let me know if there's some other discussion thread you had that I should read first.
> 
> It was a fairly length email chain we had between ourselves off-list. Assuming @jakehehrlich and @alexshap are happy, I see no reason not to forward it to you.
Forwarding it is cool with me The discussion was about a more major refactor. It was a big sticking point of mine that reading and writing not be intertwined but this doesn't rise to that level. The conclusion was a whole new Reader and Writer system where we would make something like a small copy of ELFTypes and then we would copy our non-tempted generic versions of ELFTypes to the libObject ELFTypes but only in the reader and writer. The sections themselves would then take a Writer in their write method and...uh I can't actually remember how the Reader worked but it basically translated libObject's ELFTypes into an internal version of those things (that could later be written using Writer).

There was also agreement on using a kind of internal dependency manager to resolve initialization order issues.


================
Comment at: tools/llvm-objcopy/llvm-objcopy.cpp:575
 
-static void ExecuteElfObjcopyOnBinary(const CopyConfig &Config, Binary &Binary,
+static void ExecuteElfObjcopyOnBinary(const CopyConfig &Config, Reader &Reader,
                                       Buffer &Out) {
----------------
rupprecht wrote:
> jakehehrlich wrote:
> > rupprecht wrote:
> > > jakehehrlich wrote:
> > > > This change (which is not necessarily wrong) conflicts with one of the two proposals I made for how --dump-section should work in another change. After a bit of thought I think "ExecuteElfObjcopyOnBlob" might be needed or something. The --dump-section change hasn't really shaken out just yet so we'll see.
> > > I renamed this to ExecuteElfObjcopyOnBlob since it's no longer always a binary -- is that what you meant?
> > > Do you have a link to your proposals on --dump-section?
> > No I meant for ExecuteElfObjcopyOnBinary to remain unchanged and to not take a Reader at all but instead for a separate almost identical function  ExecuteElfObjcopyOnBlob to exist so that current invocations to ExecuteElfObjcopyOnBinary would have to decide which one to call. The issue is that sometimes there are options (like --dump-section) which only really make sense on the original data. Depending on how that all shakes out we might want to handle that option differently depending on if -I binary or -I elf* was used. One of the options (and the option I'm currently in favor of) for that change would allow us to treat this all uniformly however.
> Sorry, I'm still not following. Do you mean this method should be duplicated, once for ELF binary, once for non-ELF binaries?
> 
> ```
> // Original method, always handles ELF binary input
> static void ExecuteElfObjcopyOnBinary(const CopyConfig &Config, FileBuffer &Binary,
>                                       Buffer &Out) {
>   ELFReader Reader(&Binary);
>   std::unique_ptr<Object> Obj = Reader.create();
> 
>   HandleArgs(Config, *Obj, Reader, Reader.getElfType());
>   // ... maybe some special logic only for ELF binaries ...
> 
>   std::unique_ptr<Writer> Writer =
>       CreateWriter(Config, *Obj, Out, Reader.getElfType());
>   Writer->finalize();
>   Writer->write();
> }
> 
> // Called for -I Binary
> static void ExecuteElfObjcopyOnBlob(const CopyConfig &Config, MemoryBuffer &Blob,
>                                     Buffer &Out) {
>   BinaryReader Reader(&Blob);
>   std::unique_ptr<Object> Obj = Reader.create();
> 
>   ElfType OutputElfType = <... get ELFType from -B flag ...>
>   HandleArgs(Config, *Obj, Reader, ET, OutputElfType);
>   // ... maybe some special logic only for non-ELF binaries ...
> 
>   std::unique_ptr<Writer> Writer =
>       CreateWriter(Config, *Obj, Out, OutputElfType);
>   Writer->finalize();
>   Writer->write();
> }
> ```
Yeah that's what I had in mind but don't do that just yet. Also that might all get templated out in some manner as well. I was just pointing out a potential future issue.


Repository:
  rL LLVM

https://reviews.llvm.org/D50117





More information about the llvm-commits mailing list