[PATCH] D88827: [llvm-objcopy][NFC] Move core implementation of llvm-objcopy into separate library.

Mon Oct 12 01:58:35 PDT 2020

jhenderson added a comment.

In D88827#2322005 <https://reviews.llvm.org/D88827#2322005>, @echristo wrote:

> I like the direction this is going, I'll take more of a deep look soon, but wanted to ask: "should this be in Object rather than a separate library?" When I'd originally asked for this to be split into its own library I'd thought that it would get added into libobject.
>
> Thoughts?
>
> -eric

One of the ideals would be to have writable versions of the various ObjectFile classes defined by the Object library. I know this was worked on last year as part of GSOC by @abrachet, but it didn't really end up getting to a usable point. I kind of like the separation of concerns here - the proposed Objcopy library would handle manipulation of object files, whilst the Object library is primarily for parsing and inspecting them. The former builds on the latter, but the latter doesn't need to care about the former. Thus a user who wrote an object dumping tool wouldn't need the Objcopy library. However, I don't have a strong opinion on this, so if a good design could be presented to resolve that, I'd be happy.

In D88827#2319015 <https://reviews.llvm.org/D88827#2319015>, @alexshap wrote:

> I have some general comments / concerns (in addition to the inline comment).
> The interface of the library is important and once it's committed and people start using the library in multiple places it might be harder to make changes / fix issues
> (unfortunately this has already happened in LLVM a few times in the past) .

Could we maybe put a big note at the top of the library header saying that the API is still work-in-progress and shouldn't be used, until we've got to the final point? I feel like moving the code first is a good step and then we can iterate on the design once it's in place (I agree that CopyConfig probably needs more work).

================
Comment at: llvm/include/llvm/ObjCopy/Buffer.h:23-24
 // abstract interface and doesn't depend on a particular implementation.
 // TODO: refactor the buffer classes in LLVM to enable us to use them here
 // directly.
 class Buffer {
----------------
avl wrote:
> avl wrote:
> > jhenderson wrote:
> > > Maybe as part of a separate patch, it would be worth taking a look at this TODO. It would be great if the Buffer could be removed from the library API and generic LLVM buffers used instead (for example an in-memory buffer or a file buffer, depending on what people want to do).
> > agreed, but I think it is better to do in separate patch.
> @alexshap @jhenderson 
> 
> Speaking of class Buffer refactoring. I do not think that this comment is fully valid:
> 
> 
> ```
> // The class Buffer abstracts out the common interface of FileOutputBuffer and
> // WritableMemoryBuffer so that the hierarchy of Writers depends on this
> // abstract interface and doesn't depend on a particular implementation.
> // TODO: refactor the buffer classes in LLVM to enable us to use them here
> // directly.
> ```
> 
> It suggests to create some common interface for FileOutputBuffer and WritableMemoryBuffer.
> Which is assumed to look similar to this: 
> 
> 
> ```
> class Buffer {
>   StringRef Name;
> 
> public:
>   virtual ~Buffer();
>   virtual Error allocate(size_t Size) = 0;
>   virtual uint8_t *getBufferStart() = 0;
>   virtual Error commit() = 0;
> 
>   explicit Buffer(StringRef Name) : Name(Name) {}
>   StringRef getName() const { return Name; }
> };
> 
> ```
> There exists a problem with methods commit() and allocate(). commit() is a redundant for WritableMemoryBuffer.
> adding it to WritableMemoryBuffer would require to patch all current usages of WritableMemoryBuffer.
> So it looks incorrectly to use it for common parent interface of FileOutputBuffer and WritableMemoryBuffer.
> allocate() suggests another way of buffer creation. Currently, buffers are created by static creation methods:  
> 
> 
> ```
> static std::unique_ptr<WritableMemoryBuffer>
> getNewMemBuffer(size_t Size, const Twine &BufferName = "");
> 
> static Expected<std::unique_ptr<FileOutputBuffer>>
> create(StringRef FilePath, size_t Size, unsigned Flags = 0);
> ```
> 
> Adding "virtual Error allocate(size_t Size)" would lead to creation empty buffer by static creation method 
> and then call to allocate(). This does not seem a good addition to the already existed FileOutputBuffer and WritableMemoryBuffer. FileOutputBuffer and WritableMemoryBuffer assume another use cases than Buffer.
> 
> Actually, what is neccessary by objcopy is method:
> 
> 
> ```
> uint8_t *createBuffer ( size_t Size ); 
> ```
> 
> All other functionality is redundant and could be removed from objcopy.
> 
> what do you think about following design?
> 
> 
> ```
> LazyBuffer {
>   StringRef Name;
> 
>   virtual StringRef getName() const { return Name; }
>   virtual uint8_t *createBuffer ( size_t Size ) = 0;
> };
> 
> MemoryLazyBuffer : public LazyBuffer {
>   virtual uint8_t *createBuffer ( size_t Size ) {
>     Buffer = WritableMemoryBuffer::getNewMemBuffer(Size, Name);
>     return Buffer->getBufferStart();
>   }
> 
>   std::unique_ptr<WritableMemoryBuffer> Buffer;
> };
> 
> FileLazyBuffer : public LazyBuffer {
>   virtual uint8_t *createBuffer ( size_t Size ) {
>     Buffer = FileOutputBuffer::create(Name, Size);
>     return Buffer->getBufferStart();
>   }
> 
>   std::unique_ptr<FileOutputBuffer> Buffer;
> };
> ```
> 
> Usage:
> 
> 
> ```
> static Error executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In,
>                                     LazyBuffer &Out);
> 
> 
> MemoryLazyBuffer MB("name");
> executeObjcopyOnBinary(Config, Input, MB);
> 
> FileLazyBuffer MB("name");
> executeObjcopyOnBinary(Config, Input, MB);
> if (MB.Buffer)
>   MB.Buffer->commit();
> 
> ```
> 
> 
> Another alternative is that library always writes to general MemoryBuffer :
> 
> static Expected<std::unique_ptr<MemoryBuffer>> executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In);
> 
> and later this MemoryBuffer would be written into the file by llvm-objcopy.cpp.
> 
> What do you think?
> 
I'm not sure I've looked at the intricacies of the different buffer classes to know what the right approach is. However, if modifying the existing buffers seems like it won't work/will be too invasive, this seems like a fair approach (it's the "Adapter" design pattern in action, I believe).

I'm not sure I'd call it `LazyBuffer`, although I don't have a specific better name (maybe just ObjcopyBuffer, depending on how generic we want it to be). Also, unless the objcopy code actually needs the name for anything (aside from error messages, I'm not sure what that would be), I'd not include that in the interface.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88827/new/

https://reviews.llvm.org/D88827