[PATCH] D84858: [llvm-libtool-darwin] Refactor ArchiveWriter

Mon Aug 3 12:03:34 PDT 2020

sameerarora101 added inline comments.

================
Comment at: llvm/lib/Object/ArchiveWriter.cpp:673-674
+
+  ErrorOr<std::unique_ptr<MemoryBuffer>> Ret =
+      MemoryBuffer::getOpenFile(Temp->FD, ArcName, -1);
+  if (std::error_code EC = Ret.getError())
----------------
smeenai wrote:
> jhenderson wrote:
> > sameerarora101 wrote:
> > > smeenai wrote:
> > > > smeenai wrote:
> > > > > sameerarora101 wrote:
> > > > > > My current approach involves factoring out the common code of writing the archive to a temporary FD. Now, in case of `writeArchive`, I simply write the temporary file to disk by invoking `Temp->keep(ArcName)`. However, for `writeArchiveBuffer`, I read the archive again in a `MemoryBuffer` using 
> > > > > > ```
> > > > > >   ErrorOr<std::unique_ptr<MemoryBuffer>> Ret =
> > > > > >       MemoryBuffer::getOpenFile(Temp->FD, ArcName, -1);
> > > > > > ```
> > > > > > This memory buffer is then put in `ReturnBuffer` (which is the buffer allocated by the caller to receive the archive buffer).
> > > > > > 
> > > > > > Please let me know if there is a better approach to get the archive in a buffer than this current roundtrip method(writing to a FD and then reading back into a MemBuf).  Thanks!
> > > > > We might be able to do something with raw_svector_ostream. I'll experiment a bit.
> > > > I played with this a bit and got something working, but I'm not convinced it's better. My idea was to make `writeArchiveInFD` take a `raw_ostream &Out` parameter. All the temp file handling shifts into `writeArchive`, which opens the temp file, creates a `raw_fd_ostream` from it, and passes that to `writeArchiveInFD`. `writeArchiveBuffer` takes a `SmallString<0> &` paramater, creates a `raw_svector_ostream` from that, and passes it to `writeArchiveInFD`. It then uses `MemoryBuffer::getMemBuffer` to create a MemoryBuffer from the SmallString and returns that.
> > > > 
> > > > The problem is that the `MemoryBuffer` then doesn't own the memory it's pointing to, which doesn't fit very well with libObject's `OwningBinary` model. You have to manage the lifetime of the SmallString (which actually owns the data) separately, and make sure the MemoryBuffer doesn't outlive it. (You also have to potentially manage multiple such SmallStrings, once for each slice.) We could use `MemoryBuffer::getMemBufferCopy` to have the MemoryBuffer own the contents, but then it's doing a copy, which isn't a whole lot better than what you have here. (It's at least avoiding the temp file, but filesystem caches are also a thing, so I'm not sure how much that ends up mattering in practice.)
> > > Ooh makes sense, thanks a lot for explaining in such detail. 
> > Honestly, I was a bit surprised to see `writeArchiveBuffer` messing about with temp files. The streaming approach would have been much closer to what I expected. That being said, if it's complicated to implement, I'm willing to let it through.
> I just learned about the existence of `SmallVectorMemoryBuffer`, which seems perfect for our use case :)
> 
> @sameerarora101, we can have `writeTemporaryArchive` take a `raw_ostream &Out` parameter. The temporary file creation and handling shifts into `writeArchive`. `writeArchiveBuffer` creates a `SmallVector<char, 0>` and a `raw_svector_ostream` from it, and then returns a `SmallVectorMemoryBuffer` created from its `SmallVector`. That should allow us to be efficient and also avoid any memory management issues :)
oh wow, I'll try this out. Thanks a lot!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84858/new/

https://reviews.llvm.org/D84858