[LLVMdev] [RFC] llvm/include/Support/OutputBuffer.h

Gordon Keiser gkeiser at arxan.com
Tue May 8 16:01:30 PDT 2012


Hi,

Yes. It could be done fairly easily with memory-mapped files, using MapViewOfFile and FlushViewOfFile, which would probably be the most efficient approach for this type of buffered access. I'd have to do some speed tests to be sure, though. I'll begin playing around with it soon (probably this weekend, work and all that) and try to determine whether a single streamed write on flush or a memory map ends up being faster.

Cheers,
Gordon

From: Nick Kledzik [mailto:kledzik at apple.com]
Sent: Tuesday, May 08, 2012 6:41 PM
To: Gordon Keiser
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] [RFC] llvm/include/Support/OutputBuffer.h


On May 8, 2012, at 3:52 AM, Gordon Keiser wrote:


FWIW, I'd be interested in working on the Windows implementation.   I've been knee-deep in *nixes lately and wouldn't mind the refresher.   :)
Cool!

Does my proposed interface make sense to implement on top of Windows APIs?

-Nick



From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Nick Kledzik
Sent: Monday, May 07, 2012 3:57 PM
To: LLVM Developers Mailing List
Subject: [LLVMdev] [RFC] llvm/include/Support/OutputBuffer.h

For the reasons listed in my 03-May-2012 email, I am proposing a new llvm/Support class for use in writing binary files:

/// OutputBuffer - This interface provides a simple way to create an in-memory
/// buffer which, when done, will be written to a file. During the lifetime of
/// these objects, the content or existence of the specified file is undefined.
/// That is, creating an OutputBuffer for a file may immediately remove the
/// file.
/// If the OutputBuffer is committed, the target file's content will become
/// the buffer content at the time of the commit.  If the OutputBuffer is not
/// committed, the file will be deleted in the OutputBuffer destructor.
class OutputBuffer {
public:
  enum Flags {
    F_executable = 1, ///< set the 'x' bit on the resulting file
  };

  /// Factory method to create an OutputBuffer object which manages a read/write
  /// buffer of the specified size. When committed, the buffer will be written
  /// to the file at the specified path.
  static error_code createFile(StringRef filePath, Flags flags, size_t size,
                               OwningPtr<OutputBuffer> &result);


  /// Returns a pointer to the start of the buffer.
  uint8_t *bufferStart();

  /// Returns a pointer to the end of the buffer.
  uint8_t *bufferEnd();

  /// Returns size of the buffer.
  size_t size();

  /// Flushes the content of the buffer to its file and deallocates the
  /// buffer.  If commit() is not called before this object's destructor
  /// is called, the file is deleted in the destructor. The optional parameter
  /// is used if it turns out you want the file size to be smaller than
  /// initially requested.
  void commit(int64_t newSmallerSize = -1);
};


The Flags will probably need to be extended over time to handle other clients' needs.

For Unix/Darwin, my plan is to implement this by:
1) delete the file
2) create a new file with a random name in same directory
3) truncate the file to the new size
4) mmap() in the file r/w
5) On commit, unmap the file, rename() to final name
6) In destructor, if not committed, unmap, delete the randomly named file
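
To make the six steps above concrete, here is a minimal POSIX sketch. It is not the proposed llvm/Support implementation: the class name TempMappedFile, its members, the ".tmpXXXXXX" suffix, and the fixed 0755/0644 modes are all assumptions made for illustration, and a real version would report error_code values and honor the umask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper, named for this sketch only.
class TempMappedFile {
public:
  TempMappedFile() : start_(nullptr), size_(0), executable_(false) {}
  TempMappedFile(const TempMappedFile &) = delete;
  TempMappedFile &operator=(const TempMappedFile &) = delete;

  // Steps 1-4: delete the old file, create a randomly named file in the
  // same directory, truncate it to the requested size, and mmap it r/w.
  // A size of 0 fails here because mmap() rejects zero-length mappings.
  bool open(const std::string &finalPath, size_t size, bool executable) {
    finalPath_ = finalPath;
    size_ = size;
    executable_ = executable;
    ::unlink(finalPath.c_str());                  // step 1 (ENOENT is ignored)
    std::string tmp = finalPath + ".tmpXXXXXX";   // step 2
    int fd = ::mkstemp(&tmp[0]);
    if (fd < 0)
      return false;
    tempPath_ = tmp;
    void *p = MAP_FAILED;
    if (::ftruncate(fd, static_cast<off_t>(size)) == 0)        // step 3
      p = ::mmap(nullptr, size, PROT_READ | PROT_WRITE,        // step 4
                 MAP_SHARED, fd, 0);
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) {
      ::unlink(tempPath_.c_str());
      tempPath_.clear();
      return false;
    }
    start_ = static_cast<uint8_t *>(p);
    return true;
  }

  uint8_t *data() { return start_; }
  size_t size() const { return size_; }

  // Step 5: unmap, set permissions, and rename() to the final name.
  bool commit() {
    if (!start_)
      return false;
    ::munmap(start_, size_);
    start_ = nullptr;
    // Simplification: fixed modes; a real implementation would apply umask.
    ::chmod(tempPath_.c_str(), executable_ ? 0755 : 0644);
    bool ok = ::rename(tempPath_.c_str(), finalPath_.c_str()) == 0;
    if (!ok)
      ::unlink(tempPath_.c_str());
    tempPath_.clear();
    return ok;
  }

  // Step 6: if never committed, unmap and delete the randomly named file.
  ~TempMappedFile() {
    if (start_)
      ::munmap(start_, size_);
    if (!tempPath_.empty())
      ::unlink(tempPath_.c_str());
  }

private:
  std::string finalPath_, tempPath_;
  uint8_t *start_;
  size_t size_;
  bool executable_;
};
```

A caller would call open(), write into data(), and then call commit(). Because the temporary file lives in the same directory as the target, the rename() stays on one file system and replaces the final name atomically.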

I'll leave the Windows implementation empty and let someone with Windows experience do the implementation.

Comments? Suggestions?

-Nick


On May 3, 2012, at 6:10 PM, Nick Kledzik wrote:
Existing llvm code tends to use raw_ostream for writing files.  But raw_ostream is not a good match for a linker for a couple of reasons:

1) When the linker creates an executable, the file needs the 'x' bit set.  Currently raw_fd_ostream has no way to set that.

2) The Unix conformance suite actually has some test cases where the linker is run and the output file exists but is not writable, or is not writable but is in a writable directory, or with funky umask values. The raw_fd_ostream interface has no way to match those semantics.

3) On Darwin we have found the linker performs better if it opens the output file, truncates it to the output size, then mmaps in the file, then writes directly into that memory buffer.  This avoids the memory copy from the private buffer to the OS file system buffer in the write() syscall.

4) In the model we are using for lld, a streaming output interface is not optimal.   Currently, lld copies chunks of code from the (read-only) input files, to a temporary buffer, then applies any fixups (relocations), then streams out that temporary buffer.  If instead we had a big output buffer, the linker could copy the code chunks directly to the output buffer and apply the fixups there, avoiding an extra copy.
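
As a rough illustration of point 4, the sketch below copies each chunk straight into one large output buffer and patches the fixups in place, so no temporary per-chunk buffer is needed. The Chunk and Fixup types and the layoutInto function are hypothetical names for this example, not lld's actual data structures, and the fixup is simplified to storing a 32-bit value in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical types for illustration only.
struct Fixup {
  size_t offsetInChunk;  // where within the chunk to patch
  uint32_t value;        // 32-bit value to store at that offset
};

struct Chunk {
  const uint8_t *input;       // read-only bytes from an input file
  size_t size;                // number of bytes to copy
  size_t outputOffset;        // position assigned in the output buffer
  std::vector<Fixup> fixups;  // patches to apply after the copy
};

// Copies every chunk directly to its final position in `out`, then applies
// its fixups there. Returns false if a chunk or fixup would fall outside
// the buffer; chunks processed before the failure remain written.
bool layoutInto(uint8_t *out, size_t outSize,
                const std::vector<Chunk> &chunks) {
  for (const Chunk &c : chunks) {
    if (c.outputOffset > outSize || c.size > outSize - c.outputOffset)
      return false;
    uint8_t *dst = out + c.outputOffset;
    if (c.size != 0)
      std::memcpy(dst, c.input, c.size);  // the single copy: input to output
    for (const Fixup &f : c.fixups) {
      if (f.offsetInChunk > c.size ||
          c.size - f.offsetInChunk < sizeof(uint32_t))
        return false;
      // Patch in place; memcpy avoids unaligned-store problems.
      std::memcpy(dst + f.offsetInChunk, &f.value, sizeof(uint32_t));
    }
  }
  return true;
}
```

With an mmap-backed output buffer, `out` would be the pointer returned by bufferStart(), so the patched bytes land in the file's pages without a further write() call.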

