[PATCH] D63453: [Support] Move llvm::MemoryBuffer to sys::fs::file_t

Reid Kleckner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 18 09:48:39 PDT 2019


rnk marked 3 inline comments as done.
rnk added a subscriber: silvas.
rnk added a comment.

In D63453#1547446 <https://reviews.llvm.org/D63453#1547446>, @Bigcheese wrote:

> Where do you want to keep these open?  The default OS limit on open files for OSX is 256.


In COFF LLD.

By the way, that is... disturbingly low. How is this not already a huge problem? Is this why clang doesn't keep open FDs to the header files it maps? That in itself seemed like a bug to me. It invites TOCTOU issues. Users pretty routinely race with the compiler while saving files in their editors.

Right now COFF LLD tries to mmap files asynchronously or in parallel, and that adds a lot of complexity. If I remove that complexity, then I should be able to open each input once, one at a time, get the inode / unique id, close the file if already open, map it, and continue.



================
Comment at: llvm/include/llvm/Support/FileSystem.h:963
+/// no-op.
+file_t convertFDToNativeFile(int FD);
+
----------------
aganea wrote:
> Why not use `file_t` everywhere? Do we still need POSIX FD's in user code? I find it error-prone having to juggle between the two.
No reason, other than that there's a lot of code to migrate. The refactoring is also made slightly more difficult by the move to use `llvm::Error` instead of `std::error_code`.


================
Comment at: llvm/lib/Support/Windows/Path.inc:738
+std::error_code status(file_t FileHandle, file_status &Result) {
+  return getStatus(FileHandle, Result);
+}
----------------
aganea wrote:
> A bit unrelated, but worth mentioning: I've noticed that `status()` (and thus `getStatus()`) came up in profiles as slower-than-they-should-be. Each call to `status()` issues 5 kernel calls and opens the file by the same occasion. In comparaison, MSVC uses the file metadata through the `FindFirstFile/FindNextFile` API, which is a lot faster. When using precompiled headers, this makes a difference because Clang calls `status()` on every file referenced by the PCH, whereas MSVC simply iterates through folders (for a given CPP using precompiled headers, MSVC was taking ~250 ms to compile, and Clang about ~1200 ms). Clang has some level of caching, but even at that it goes through the `status()` call. One of the ideas was maybe to hide a directory caching mechanism behind this `status()` API instead of letting it go directly to the OS layer.
I think @silvas was telling me more or less the same thing based on his experience at Sony. He had a similar suggestion. At this point, I think it would be best to add the cache to the VirtualFileSystem layer, since then the cache won't be global.


================
Comment at: llvm/lib/Support/Windows/Path.inc:1211
+
+std::error_code readNativeFile(file_t FileHandle, MutableArrayRef<char> Buf,
+                               size_t *BytesRead) {
----------------
aganea wrote:
> Just keep the API below that "works" all the time? (`readNativeFileSlice`)
Are you suggesting simplifying the code by forwarding the call with offset 0, or reducing the API to one entry point?

Stream devices like pipes don't support seeking or pread, so to change the public API, I would need to make the offset optional, which I think would be less clear. Perhaps I can simplify this Windows specific code to delegate to the slice version, but that was my concern.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63453/new/

https://reviews.llvm.org/D63453





More information about the llvm-commits mailing list