[PATCH] D51641: [VFS] Cache the current working directory for the real FS.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jan 9 02:49:33 PST 2019


sammccall added a comment.

In D51641#1338004 <https://reviews.llvm.org/D51641#1338004>, @labath wrote:

> Might I ask what was the motivation for this change? Performance optimalization?


Yes, this was for performance.
IIRC the problem was something like calling `VFS->makeAbsolute(P)` for every filename in a large PCH.

This cache indeed breaks two related assumptions that used to hold:

1. `VFS->getCurrentWorkingDirectory()` is consistent with how paths are interpreted by `VFS->openFileForRead()` etc
2. `VFS->getCurrentWorkingDirectory()` is consistent with the process's CWD: `::getcwd`, `llvm::sys::fs::current_path`, etc

The first should clearly hold, so this cache certainly introduced a bug. And probably a hard one to track down, sorry :-(
However, I don't think the second should actually be guaranteed here (see below...)

> I am asking this because this makes the RealFileSystem return bogus values for the CWD if it changes through means other than the VFS::setCurrentWorkingDirectory (which is something that, as a library, we can never rule out). We ran into problems with this in lldb where our tests use lldb as a library, and they are driven by a python scripts (which does some CWD changing as a part of test discovery).
> 
> If I understand the long scary comment in the `setCurrentWorkingDirectory` function correctly, the RealFileSystem wants to be independent of the OS notion of CWD. However, right now it's not doing either job very well (sharing the CWD with the OS nor being independent of it), because all other RFS functions will use the real OS CWD (as it is at the moment of the call), only `getCurrentWorkingDirectory` will return whatever was the last CWD set through the appropriate setter.

Right. In order to write correct multithreaded programs that use the `FileSystem` abstraction, the VFS's working directory needs to be a local attribute of that VFS, not something global.
However the current implementation of `RealFileSystem` doesn't satisfy this. It is fixable on most OSes (on recent unix, store a FD for the working dir and use `openat()`) but nobody was burned badly enough to fix it yet I think. With this fix, clearly no cache would be required.

This would imply that RealFileSystem would always return the working directory from when it was created, and all other function would use that working directory too.
Does that seem reasonable? (Does it make it easier to fix the code that got broken by this cache?)

I can try to get an `openat()`-based implementation ready.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D51641/new/

https://reviews.llvm.org/D51641





More information about the cfe-commits mailing list