[llvm-dev] "devirtualizing" files in the VFS

Jonas Devlieghere via llvm-dev llvm-dev at lists.llvm.org
Thu Nov 15 09:45:25 PST 2018


Hi Sam,

Thanks again for taking the time to discuss this. 

> On Nov 15, 2018, at 3:02 AM, Sam McCall <sammccall at google.com> wrote:
> 
> I'd like to get some more perspectives on the role of the VirtualFileSystem abstraction in llvm/Support.
> (The VFS layer has recently moved from Clang to LLVM, so crossposting to both lists)
> 
> https://reviews.llvm.org/D54277 proposed adding a function to VirtualFileSystem to get the underlying "real file" path from a VFS path. LLDB is starting to use VFS for some filesystem interactions, but wants/needs to keep using native IO (FILE*, file descriptors) for others. There's some more context/discussion in the review.
> 
> My perspective is coloured by work on clang tooling, clangd etc. There we rely on VFS to ensure code (typically clang library code) works in a variety of environments, e.g.:
>  - in an IDE the edited file is consistently used rather than the one on disk
>  - clang-tidy checks work on a local codebase, but our code review tool also runs them as a service
> This works because all IO goes through the VFS, so VFSes are substitutable. We tend to rely on the static type system to ensure this (most people write lit tests that use the real FS).

I want to emphasize that I don't have any intention of breaking any of those or other existing use cases. I opted for the virtual file system because it provides 95% of the functionality that's needed for reproducers: the real filesystem and the redirecting file system. It has the YAML mapping writer and reader, the abstraction level above the two, etc. It feels silly to implement everything again in LLDB (actually it would be more like copy/pasting everything) just because we miss that 5%, so I'm really motivated to find a solution that works for all of us :-)

> Adding facilities to use native IO together with VFS works against this, e.g. a likely interface is
>   // Returns the OS-native path to the specified virtual file.
>   // Returns None if Path doesn't describe a native file, or its path is unknown.
>   Optional<string> FileSystem::getNativePath(string Path)
> Most potential uses of such a function are going to produce code that doesn't work well with arbitrary VFSes.
> Anecdotally, filesystems are confusing, and most features exposed by VFS end up getting misused if possible.

You're right and this is a problem/limitation for LLDB as well. This was the motivation for the `ExternalFileSystem` (please forgive me for the terrible name, just wanted to get the code up in phab) because it had "some" semantic meaning for both implementations. But I also understand your concerns there. 
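To make the concern about the quoted interface concrete, here is a minimal sketch of how a `getNativePath`-style query behaves across different kinds of file systems. Everything in it is an assumption made for illustration: the `Toy*` classes, `addMapping`, and `describeNativePath` are hypothetical names and are not part of `llvm::vfs` or of the patch in D54277, and `std::optional<std::string>` stands in for the `Optional<string>` in the quoted signature.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a VFS interface; NOT llvm::vfs::FileSystem.
struct ToyFileSystem {
  virtual ~ToyFileSystem() = default;
  // Returns the OS-native path for Path, or std::nullopt if Path does not
  // describe a native file or its native path is unknown.
  virtual std::optional<std::string>
  getNativePath(const std::string &Path) const = 0;
};

// A "real" file system: every path is already a native path.
struct ToyRealFS : ToyFileSystem {
  std::optional<std::string>
  getNativePath(const std::string &Path) const override {
    return Path;
  }
};

// An in-memory file system: nothing is backed by a native file.
struct ToyInMemoryFS : ToyFileSystem {
  std::optional<std::string>
  getNativePath(const std::string &) const override {
    return std::nullopt;
  }
};

// A redirecting file system: only explicitly mapped paths have a native file.
struct ToyRedirectingFS : ToyFileSystem {
  void addMapping(const std::string &Virtual, const std::string &Native) {
    Mapping[Virtual] = Native;
  }
  std::optional<std::string>
  getNativePath(const std::string &Path) const override {
    auto It = Mapping.find(Path);
    if (It == Mapping.end())
      return std::nullopt;
    return It->second;
  }

private:
  std::map<std::string, std::string> Mapping;
};

// Hypothetical caller that wants to do native IO. It has to handle the
// "no native path" case, which is exactly where code stops being portable
// across arbitrary VFS implementations.
std::string describeNativePath(const ToyFileSystem &FS,
                               const std::string &Path) {
  if (auto Native = FS.getNativePath(Path))
    return "native:" + *Native;
  return "no native file";
}
```

Under these assumptions, a caller that only ever sees the real or the redirecting file system will appear to work, and will only hit the `std::nullopt` branch once someone substitutes an in-memory file system.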

> So those are my reasons for pushing back on this change, but I'm not sure how strong they are.
> I think broadly the alternatives for LLDB are:
>  1. make a change like this to the VFS APIs
>  2. migrate to actually doing IO using VFS (likely a lot of work)
>  3. know which concrete VFSes they construct, and track the needed info externally
>  4. stop using VFS, and build separate abstractions for tracking remapping of native files etc
>  5. abandon the new features that depend on this file remapping

Can you elaborate on what you have in mind for (3) and how it differs from (4)?

> As a purist, 2 and 4 seem like the cleanest options, but that's easy to say when it's someone else's work.
> What path should we take here?

I'll refrain from answering this as I'm one of the stakeholders ;-)

> 
> Cheers, Sam
