[cfe-dev] "devirtualizing" files in the VFS

Whisperity via cfe-dev cfe-dev at lists.llvm.org
Thu Nov 15 03:34:33 PST 2018


I am really not sure if adding real file system functionality strictly into
the VFS is a good approach. This "ExternalFileSystem" thing sounds weird to
me.

Does LLDB need to *write* the files through the VFS? I'm not sure perhaps a
"WritableVFS" could be implemented, or the necessary casting/conversion
options.

In case:
 - there is a real path behind the file --- you could spawn an
llvm::RealFileSystem (the fqdn might not be this after the migration
patch!) and use that to obtain the file's buffer.

How can you be sure the file actually exists on the FS? That's what the VFS
should be all about, hiding this abstraction... if you *are* sure it
exists, or want to make sure, you need to pull the appropriate realFS from
the VFS Overlay (most tools have an overlay of a memoryFS above the realFS).

What I am not sure about is extending the general interface in a way that
it caters to a particular (or half of a particular) use case.

For example, in CodeCompass, we used a custom VFS implementation that
"hijacked" the overlay and included itself between the realFS and the
memoryFS. It obtains files from the database!

See:
https://github.com/Ericsson/CodeCompass/blob/a1a7b10e3a9e2e4f493135ea68566cee54adc081/plugins/cpp_reparse/service/src/databasefilesystem.cpp#L191-L224

These files *do not necessarily* (in 99% of the cases, not at all) exist on
the hard drive at the moment of the code wanting to pull the file, hence
why we implemented this to give the file source buffer from DB. The
ClangTool that needs this still gets the memoryFS for its own purposes, and
for the clang libraries, the realFS is still under there.

Perhaps the "Status" type could be extended to carry extra information?
https://github.com/Ericsson/CodeCompass/blob/a1a7b10e3a9e2e4f493135ea68566cee54adc081/plugins/cpp_reparse/service/src/databasefilesystem.cpp#L85-L87

Sam McCall via cfe-dev <cfe-dev at lists.llvm.org> ezt írta (időpont: 2018.
nov. 15., Cs, 12:02):

> I'd like to get some more perspectives on the role of the
> VirtualFileSystem abstraction in llvm/Support.
> (The VFS layer has recently moved from Clang to LLVM, so crossposting to
> both lists)
>
> https://reviews.llvm.org/D54277 proposed adding a function to
> VirtualFileSystem to get the underlying "real file" path from a VFS path.
> LLDB is starting to use VFS for some filesystem interactions, but
> wants/needs to keep using native IO (FILE*, file descriptors) for others.
> There's some more context/discussion in the review.
>
> My perspective is coloured by work on clang tooling, clangd etc. There we
> rely on VFS to ensure code (typically clang library code) works in a
> variety of environments, e.g:
> in an IDE the edited file is consistently used rather than the one on disk
> clang-tidy checks work on a local codebase, but our code review tool also
> runs them as a service
> This works because all IO goes through the VFS, so VFSes are
> substitutable. We tend to rely on the static type system to ensure this
> (most people write lit tests that use the real FS).
>
> Adding facilities to use native IO together with VFS works against this,
> e.g. a likely interface is
>   // Returns the OS-native path to the specified virtual file.
>   // Returns None if Path doesn't describe a native file, or its path is
> unknown.
>   Optional<string> FileSystem::getNativePath(string Path)
> Most potential uses of such a function are going to produce code that
> doesn't work well with arbitrary VFSes.
> Anecdotally, filesystems are confusing, and most features exposed by VFS
> end up getting misused if possible.
>
> So those are my reasons for pushing back on this change, but I'm not sure
> how strong they are.
> I think broadly the alternatives for LLDB are:
> make a change like this to the VFS APIs
> migrate to actually doing IO using VFS (likely a lot of work)
> know which concrete VFSes they construct, and track the needed info
> externally
> stop using VFS, and build separate abstractions for tracking remapping of
> native files etc
> abandon the new features that depend on this file remapping
>
> As a purist, 2 and 4 seem like the cleanest options, but that's easy to
> say when it's someone else's work.
> What path should we take here?
>
> Cheers, Sam
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181115/3da24916/attachment.html>


More information about the cfe-dev mailing list