[PATCH] Some infrastructure work for virtual file system

Ben Langmuir blangmuir at apple.com
Tue Feb 11 14:27:29 PST 2014


On Feb 11, 2014, at 1:24 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com> wrote:

> 
> On Feb 11, 2014, at 1:13 PM, Ben Langmuir <blangmuir at apple.com> wrote:
> 
>> 
>> On Feb 11, 2014, at 12:23 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com> wrote:
>> 
>>> On Feb 10, 2014, at 2:32 PM, Ben Langmuir <blangmuir at apple.com> wrote:
>>> 
>>>> Hi Doug et al.
>>>> 
>>>> This is the first patch in the implementation of the virtual file system discussed here: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-February/035188.html.  It is intended to include enough of the AbstractFileSystem interface to replace the use of llvm::sys::fs in FileManager. Incidentally, I’ve included a number of the convenience methods that are trivially implemented on top of the status object.  When created by the CompilerInstance, FileManager will use a virtual file system created by the CI, but to avoid modifying every other user of the FileManager, in the absence of an explicit AbstractFileSystem to use, it will default to the real file system during construction.
>>>> 
>>>> Additionally, I’ve included a stub implementation of the OverlayFileSystem, which the CompilerInstance uses.  It is able to overlay files from a second file system, but will not merge directories correctly yet.  There are no users that actually push a second file system yet (other than the included unit test).
>>>> 
>>>> This patch is notably missing: the interfaces for recursive directory traversal, file system modification (create dir, rename, etc.), and any notion of thread safety.
>>>> 
>>>> Ben
>>>> 
>>>> <vfs-part1.patch>
>>>> 
>>> 
>>> 
>>> Looks good! 
>>> 
>>> To add on the nitpicking.. :-)
>>> 
>>> +  /// In addition to providing the usual stat information, this class provides
>>> +  /// \p Name and \p RealName fields, which can be used by clients that need
>>> +  /// to differentiate between the name that this object was looked up by,
>>> +  /// and the real name of this object on disk (if there is one).
>>> 
>>> Could you clarify what should happen if there is no real path (it's a virtual file), is RealName == Name or is it empty ?
>> 
>> RealName == “”
> 
> Hmm, if we do RealName == Name it will simplify a bit the clients that care (e.g. what to put in FileEntry), they will just use RealName, instead of doing 
> 
> 	if (!RealName.empty())
>            use(RealName)
>        else
>           use(Name)
> vs
>       use(RealName)
> 
> We could have another way to determine if the file is virtual or not, if there is a client that may care.

Yes, I think this has shown that trying to use names to differentiate virtual/non-virtual is a bad idea. I’m going to change RealName to ExternalName, meaning: this is the name you should use to refer to this file externally, such as in diagnostics and debug info.  Then to be explicit:

1) real file: Name == ExternalName (and ExternalName exists on disk)
2) fully virtual file: Name == ExternalName (and ExternalName does *not* exist on disk)
3) virtual link to real file: Name != ExternalName (and ExternalName exists on disk)

As you say, we can add a bit if we ever actually want to differentiate 1 and 2.

> 
>> 
>> I’ll update the comment.
>> 
>>> 
>>> 
>>>>> 
>>>>> I just thought that we might need to resolve symlinks ourselves,
>>>>> here's why.  Suppose that we have a real FS:
>>>>> 
>>>>> /foo
>>>>> /foo/bar/bar.h
>>>>> /foo/baz/baz.h -> symlink to ../bar/bar.h
>>>>> 
>>>>> Then we push an overlay over /foo/bar.  If we rely on OS-level open(),
>>>>> then opening baz/baz.h will open bar.h from the real FS, but it is
>>>>> hidden by the overlay, thus we have an inconsistency.
>>>> 
>>>> Good catch. It looks like the overlay file system will need to stat every path prefix and then resolve links itself when it finds them.  That’s unpleasant, but doable.  Looks like we’ll need to add a read link function to sys::fs… which might be a pain on Windows.  I’m reading the documentation on Windows file system reparse points, and it’s not nearly as easy as calling ::readlink.
>>> 
>>> I'm concerned about the performance implications of stat'ing everything and trying to resolve symlinks, and I'm not sure how bad the issue will be in practice.
>>> Could we defer investigating into this for after we have the simpler case working ?
>> 
>> With caching I don’t think the performance will be a big problem, but we will have to see.  I’m fine with deferring this.
>> 
>>> 
>>> On a related note, what about using a filename with different casing from what it was declared in the vfs ('myfoo.h' vs 'MyFoo.h').
>>> 
>> 
>> 
>> Since the overlay defers to other file systems, I don’t think it needs to care.  When we add a file system for vfs files, we probably need an option to do case-insensitive matching.  I’m not sure if that is a property of the host or just a property of the filesystem though…
> 
> To clarify, I was asking about what the “virtual layout” implementation is going to do. Maybe have case-insensitive or not be an optional parameter, but if that parameter is not set, use as default case-insensitive on Darwin and case-sensitive on linux.

Sounds good to me.

> 
>> 
>> Ben

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140211/0b6ef7d5/attachment.html>


More information about the cfe-commits mailing list