[cfe-dev] RFC: A virtual file system for clang

Manuel Klimek klimek at google.com
Sat Feb 8 10:21:12 PST 2014


On Fri, Feb 7, 2014 at 11:28 PM, Ben Langmuir <blangmuir at apple.com> wrote:

>
> On Feb 7, 2014, at 1:57 PM, Manuel Klimek <klimek at google.com> wrote:
>
> On Fri, Feb 7, 2014 at 9:20 PM, Ben Langmuir <blangmuir at apple.com> wrote:
>
>>
>> On Feb 7, 2014, at 11:31 AM, Manuel Klimek <klimek at google.com> wrote:
>> >
>> > I'd like to consider it design-wise at least on a straw-man level (to
>> be shot down ;)
>> >
>> > If we want to support this in the future, it might affect both the
>> ownership and the API design question. For example, one design straw-man
>> would be to have interfaces for FileSystem, Directory and File, where
>> FileSystem can give you Directories and those again can give you Files.
>> That would trivially support using directory-entry based OS interfaces
>> where they are available, but would mean more code overhead per FileSystem
>> implementation.
>> >
>> > A different approach would be to use descriptors for files and
>> directories, and have only a single FileSystem interface that can handle
>> the instances. Seems slightly less "nice" from a user point of view, but
>> potentially simpler and less overhead for the implementation.
>> >
>> > From the other mails in this thread it sounds to me more like you want
>> to basically punt on those questions and just provide the interface and
>> access methods to get buffers for files. That might also be fine for now,
>> but I'd prefer if it is a conscious decision rather than an accidental one
>> :)
>>
>> What filesystem modifications would you consider to be ‘unrelated’ in a
>> compilation?  E.g. what if clang is invoked from the ‘real location’
>> src-head, but there are command-line options that refer to absolute paths
>> in ‘src’.  Would we try to recover?  I think I would need a concrete idea
>> of what the desired model of consistency is before I could be convinced
>> this is a solvable problem.
>>
>
> The proposed model is basically implemented today - as long as all paths
> you give to the compiler are relative subpaths of the cwd, you'll stay
> relative to the same directory inode.
>
>
> Is this because FileManager caches the DirectoryEntries along the way?
>

Nope, it's because clang just stays in one directory, and that basically
keeps all relative file operations inside the original working directory.
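
To make the "same directory inode" point concrete, here is a rough POSIX
sketch (not clang code; readRelativeTo and lookupSurvivesRename are made-up
names, and the /tmp location is an assumption). It resolves a relative path
against an open directory descriptor instead of the process-wide cwd, which
is roughly the guarantee staying in one directory gives us today:

```cpp
#include <cassert>
#include <cstdio>   // std::rename
#include <cstdlib>  // mkdtemp
#include <fcntl.h>  // open, openat
#include <string>
#include <unistd.h> // read, write, close, rmdir, unlinkat

// Read a whole file named by a path relative to an already-open directory.
// Returns an empty string if the file cannot be opened.
std::string readRelativeTo(int DirFD, const char *RelPath) {
  assert(DirFD >= 0 && "expected an open directory descriptor");
  std::string Contents;
  int FD = openat(DirFD, RelPath, O_RDONLY);
  if (FD < 0)
    return Contents;
  char Buf[4096];
  ssize_t N;
  while ((N = read(FD, Buf, sizeof(Buf))) > 0)
    Contents.append(Buf, static_cast<size_t>(N));
  close(FD);
  return Contents;
}

// Hypothetical demo: the descriptor keeps referring to the same directory
// even after that directory has been renamed on disk.
bool lookupSurvivesRename() {
  char Template[] = "/tmp/vfs-sketch-XXXXXX"; // assumes a writable /tmp
  const char *Dir = mkdtemp(Template);
  if (!Dir)
    return false;
  std::string Old = Dir, New = Old + "-renamed";

  int DirFD = open(Old.c_str(), O_RDONLY | O_DIRECTORY);
  if (DirFD < 0)
    return false;
  int FD = openat(DirFD, "a.h", O_WRONLY | O_CREAT, 0600);
  if (FD < 0)
    return false;
  bool Ok = write(FD, "int x;", 6) == 6;
  close(FD);

  // Rename the directory, then look the file up via the old descriptor.
  Ok = Ok && std::rename(Old.c_str(), New.c_str()) == 0;
  Ok = Ok && readRelativeTo(DirFD, "a.h") == "int x;";

  // Clean up the scratch file and directory.
  unlinkat(DirFD, "a.h", 0);
  close(DirFD);
  rmdir(New.c_str());
  return Ok;
}
```

A descriptor like this is per-lookup state rather than per-process state, so
it would not have the chdir problem when several TUs are parsed at once.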


>
> Our main problem today is when we think about multi-threading the
> compilation (mainly for tooling - I want to be able to parse multiple TUs
> from within one program). There, just chdir'ing into the right directory
> for the TU doesn't work any more.
>
>
> I’m still not sure I get what you’re trying to solve.  Is it to ensure
> that multiple threads have a consistent picture of the file system,
> similarly to if they all shared a FileManager, even in the presence of a
> chdir?
>

Yep.
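
As a straw-man only (FileSystemView and FileMap are invented names, not an
existing clang or LLVM API), the idea could look something like this: every
TU gets its own view with a private working directory, while all views share
one picture of the files, so nobody has to call chdir:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical: absolute path -> file contents.
using FileMap = std::map<std::string, std::string>;

class FileSystemView {
  std::shared_ptr<const FileMap> Files; // shared, read-only picture of files
  std::string WorkingDir;               // private to this view, replaces chdir

public:
  FileSystemView(std::shared_ptr<const FileMap> Files, std::string WorkingDir)
      : Files(std::move(Files)), WorkingDir(std::move(WorkingDir)) {}

  // Turn a relative path into an absolute one using this view's directory.
  // No normalization of "." or ".." is attempted in this sketch.
  std::string makeAbsolute(const std::string &Path) const {
    if (!Path.empty() && Path[0] == '/')
      return Path;
    if (WorkingDir.empty() || WorkingDir.back() == '/')
      return WorkingDir + Path;
    return WorkingDir + "/" + Path;
  }

  // Returns a pointer to the contents, or null if the file does not exist.
  const std::string *getBuffer(const std::string &Path) const {
    auto It = Files->find(makeAbsolute(Path));
    return It == Files->end() ? nullptr : &It->second;
  }
};
```

Two views created over the same FileMap with working directories "/src/a"
and "/src/b" would resolve "main.cpp" to different files without touching
any process-global state. A real version would also need to decide how
symlinks and ".." are handled.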


>
>
> Which brings me to a different question: multi-threading; since your main
> use case is remapping on the lowest level, do you think you'll want
> different mappings per TU?
>
>
> I don’t have a good answer to this yet.  My plan was to start by having
> the compiler instance own the virtual file system, and see where that got
> us.
>

I like the plan (similar to what I'd intuitively have done).
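
Just to have something concrete to poke at, a minimal sketch of that
ownership model, under the assumption that remapping is a thin layer over
some base file system. All class names here (VirtualFileSystem,
InMemoryFileSystem, RemappingFileSystem, CompilerInstanceSketch) are
hypothetical and not taken from any patch:

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical path-based interface, in the spirit of llvm::sys::fs.
class VirtualFileSystem {
public:
  virtual ~VirtualFileSystem() = default;
  // Returns true and fills Out if Path exists.
  virtual bool getBuffer(const std::string &Path, std::string &Out) const = 0;
};

// Stand-in for the real file system so the sketch is self-contained.
class InMemoryFileSystem : public VirtualFileSystem {
  std::map<std::string, std::string> Files;

public:
  void addFile(std::string Path, std::string Contents) {
    Files[std::move(Path)] = std::move(Contents);
  }
  bool getBuffer(const std::string &Path, std::string &Out) const override {
    auto It = Files.find(Path);
    if (It == Files.end())
      return false;
    Out = It->second;
    return true;
  }
};

// Redirects selected paths before forwarding to a shared base file system.
class RemappingFileSystem : public VirtualFileSystem {
  std::shared_ptr<const VirtualFileSystem> Base;
  std::map<std::string, std::string> Remaps; // requested path -> actual path

public:
  explicit RemappingFileSystem(std::shared_ptr<const VirtualFileSystem> Base)
      : Base(std::move(Base)) {}
  void remap(std::string From, std::string To) {
    Remaps[std::move(From)] = std::move(To);
  }
  bool getBuffer(const std::string &Path, std::string &Out) const override {
    auto It = Remaps.find(Path);
    return Base->getBuffer(It == Remaps.end() ? Path : It->second, Out);
  }
};

// Each compiler instance owns its own file system, so each TU can carry
// different mappings while still sharing the same base.
class CompilerInstanceSketch {
  std::unique_ptr<VirtualFileSystem> FS;

public:
  explicit CompilerInstanceSketch(std::unique_ptr<VirtualFileSystem> FS)
      : FS(std::move(FS)) {}
  const VirtualFileSystem &getFileSystem() const { return *FS; }
};
```

With this shape, per-TU mappings fall out of giving each instance its own
RemappingFileSystem, and sharing mappings across TUs would mean revisiting
the ownership (e.g. shared rather than unique).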


>
>
>
>> Now, specifically about the straw-man proposal: one of the things I like
>> about reusing the llvm::sys::fs interface is that it makes the change very
>> small for users.  Both of these options seem to require giving that up for
>> a handle-based API where clients need to think about a new object(s) for
>> any file operations they want to use.
>>
>
> Just for the record: I'm totally in favor of modeling stuff after the
> llvm::sys::fs interface - that was the same plan we came up with back in the
> day (but always were able to work around spending time on).
>
>
> Cool - I’ll probably post an initial patch based on this approach in the
> near future as a first step.
>

Great! Looking forward to it :)


>
>
>
>> I wonder if handling the clang-invocation-location issue can be solved by
>> virtually mapping `pwd` to `pwd -P` right at the beginning of the
>> compilation?  That would allow path-based operations to still work.
>> However, that probably doesn’t scale if you want to treat *every* path
>> that the compiler looks up this way, since you would explode the number of
>> mappings…  So it really depends on what our model of consistency is.
>>
>
> I tried to explain above. Does that make sense?
>
> Cheers,
> /Manuel
>
>
>>
>> Ben
>>
>> >
>> > Cheers,
>> > /Manuel
>>
>>
>
>