[cfe-dev] LLVM & Clang file management

Manuel Klimek klimek at google.com
Mon Nov 28 13:04:19 PST 2011


On Mon, Nov 28, 2011 at 9:07 PM, Daniel Dunbar <daniel at zuster.org> wrote:
> Hi Manuel,
>
> I'm +2 on the general idea.
>
> I have had various thoughts in this direction as well (although no
> implementation). See:
>  http://lists.cs.uiuc.edu/pipermail/cfe-dev/2010-July/009903.html
> for my RFC from last year (focused at bug reporting, but involved
> defining a VFS layer).

Cool, that sounds like another use case, very similar to our
replay-at-scale use case.

> My one main implementation level comment is I don't think FileManager
> is the right API layer to abstract at (it is too specific to Clang's
> usage, and too hard to propagate through the rest of LLVM). My
> intuition is that it is better to set out to define a lower level VFS
> layer that is rich enough to support everything we do and the vagaries
> of Win32/Unix, but is otherwise minimal.

What about FileManager is too high level / too clang specific? The
uniquing logic? The possibility to add in stat caches?
Do you think we'd want to have a CachingFileSystem on top of the VFS
layer? That sounds more orthogonal; on the other hand, FileManager
is doing pretty OS-specific stuff to unique the inodes where possible.
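
To make the layering question concrete, here is a rough sketch of a
caching layer sitting on top of a minimal VFS interface. None of this is
existing clang or llvm code: the FileSystem interface, its read()
signature and InMemoryFileSystem are hypothetical names made up for
illustration, only CachingFileSystem is a name from the question above,
and the sketch assumes C++17. It deliberately leaves out uniquing, stat
results and anything OS-specific.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical minimal VFS interface; not an existing LLVM or Clang API.
class FileSystem {
public:
  virtual ~FileSystem() = default;
  // Returns the file's contents, or std::nullopt if the path is unknown.
  virtual std::optional<std::string> read(const std::string &path) = 0;
};

// Serves files purely from memory, e.g. for replays or unit tests.
class InMemoryFileSystem : public FileSystem {
public:
  void addFile(std::string path, std::string contents) {
    assert(!path.empty() && "path must be non-empty");
    files_[std::move(path)] = std::move(contents);
  }

  std::optional<std::string> read(const std::string &path) override {
    ++reads_;  // Count lookups so the effect of caching is observable.
    auto it = files_.find(path);
    if (it == files_.end())
      return std::nullopt;
    return it->second;
  }

  int reads() const { return reads_; }

private:
  std::map<std::string, std::string> files_;
  int reads_ = 0;
};

// Caching decorator that can be layered on top of any FileSystem.
// Both hits and misses are remembered, so each path reaches the
// underlying file system at most once.
class CachingFileSystem : public FileSystem {
public:
  explicit CachingFileSystem(FileSystem &base) : base_(base) {}

  std::optional<std::string> read(const std::string &path) override {
    auto it = cache_.find(path);
    if (it != cache_.end())
      return it->second;
    std::optional<std::string> result = base_.read(path);
    cache_.emplace(path, result);
    return result;
  }

private:
  FileSystem &base_;
  std::unordered_map<std::string, std::optional<std::string>> cache_;
};
```

With this shape the cache knows nothing about where the bytes come from,
so the same decorator would work over a real-disk implementation or an
in-memory one. Whether inode-based uniquing can be kept out of the lower
layer in the same way is exactly the open question.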

> One requirement I hope any proposed VFS design will support is
> emulating Win32 on Unix (and vice versa), which imposes assorted API
> complications but I think is worth it overall.

I'm not sure I understand what you mean by "emulating win32"? I'd
hope to get win32 / unix stuff hidden behind the VFS; do you expect
that not to be possible performance-wise?

Cheers,
/Manuel

> I see many positive future technologies we could build if we had a
> good VFS layer, I'd absolutely love to see work in this direction.
>
>  - Daniel
>
> On Mon, Nov 28, 2011 at 2:49 AM, Manuel Klimek <klimek at google.com> wrote:
>> Hi,
>>
>> while working on tooling on top of clang/llvm we found the file system
>> abstractions in clang/llvm to be one of the points that could be nicer
>> to integrate with. I’m writing this mail to propose a strawman and get
>> some feedback on what you guys think the right way forward is (or
>> whether we should just leave things as they are).
>>
>> First, the FileManager we have in clang has helped us a lot for our
>> tooling - when we run clang in a mapreduce we don’t need to lay out
>> files on a disk, we can just map files into memory and happily clang
>> over them. We’re also using the same mechanism to map builtin
>> includes; in short, the FileManager has made it possible to do clang
>> at scale.
>>
>> Now we’re aware that it was not really the intention of the
>> FileManager to allow doing the things we do with it: not every module
>> in clang uses the FileManager, and the moment we hit llvm there is no
>> FileManager at all. For example, in case of the Driver we hack around
>> the fact that the header search tries to access the file system
>> directly in rather brittle ways, relying on implementation details and
>> #ifdefs.
>>
>> So why not make FileManager a more principled (and still blazing fast)
>> file system abstraction?
>> Pro:
>> - only one interface for developers to learn on the project (no more
>> PathV1 vs PathV2 vs FileManager)
>> - only one implementation (per-platform) for easier maintenance of the
>> file system platform abstraction
>> - one point to insert synchronization guarantees for tools / IDE
>> integration that wants to run clang in multiple threads at once (for
>> example when re-indexing on 12-ht-core machines)
>> - being able to replay compilations by injecting a virtual file system
>> that exactly “copies” the original file system’s content, which allows
>> easy scaling of replays, running tools against dirty edit buffers on a
>> lower level than the SourceManager and unit testing
>>
>> Con:
>> - there would be yet another try at unifying the APIs which would be
>> in an intermediate state while being worked on (and PathV1 vs PathV2
>> is already bad enough)
>> - making it the canonical file system interface is a lot of effort
>> that requires touching a lot of systems (while we’re volunteering to
>> do the work, it will probably eat up other people’s time, too)
>>
>> What parts (if any) of this type of transition make sense?
>> 1. Figure out the “correct” interface we’d want for FileManager to be
>> more generally useful
>> 2. Change FileManager to that interface
>> 3. Sink FileManager into llvm, so it can be used by other projects
>> 4. Use it throughout clang
>> 5. Use it throughout llvm
>> We don’t need to do all of them at once, and should be able to
>> evaluate the results along the way.
>>
>> Thoughts? If folks are generally happy, I’d start up an email thread
>> to drive the target design of the FileManager to get things rolling.
>>
>> /Manuel
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>
>



