[LLVMdev] LLVM & Clang file management

Manuel Klimek klimek at google.com
Mon Nov 28 02:49:25 PST 2011


Hi,

while working on tooling on top of clang/llvm we found the file system
abstractions in clang/llvm to be one of the points that could be nicer
to integrate with. I’m writing this mail to propose a strawman and get
some feedback on what you guys think the right way forward is (or
whether we should just leave things as they are).

First, the FileManager we have in clang has helped us a lot for our
tooling - when we run clang in a mapreduce we don’t need to lay out
files on a disk, we can just map files into memory and happily clang
over them. We’re also using the same mechanism to map builtin
includes; in short, the FileManager has made it possible to do clang
at scale.

Now we’re aware that it was not really the intention of the
FileManager to allow doing the things we do with it: not every module
in clang uses the FileManager, and the moment we hit llvm there is no
FileManager at all. For example, in the case of the Driver we hack
around the fact that the header search tries to access the file system
directly in rather brittle ways, relying on implementation details and
#ifdefs.

So why not make FileManager a more principled (and still blazing fast)
file system abstraction?
Pro:
- only one interface for developers to learn on the project (no more
PathV1 vs PathV2 vs FileManager)
- only one implementation (per-platform) for easier maintenance of the
file system platform abstraction
- one point to insert synchronization guarantees for tools / IDE
integrations that want to run clang in multiple threads at once (for
example when re-indexing on machines with 12 hyperthreaded cores)
- being able to replay compilations by injecting a virtual file system
that exactly “copies” the original file system’s content, which allows
easy scaling of replays, running tools against dirty edit buffers at a
lower level than the SourceManager, and unit testing
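
To make the virtual file system point concrete, here is a minimal sketch
of what an injectable abstraction could look like. All names in it
(FileSystem, RealFileSystem, InMemoryFileSystem, OverlayFileSystem,
overlayExample) are hypothetical and only for illustration; this is not
the existing FileManager interface, and it ignores stat caching,
directories and error reporting.

```cpp
#include <cassert>
#include <fstream>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical interface, NOT the current clang FileManager API.
class FileSystem {
public:
  virtual ~FileSystem() = default;
  // Stores the contents of Path in Out and returns true on success.
  virtual bool read(const std::string &Path, std::string &Out) const = 0;
};

// Reads from the real disk.
class RealFileSystem : public FileSystem {
public:
  bool read(const std::string &Path, std::string &Out) const override {
    std::ifstream In(Path.c_str(), std::ios::binary);
    if (!In)
      return false;
    std::ostringstream Buf;
    Buf << In.rdbuf();
    Out = Buf.str();
    return true;
  }
};

// Serves files that were mapped into memory (replays, unit tests).
class InMemoryFileSystem : public FileSystem {
public:
  void addFile(const std::string &Path, std::string Contents) {
    Files[Path] = std::move(Contents);
  }
  bool read(const std::string &Path, std::string &Out) const override {
    std::map<std::string, std::string>::const_iterator It = Files.find(Path);
    if (It == Files.end())
      return false;
    Out = It->second;
    return true;
  }

private:
  std::map<std::string, std::string> Files;
};

// Consults Top first and falls back to Base, e.g. dirty edit buffers
// layered over a snapshot of the original file system.
class OverlayFileSystem : public FileSystem {
public:
  OverlayFileSystem(const FileSystem &Top, const FileSystem &Base)
      : Top(Top), Base(Base) {}
  bool read(const std::string &Path, std::string &Out) const override {
    return Top.read(Path, Out) || Base.read(Path, Out);
  }

private:
  const FileSystem &Top;
  const FileSystem &Base;
};

// Usage: an unsaved edit shadows the snapshot copy of a header.
inline void overlayExample() {
  InMemoryFileSystem Snapshot;
  Snapshot.addFile("/src/a.h", "int original;");
  InMemoryFileSystem EditBuffers;
  EditBuffers.addFile("/src/a.h", "int edited;");
  OverlayFileSystem FS(EditBuffers, Snapshot);

  std::string Contents;
  assert(FS.read("/src/a.h", Contents) && Contents == "int edited;");
}
```

The idea is that code written against the abstract interface does not
need to know whether it is reading from disk, from a replayed snapshot,
or from an editor buffer; the overlay is just one possible way to
combine them.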

Con:
- there would be yet another try at unifying the APIs which would be
in an intermediate state while being worked on (and PathV1 vs PathV2
is already bad enough)
- making it the canonical file system interface is a lot of effort
that requires touching a lot of systems (while we’re volunteering to
do the work, it will probably eat up other people’s time, too)

What parts (if any) of this type of transition make sense?
1. Figure out the “correct” interface we’d want for FileManager to be
more generally useful
2. Change FileManager to that interface
3. Sink FileManager into llvm, so it can be used by other projects
4. Use it throughout clang
5. Use it throughout llvm
We don’t need to do all of them at once, and should be able to
evaluate the results along the way.

Thoughts? If folks are generally happy, I’d start up an email thread
to drive the target design of the FileManager to get things rolling.

/Manuel



