[cfe-dev] RFC: A builder for CompilerInstance

Duncan Exon Smith via cfe-dev cfe-dev at lists.llvm.org
Tue Mar 26 17:13:35 PDT 2019


I have the beginnings of a patch for a CompilerInstance builder.  I'd like some early feedback before going further.

## Questions ##

- Is a builder API a good idea?
- Should it look something like this?  If not, what?
- What should be in it?
- How should it be staged?

## Motivation ##

My motivation for looking at this is a little tangential: while working on the InMemoryModuleCache recently (e.g., r355778) I realized that this cache should be leveraging the FileManager to dedup paths by inode.  However when I tried a naïve fix I was getting crashes, and I had trouble understanding FileManager/VFS lifetime and ownership.  Ultimately, I'm hoping to clarify what the invariants are around an instance of CompilerInstance and its FileManager.

## Is a builder API a good idea? ##

A builder API seems a little cleaner than how we currently manage the boilerplate in constructing a CompilerInstance.

The status quo is that each client default-constructs and then calls various mutators like `setInstance` to set it up.  However, this API allows clients to swap in different data structures after construction.  If clients don't need or use that flexibility, the code for maintaining that flexibility will be brittle, some combination of broken and confusing.

For example, I cleaned up some confusing logic in r357037 in CompilerInstance::setFileManager and CompilerInstance::createFileManager that were trying to adapt to a CompilerInstance's VFS and FileManager changing at arbitrary times and keep them in sync (in that case, it was easy: just use FileManager's VFS directly).

With a builder API, we can lock down CompilerInstance mutators so that it's clear what is actually allowed to change mid-flight and what isn't.  It's probably also easier to handle "obvious defaults" in a sane way for clients that only want to specify one or two things (e.g., unit tests).

## Should it look something like this?  If not, what? ##

Here's the WIP patch I have:
https://reviews.llvm.org/differential/diff/192381/
(also attached below, but without the -U9999999 context)

The core idea is to remove `CompilerInstance::set*` and `CompilerInstance::create*` in cases where a CompilerInstance never needs more than one version of a data structure over its lifetime.  So far the patch removes CompilerInstance::setInvocation, CompilerInstance::setDiagnostics, and CompilerInstance::createDiagnostics.

The meat is in two files:
- CompilerInstanceBuilder.h (https://reviews.llvm.org/differential/diff/192381/#change-AjNDpfG5S3AZ): a move-only data structure for collecting the parameters for building a CompilerInstance.
- CompilerInstance.cpp (https://reviews.llvm.org/differential/diff/192381/#change-wS2d87BZOVCm): logic to process those parameters at CompilerInstance construction time.

Note on one semantic change I made: CompilerInstance *always* has a DiagnosticsEngine now.  I did that because there were tons of sites calling CompilerInstance::createDiagnostics with no arguments.  Originally I had a zero-parameter version of CompilerInstanceBuilder::diags that controlled whether to create one, but just doing it all the time seemed to improve the code.  My suspicion is that the DiagnosticsEngine is not intended to be optional, it's just hard to sort out how to construct one (without a builder API).

The rest of it all looks like removal of boilerplate to me.

Thoughts on this direction?

## What should be in it? ##

My personal goal here is to pull in the prerequisites for removing CompilerInstance::setFileManager.  It's possible this won't be possible, but if not I'll figure out why not in the process.

What else should be in there?  Here are some options:

- FileManager (I hope so!)
- SourceManager (I think so!)
- Preprocessor?
- TargetInfo?
- Sema (probably not, it seems truly optional)
- ModuleManager (probably not?)

## How should it be staged? ##

Assuming this directly looks good, I'm not sure how to stage this.  Should I create a monolithic patch and get it reviewed as a whole, or post a patch for a skeletal builder and then patches for adding arguments to it incrementally?

The latter is the usual approach, and is what I've been following locally.  I have two patches:

- Add a builder that handles the current constructor arguments: PCHContainerOps and InMemoryModuleCache.
- Add a CompilerInvocation and DiagnosticsEngine to the builder and remove createDiagnostics, setDiagnostics, and setInvocation.  The construction of these two is quite intertwined so there didn't seem to be a clean way to do this separately.

I'm hesitating slightly since I'll be touching roughly the same lines of code in every incremental patch, but I'm still inclined to go that way so that incremental progress doesn't bitrot.

Thanks for any feedback!
Duncan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: wip-compiler-instance-builder.patch
Type: application/octet-stream
Size: 49564 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190326/3fb8d701/attachment.obj>


More information about the cfe-dev mailing list