[cfe-dev] Checker Plugins

Fri Jul 29 22:24:06 PDT 2011

I'm happy to keep revising this; I'm hoping for a good constraint-satisfying solution, and being pointed in the right direction and then let go is fine.

My original design used llvm::Registry, with the registered object implementing a single method that was basically a clang_getCheckersForEntirePlugin. Each plugin then had its own CheckerProvider doing the same sort of name-based package inference that ClangSACheckerProvider does.

Argyrios then pointed out that llvm::Registry doesn't really work well on Windows. I don't entirely understand the thread at http://lists.cs.uiuc.edu/pipermail/cfe-dev/2011-May/thread.html#14886 , but it sounds like you have to explicitly instantiate the llvm::Registry templates (more work than ideal). So his idea (which I think is a good one) was to move to a single entry point for each plugin.

To make a detour, the reasons I decided on clang_getCheckersForPackage_PACKAGE instead of clang_getCheckers_PLUGIN were:
(1) it made groups easy -- just delegate
(2) it's uniform; it could work just as well for internal packages as well as external
(3) no string matching against plugin full-names -- the only string manipulation here is splitting off the name of the plugin itself and constructing the name of the function entry point
(4) possibly more efficient (cost of a symbol lookup vs. cost of comparing all possible plugin names)

The disadvantages:
(1) requires redundant information if a /single/ checker is in multiple packages
(2) probably requires a little more memory
(3) still requires a single entry point (a list of packages) for -analyzer-checker-help
(4) somewhat complex compared to a single "here are all my checkers"

It's worth noting that CheckerPackage was entirely an internal helper class for CheckerPackageProvider. It's just a wrapper around the list of classes in a package. Neither CheckerPackage nor Checker are printing anything. :-)

Still, I guess I'm going at this the wrong way. If I change it back to the "one list per plugin" version, what should it look like from the client side?

void clang_getCheckers_PLUGIN (CheckerInfo::List &checkers) {
  checkers.push_back(CheckerInfo::create<MyChecker>("myplugin.MyChecker", "description goes here"));
  checkers.push_back(CheckerInfo::create<AnotherChecker>("secret.MyChecker", "description goes here"));
}

Advantages:
(1) Only one function per plugin, ever.
(2) Multiple plugins can add to the same package. (possibly a disadvantage?)

Disadvantages:
(1) Child packages are easy, but cross-package groups are impossible
(2) Namespace clashes are more likely (two plugins with the same file name)

This basically what my original code allowed, but without llvm::Registry. Argyrios's original suggestion was that the clang_getCheckers_PLUGIN equivalent should return a CheckerProvider, but I think even that is work the plugin writer shouldn't need to worry about. There's no reason we can't create the checker provider clang-side. (Actually, there's no reason there can't just be a SINGLE CheckerProvider that gathers EVERY plugin's checkers.)

(A note: If DynamicLibrary was shored up to handle private_extern symbols, each library could simply have its own clang_getCheckers function, rather than clang_getCheckers_PLUGIN, which has potential name clash issues. But maybe that's a problem for later.)

Jordy

On Jul 29, 2011, at 17:53, Ted Kremenek wrote:

> Hi Jordy,
> 
> I'm still forming some thoughts on this.  My initial impression while I think this is a good direction, it really does feel overly complicated, particularly with the use of clang_getCheckersForPackage_PACKAGE functions.
> 
> I'm most mixed about the CheckerPackage abstraction.  How would it actually be used?  It seems to me that:
> 
> (a) First a plugin must be loaded, e.g. using llvm::sys::DynamicLibrary.
> 
> (b) Then the CheckerPackage::create() function is called to register checkers, but this seems to require a priori knowledge of the package name.
> 
> In your code snippet below, it's not clear to me where (b) lives.  Does it live in the plugin?  If so, this seems like a lot of indirection, when we could possibly instead use the plugin registry mechanism instead to find the CheckerManager and then register the checkers.  What's the benefit of the clang_getCheckersForPackage abstraction?
> 
> The mechanism I had more in mind was that a user writes a plugin, which links against clang and the static analyzer libraries.  It uses the llvm::Registry mechanism to load checkers using static constructors when the plugin is loaded.  This would be very simple, akin to what we can do with optimization passes:
> 
> http://llvm.org/docs/WritingAnLLVMPass.html#registering
> 
> In that example, we have RegisterRegAlloc, but instead we could have something like "RegisterCheckers".  It could take as an argument a function that returns a factory object (perhaps CheckerPackage?) that allows one to instantiate checkers given a CheckerManager.  The key thing to keep in mind is that his simple approach is very flexible.  One could possibly register several checkers into different packages.  We can also probably decouple checker registration from package registration.  Packages are just groups; when a checker gets loaded, it could just declare that it goes into a specific package.  If that group doesn't exist, the CheckerManager just creates one with the default visibility.  If the package gets explicitly registered (using a similar mechanism, e.g. "RegisterCheckerPackage"), that registration could provide enabled/disabled information for that package, visibility, and strings to describe that package.
> 
> In this mode, we get a nice decoupling.  Checker writers only have to declare the package the checker goes into by providing a package string.  They can also provide strings that describe the checker, which is then used by the registration mechanism.  Packages can be managed and registered separately.
> 
> This simpler, declarative approach also resolves the -print-help problem.  Instead of -print-help, I would expect we move in a direction where checkers just provide descriptive strings, and the help is formatted by whoever cares about it.  Checkers, and packages, should not get in the business of printing out anything.  That should be up to the client of the CheckerManager.
> 
> Anyhow, I'm stilling mulling this over, but I think we want to keep this simple and flexible.
> 
> What do you think?
> 
> On Jul 27, 2011, at 1:48 PM, Jordy Rose wrote:
> 
>> Okay, I've come up with a naming-based version of this, probably not final but working fairly well. The plugin client's job is fairly simple:
>> 
>> using clang::ento::CheckerInfo;
>> extern "C" clang_getCheckersForPackage_PACKAGE (CheckerInfo::List &out) {
>> // First include checkers from subpackages.
>> clang_getCheckersForPackage_SUBPACKAGE1(out);
>> clang_getCheckersForPackage_SUBPACKAGE2(out);
>> CheckerInfo::markAllSubpackage(out);
>> 
>> // Then add checkers from this package.
>> out.push_back(CheckerInfo::create<MyChecker>("MyChecker",
>>   "short description"));
>> // The name of the checker class does not have to match the name the
>> // client uses to enable your checker.
>> out.push_back(CheckerInfo::create<MyOtherChecker>("OtherChecker",
>>   "short description"));
>> // "Hidden" checkers are only enabled if they or their package is
>> // explicitly requested, not if they are in a subpackage.
>> out.push_back(CheckerInfo::create<HiddenChecker>("HiddenChecker",
>>   "short description", true));
>> }
>> 
>> The use of a clang_getCheckersForPackage_PACKAGE function means that we could eventually migrate all internal checkers to this format as well.
>> 
>> The main thing that this doesn't handle is that -analyzer-checker-help doesn't know what packages a plugin might have, so it just uses the basename of the plugin you're loading. The next step would be to look for a clang_checkerPackages_PLUGIN array that contains the names of any packages in PLUGIN.dylib. (Or maybe libPLUGIN.dylib.)
>> 
>> Where should I be going with this?
>> Jordy
>> 
>> <CheckerPlugins.3.patch>
>