[cfe-commits] [PATCH] Implement a better plugin API for clang, v3

Joshua Cranmer pidgeot18 at gmail.com
Mon Jul 16 19:50:45 PDT 2012


On 7/16/2012 7:25 PM, Sean Silva wrote:
> Sorry this is rather long.
>
> First off, are you assuming that people who write plugins are "clang
> hackers" (e.g., they are definitely on the mailing list, probably have
> committed to clang, etc.)? I think this is too stringent. I think
> plugins would make a good "first brush" with clang. With the addition
> of the AST matchers, it is pretty reasonable to be able to do useful
> stuff without being a "clang hacker". In fact, the tooling docs
> <http://clang.llvm.org/docs/Tooling.html> basically flat-out say that
> plugins are targeted at users who are almost by definition *not* clang
> hackers; e.g. one of the "canonical examples" it gives is "special
> lint-style warnings or errors for **your** project [emphasis
> mine]"---the assumption is that the clients of plugins are primarily
> hackers on some other project. Hence, I think the plugin API should be
> targeted at being able to be useful with just a proverbial `sudo
> apt-get install clang`; if this is technically possible, then it
> should be done, as it increases the usefulness/audience of clang
> plugins by _at least_ an order of magnitude.

I definitely want plugins to be on the easier side to implement. I also 
don't want the perfect to be the enemy of the good here--I've been 
working on versions of this patch for almost a year.  I think using AST 
matchers is desirable here, but remember that I wrote this patch before 
they were committed. This API is useful enough to be committed on its 
own (IMO) without adding extra plugin-AST matcher goodies.

Just apt-get install clang won't be enough IMO--note that Debian uses 
gcc-plugin-dev for GCC's plugin API headers (and clang-dev for Clang).
> +  // It is dangerous to use plugins with versions that it wasn't compiled for.
> +  // This causes us to bail out before we can do things like crash clang.
> +  if (version->major != CLANG_VERSION_MAJOR ||
> +      version->minor != CLANG_VERSION_MINOR)
> +    return false;
>
> Maybe this checking can be turned upside down, where the plugin
> exports the version it is compiled for, and clang does the checking.
> That would allow for nice clang-style diagnostics and would guarantee
> that the check is always performed. Regardless, there has to be a way
> to guarantee that a diagnostic is emitted instead of just silently not
> working.

The plugin loading code emits a diagnostic if loading the plugin 
(including failing the version check) fails.
> At a different level, this check doesn't seem meaningful since the
> plugin is presumably going to use other APIs inside clang, and hence
> the interfaces are only stable within a single revision number. Maybe
> this is intended as just a "sanity check"? If so, it should be
> documented as such. I think it would be wise to check against the
> exact revision number in order to guarantee that the plugin won't
> spontaneously crash or misbehave.

I copied the version stuff from GCC's plugin API. There definitely needs 
to be a check to guard against hard ABI incompatibility. The advantage 
of having the plugin do the check itself is that there might be some 
valid use cases where a single plugin is ABI-compatible with multiple 
versions of Clang, though I agree that (for now, at least) these are 
very marginal use cases and probably not worth supporting.
> +passed with -fplugin-arg-<plugin-name>-<arg>=<value> or
> +-fplugin-arg-<plugin-name>-<arg>. These options work the same when run
>
> I don't think you have yet defined what exactly the "plugin name" is.
> Without that, this description is not too meaningful.
>
> +-fplugin-arg-<plugin-name>-<arg>. These options work the same when run
> +both with the normal driver and the cc1 process.</p>
>
> I suspect that a lot of people who might want to write plugins aren't
> aware of the distinction between the `driver` and the `cc1 process`.
> This seems irrelevant to the new plugin API anyways since it "just
> works" with the driver (does anybody ever actually directly invoke
> cc1?). Maybe note it as a comparison with the previous API.

The reworded text now looks like:
<p>To run a plugin, you merely need to specify the plugin to be loaded 
using the
-fplugin command line options. Additional parameters for the plugins can be
passed with -fplugin-arg-<plugin-name>-<arg>=<value> or
-fplugin-arg-<plugin-name>-<arg>. The plugin name is the basename 
of the
file, so for a plugin whose library is <tt>libplugin.so</tt>, the name 
to use
would be <tt>libplugin</tt>. These options do not require the use of the
<tt>-cc1</tt> or <tt>-Xclang</tt> options to work properly.</p>

>     $ clang++ -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS \
>           -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D_GNU_SOURCE \
>           -I$BD/tools/clang/include -Itools/clang/include -I$BD/include
> -Iinclude \
>           tools/clang/tools/clang-check/ClangCheck.cpp -fsyntax-only \
>
> I think the need for this complexity seems to be that you are running
> it over one of the source file in the LLVM/Clang tree.
> Maybe you could just use Foo.cpp? Or show a simple file to them (e.g.
> "put this into a file Foo.cpp"). Regardless, I don't think that a
> developer should have to have LLVM/Clang checked out in order to write
> or use plugins, so there shouldn't be a reference to the LLVM/Clang
> source tree in the documentation.

I was aiming for minimal modification of the clang plugins directory 
here. It's very easy to discuss running it over a hypothetical source 
file named "example.cpp"
>
> + * Plugin arguments are specified as -fplugin-arg-<plugin>-<name>=<value> or
> + * -fplugin-arg-<plugin>-<name> attributes. In the latter case, the value
> + * attribute would be an empty string.
>
> I don't think it is entirely clear from context what <plugin> and
> <name> should be here.
The name/value are reusing the names of the fields in the struct.

>
> +/**
> + * A struct representing the version of clang.
> + *
> + * This version is different from other accessors that clang exposes
> in order to
> + * retain ABI compatibility.
> + */
> +struct Version {
> +  /// The major version of clang (corresponds to CLANG_VERSION_MAJOR)
> +  unsigned major;
> +  /// The minor version of clang (corresponds to CLANG_VERSION_MINOR)
> +  unsigned minor;
> +  /// The patch level version of clang, if present. This value shouldn't affect
> +  /// API or ABI-level compatibility with clang for most uses. This corresponds
> +  /// to CLANG_VERSION_PATCHLEVEL if it is defined.
> +  unsigned subminor;
> +};
>
> what is the relationship between this and `include/clang/VersionTuple.h`?

I wanted to make sure that the first function called was ABI/API-frozen. 
Reusing VersionTuple.h exposes plugins to the risk that developers 
decide to modify the VersionTuple ABI and API.
> +// The following functions are not implemented within clang itself, but are
> +// prototypes of the various functions that may be implemented by plugins.
> +// Including it in a header file allows these hooks to be documented with
> +// doxygen, and the use of extern "C" on declarations allows people who forget
> +// to define these functions with extern "C" to still have them work.
>
> Maybe this should somehow show up in the doxygen documentation?
> Particularly "The following functions are not implemented within clang
> itself"; things like this that violate users assumptions ("if it's in
> the doxygen docs, then it is implemented by clang") need to be clearly
> marked as such.
It's documenting a group of functions; I'm not entirely sure how to do 
this in doxygen.
>
> +/// Before this function is called, the callbacks object will have all of its
>
> Here you assume that the reader knows that "callbacks object" is
> `PluginFileCallbacks`; is this a valid assumption? I had to "figure it
> out". Save the reader the hassle and just say "the passed-in
> PluginFileCallbacks object".

I originally had the argument name of callbacks lower-case; I forgot to 
change the documentation after renormalizing the capitalization.

Now for the API comments. To put it simply, I'm still struggling to come 
up with a decent API. One problem I discovered after writing the patch 
is that the design of the driver architecture means that if your command 
line is:
$ clang a.c b.c
the plugin will be loaded twice, and the start/end file methods only 
called once per "global" instantiation, instead of being initialized 
once and having the start/end methods be called once. On the other hand, 
I think there are good use cases for letting plugins modify in 
particular the linker invocation and possibly add passes in the LTO 
plugin, which means that the same plugin might be loaded by three 
distinct processes. This is definitely counterintuitive to how most 
people would expect things to work if they don't realize that the 
command line actually invokes several processes, but I'm not sure the 
best way to remedy the situation. Ripping out the begin file/end file 
methods might be for the best.

I've also found that growing the monolithic PluginCallbacks struct seems 
to work best in the long run. One feature that I think is necessary for 
plugins is to require plugins to only have to do things for the 
callbacks they want to use: if we assume that projects will be building 
plugins to run on their codebase, then we need some moderate API 
stability, since tying projects to a single version of Clang is not 
exactly tenable. For that reason, I'm disinclined to use a class with 
virtual methods.

Thoughts?

-- 
Joshua Cranmer
News submodule owner
DXR coauthor




More information about the cfe-commits mailing list