[cfe-commits] [PATCH] Implement a better plugin API for clang, v3

Sean Silva silvas at purdue.edu
Mon Jul 16 16:25:02 PDT 2012


Sorry this is rather long.

First off, are you assuming that people who write plugins are "clang
hackers" (e.g., they are definitely on the mailing list, probably have
committed to clang, etc.)? I think this is too stringent. I think
plugins would make a good "first brush" with clang. With the addition
of the AST matchers, it is pretty reasonable to be able to do useful
stuff without being a "clang hacker". In fact, the tooling docs
<http://clang.llvm.org/docs/Tooling.html> basically flat-out say that
plugins are targeted at users who are almost by definition *not* clang
hackers; e.g. one of the "canonical examples" it gives is "special
lint-style warnings or errors for **your** project [emphasis
mine]"---the assumption is that the clients of plugins are primarily
hackers on some other project. Hence, I think the plugin API should be
targeted at being able to be useful with just a proverbial `sudo
apt-get install clang`; if this is technically possible, then it
should be done, as it increases the usefulness/audience of clang
plugins by _at least_ an order of magnitude.

+use of these hooks, compilers can affect or observer the course of compilation

s/observer/observe/

+  // It is dangerous to use plugins with versions that it wasn't compiled for.
+  // This causes us to bail out before we can do things like crash clang.
+  if (version->major != CLANG_VERSION_MAJOR ||
+      version->minor != CLANG_VERSION_MINOR)
+    return false;

Maybe this checking can be turned upside down, where the plugin
exports the version it is compiled for, and clang does the checking.
That would allow for nice clang-style diagnostics and would guarantee
that the check is always performed. Regardless, there has to be a way
to guarantee that a diagnostic is emitted instead of just silently not
working.

At a different level, this check doesn't seem meaningful since the
plugin is presumably going to use other APIs inside clang, and hence
the interfaces are only stable within a single revision number. Maybe
this is intended as just a "sanity check"? If so, it should be
documented as such. I think it would be wise to check against the
exact revision number in order to guarantee that the plugin won't
spontaneously crash or misbehave.

+passed with -fplugin-arg-<plugin-name>-<arg>=<value> or
+-fplugin-arg-<plugin-name>-<arg>. These options work the same when run

I don't think you have yet defined what exactly the "plugin name" is.
Without that, this description is not too meaningful.

+-fplugin-arg-<plugin-name>-<arg>. These options work the same when run
+both with the normal driver and the cc1 process.</p>

I suspect that a lot of people who might want to write plugins aren't
aware of the distinction between the `driver` and the `cc1 process`.
This seems irrelevant to the new plugin API anyways since it "just
works" with the driver (does anybody ever actually directly invoke
cc1?). Maybe note it as a comparison with the previous API.

+void clang_plugin_begin_file(llvm::StringRef fileName,
+    const CompilerInstance &CI, plugin::PluginFileCallbacks *callbacks) {

This signature disagrees with the one in the header file
"PluginFileCallbacks &".

   $ clang++ -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS \
         -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D_GNU_SOURCE \
         -I$BD/tools/clang/include -Itools/clang/include -I$BD/include
-Iinclude \
         tools/clang/tools/clang-check/ClangCheck.cpp -fsyntax-only \

I think the need for this complexity seems to be that you are running
it over one of the source file in the LLVM/Clang tree.
Maybe you could just use Foo.cpp? Or show a simple file to them (e.g.
"put this into a file Foo.cpp"). Regardless, I don't think that a
developer should have to have LLVM/Clang checked out in order to write
or use plugins, so there shouldn't be a reference to the LLVM/Clang
source tree in the documentation.

+ * Plugin arguments are specified as -fplugin-arg-<plugin>-<name>=<value> or
+ * -fplugin-arg-<plugin>-<name> attributes. In the latter case, the value
+ * attribute would be an empty string.

I don't think it is entirely clear from context what <plugin> and
<name> should be here.

+/**
+ * A struct representing the version of clang.
+ *
+ * This version is different from other accessors that clang exposes
in order to
+ * retain ABI compatibility.
+ */
+struct Version {
+  /// The major version of clang (corresponds to CLANG_VERSION_MAJOR)
+  unsigned major;
+  /// The minor version of clang (corresponds to CLANG_VERSION_MINOR)
+  unsigned minor;
+  /// The patch level version of clang, if present. This value shouldn't affect
+  /// API or ABI-level compatibility with clang for most uses. This corresponds
+  /// to CLANG_VERSION_PATCHLEVEL if it is defined.
+  unsigned subminor;
+};

what is the relationship between this and `include/clang/VersionTuple.h`?

+// The following functions are not implemented within clang itself, but are
+// prototypes of the various functions that may be implemented by plugins.
+// Including it in a header file allows these hooks to be documented with
+// doxygen, and the use of extern "C" on declarations allows people who forget
+// to define these functions with extern "C" to still have them work.

Maybe this should somehow show up in the doxygen documentation?
Particularly "The following functions are not implemented within clang
itself"; things like this that violate users assumptions ("if it's in
the doxygen docs, then it is implemented by clang") need to be clearly
marked as such.

+/// Before this function is called, the callbacks object will have all of its

Here you assume that the reader knows that "callbacks object" is
`PluginFileCallbacks`; is this a valid assumption? I had to "figure it
out". Save the reader the hassle and just say "the passed-in
PluginFileCallbacks object".

+/// \arg fileName  The name of the function being called.
+/// \arg CI        The compiler instance, which contains information about how
+///                the file is being compiled and other utility classes.
+/// \arg callbacks The callbacks that this plugin will define for this file.

The capitalization is different from the names in the parameter list.

+/// Helper macro to call a given function in a plugin:
+/// CALL_PLUGIN_FUNCTION(plugin, clang_plugin_some_func, (arg1, arg2))
+#define CALL_PLUGIN_FUNCTION(plugin, fname, Args) \
+  do { \
+    CompilerInstance::Plugin::fname Func_ = \
+      (CompilerInstance::Plugin::fname)(plugin)->getPluginFunction(#fname); \
+    if (Func_) \
+      Func_ Args; \
+  } while (0)

This is kind of nasty. Why not make the plugin interface be just a
single function that returns a "struct clang_plugin_ops" with one
function pointer member for each of the `clang_plugin_*` functions?
Then you only need to get one symbol from the dynamic library, which
you can do upon loading so that you don't have to expose the dynamic
library stuff to the rest of the code. Also, this would be a good
place to embed the version information that the plugin is compiled
against (with regards to what I mentioned earlier about Clang doing
the version check).

Would returning a pointer to a class with a bunch of virtual functions
rather than a `struct clang_plugin_ops` be more comfortable for users?

+extern bool clang_plugin_init(int Argc, clang::plugin::PluginArg *Argv,
+                              clang::plugin::Version *Version);

There doesn't appear to be a way for this to communicate the
information it gets from the arguments to the rest of the plugin
except via a global. Is this intentional? If so, please document that
"the intended way for this function to communicate with the rest of
the plugin is through global variables".

+  /// Contains the data needed for a single plugin.
+  class Plugin {
+    /// The library that contains the plugin itself
+    mutable llvm::sys::DynamicLibrary PluginLibrary;
+
+    /// The structure that contains all of the callbacks for the library
+    mutable plugin::PluginFileCallbacks Callbacks;
+  public:
+    Plugin(llvm::sys::DynamicLibrary library) : PluginLibrary(library),
+                                                Callbacks() {}
+    plugin::PluginFileCallbacks &getCallbacks() const { return Callbacks; }
+    intptr_t getPluginFunction(const char *name) const {
+      return (intptr_t)PluginLibrary.getAddressOfSymbol(name);
+    }
+
+/// Helper macro to call a given function in a plugin:
+/// CALL_PLUGIN_FUNCTION(plugin, clang_plugin_some_func, (arg1, arg2))
+#define CALL_PLUGIN_FUNCTION(plugin, fname, Args) \
+  do { \
+    CompilerInstance::Plugin::fname Func_ = \
+      (CompilerInstance::Plugin::fname)(plugin)->getPluginFunction(#fname); \
+    if (Func_) \
+      Func_ Args; \
+  } while (0)
+
+  // These typedefs must have the same signatures as the ones in
+  // clang/Basic/Plugin.h, including the same name (it makes the helper macro
+  // above work properly).
+  typedef void (*clang_plugin_begin_file)(StringRef, const CompilerInstance &,
+                                          plugin::PluginFileCallbacks &);
+  typedef void (*clang_plugin_end_file)();
+  typedef void (*clang_plugin_destroy)();
+  };

This is awkward and forces a bunch of implementation knowledge to be
scattered all over the place. For example everywhere you have these
loops like
+  for (ArrayRef<CompilerInstance::Plugin>::iterator it =
CI.getPlugins().begin();
+       it != CI.getPlugins().end(); ++it)
+    CALL_PLUGIN_FUNCTION(it, clang_plugin_end_file, ());
There should be a single PluginManager class (or similar) that has one
public member function for each of the logical places that the
CompilerInstance interacts with plugins, and CompilerInstance holds
one of these as a member (say, `Plugins`). Then each of these places
turns into, for example, `Plugins.PerformEndOfFileCallbacks()`, and
then the PluginManager takes care of multiplexing the point of
interaction across all the loaded plugins. A ton of stuff which now
lives in headers can then be moved into PluginManager.cpp, including
all the details of how plugins are loaded and the symbol interface of
the dynamic libraries (currently, `clang_plugin_*` functions). Also,
PluginManager.h/PluginManager.cpp will be a good place to document how
to extend the plugin system to allow a richer interaction with clang
(an important piece of documentation which is completely absent from
your current patch).

That's all for now, can't wait to see v4. This new plugin API is going
to be sweet! Thanks for your hard work!

--Sean Silva

On Mon, Jul 16, 2012 at 10:40 AM, Joshua Cranmer <Pidgeot18 at gmail.com> wrote:
> On 7/5/2012 4:07 PM, Joshua Cranmer wrote:
>> Implement a better plugin API for clang, v3
>>
>> In this version, I've moved from using a wrapper action (which emulated the old
>> plugin model more closely) to installing plugins directly to the compiler
>> instance, which is better for adding future extension hooks, like custom
>> attributes.
>
>
> Review ping ?
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits



More information about the cfe-commits mailing list