[PATCH] Implement a sane plugin API for clang

Jordan Rose jordan_rose at apple.com
Fri May 3 16:18:44 PDT 2013


Finally getting the chance to look at this. I think I'm convinced that it would be nice to have something cleaner and more flexible than just -load. I'm not going to get into the politics of it, though. 

>> We really ought to have a preliminary discussion on cfe-dev to map out the directions we want to take the plugin API, and what *compelling*, *enabling* changes we can make to it.

While I mostly like what you've come up with, Sean has a point that the community hasn't really picked it up. I have your e-mail from a year ago still flagged ("[cfe-dev] Thinking about clang plugins"). No one responded to it then, but I think it was a good summary of the problems you were/are trying to solve. I would say to bring that up again...except most likely nothing will happen until after the 3.3 release (and for us Apple folks, after WWDC).


> I have often heard that the static analyzer in Clang should be remodeled as a plugin instead of its current approach, but I'm not exactly sure what it was meant by that.

I'm at least part of this, particularly before joining Apple (where it's not so interesting because our version of the compiler is stripped). The idea was that there'd be less coupling between the analyzer and the compiler, potentially reducing build dependencies and enabling you to rebuild one without the other during development. It would also keep the size of the monolithic clang binary down.

But, we're not likely to do it any time soon, even if plugins do get easier and more accessible. It doesn't fit our current distribution model. Renato's "just don't build the analyzer" patch is probably good enough for most people.


Anyway, comments:

> +Clang Plugins run a set of callbacks over code. These callbacks are set up by
> +the plugin when the compiler calls a function exported by the plugin. The main
> +function exported by the plugin is clang_plugin_begin_tu, which is called before
> +any file is compiled.

s/any/each/, I think. Is there a separate callback that is guaranteed to only be called once, or at least once per CompilerInstance?


> +  void clang_plugin_on_tu(const plugin::PluginInfo &Info,
> +                          llvm::StringRef FileName,
> +                          const CompilerInstance &CI,
> +                          plugin::TUCallbacks &Callbacks) {

No need for "llvm::"


> -    #include "clang/Frontend/FrontendPluginRegistry.h"
> +    #include "clang/Basic/Plugin.h"
> +    #include "clang/Basic/Version.h"
>      #include "clang/AST/ASTConsumer.h"
>      #include "clang/AST/AST.h"
> -    #include "clang/Frontend/CompilerInstance.h"
>      #include "llvm/Support/raw_ostream.h"

We generally try to keep this alphabetical, even though Basic is more basic than AST.


>  Running the plugin
>  ==================
> + asdf*

Huh? ;-)


> These options do not require the use of the :option:`-cc1` or :option:`-Xclang` options to work properly.


I realize this is a nice advantage over the old-style plugins, but starting from scratch it's not interesting.

Also, 80 columns? (here and elsewhere)


> +For example, to run the ``print-function-names`` plugin over a source file named``example.cpp``

Missing space before "``example.cpp``"


> -  $ clang++ -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS \
> -            -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D_GNU_SOURCE \
> -            -I$BD/tools/clang/include -Itools/clang/include -I$BD/include -Iinclude \
> -            tools/clang/tools/clang-check/ClangCheck.cpp -fsyntax-only \
> -            -Xclang -load -Xclang $BD/lib/PrintFunctionNames.so -Xclang \
> -            -plugin -Xclang print-fns
> +  $ clang++ -fplugin=$BD/lib/PrintFunctionsNames.so example.cpp

Oh gosh...more than half the cruft is just because the old tutorial wanted to use Clang itself as the content.


> +//===--- Plugin.h -----------------------------------------------*- C++ -*_===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//

This file could use at least a basic \file comment below the license.


> +/// A struct containing sufficient information to express ABI compatibility for
> +/// plugins. It is sufficient for plugins to use the
> +/// DECLARE_PLUGIN_COMPATIBILITY macro to find compatibility.
> +struct Version {
> +  /// The major version of clang (corresponds to CLANG_VERSION_MAJOR)
> +  unsigned Major;
> +  /// The minor version of clang (corresponds to CLANG_VERSION_MINOR)
> +  unsigned Minor;
> +};

VersionTuple is a bit overkill here, but do we really need another version struct?


> +  llvm::ArrayRef<std::pair<std::string, std::string> > Arguments;

Yuck at this type, but okay. You should just include "clang/Basic/LLVM.h" to drop the "llvm::", though.


> All non-null callbacks will be deleted by the compiler internally.

"On return, the compiler takes ownership of any non-null callbacks."

...although I'm not sure this is a good idea. Why not reuse the same callbacks over multiple TUs?


> +/// \arg Info      The invocation environment of the plugin.
> +/// \arg FileName  The name of the translation unit being compiled.
> +/// \arg CI        The compiler instance, which contains information about how
> +///                the file is being compiled and other utility classes.
> +/// \arg Callbacks The callbacks that this plugin will define for this file.

The correct Doxygen command is \param. \arg is just \li.


> +  /// \param ErrorMsg [out] The reason why plugin loading failed.

\param[out] (no space) will do the right thing for Doxygen.


> +    /// Arguments to the plugin
> +    SmallVector<std::pair<std::string, std::string>, 4> Args;

Should this just be a std::vector? I think the most common case will be 0 args, in which case SmallVector doesn't do much good.


> +  if (!Library.isValid()) {
> +    return false;
> +  }

Spurious braces.


> +  if (!Compatibility) {
> +    ErrorMsg = "Could not find symbol clang_plugin_compatibility";
> +    return false;
> +  } else if (Compatibility->Major != CLANG_VERSION_MAJOR ||
> +             Compatibility->Minor != CLANG_VERSION_MINOR) {
> +    ErrorMsg = "Plugin is not compatible with this version of clang";
> +    return false;
> +  }

Just make this two separate if-statements, I think. LLVM likes early returns.


> +  for (SmallVector<Plugin, 4>::iterator it = Plugins.begin();
> +       it != Plugins.end(); ++it) {

Please don't call end() every time through the loop. Also, even if you do stick to SmallVector, you can use SmallVectorImpl as your base type to avoid specifying the length everywhere.


> +  // it to us as -fplugin-<full fused string>, so we need to split these up as
> +  // appropriate.

-fplugin-arg-<full fused string>

And...

> +  // Parse the plugin arguments. These are of the form
> +  // -fplugin-<plugin name>-<arg name>[=<value>]. The command line parser gives
> +  // it to us as -fplugin-<full fused string>, so we need to split these up as
> +  // appropriate.
> +  SmallVector<std::pair<std::string, std::string>, 4> ParsedPluginArgs;
> +  for (arg_iterator it = Args.filtered_begin(OPT_fplugin_arg_),
> +         end = Args.filtered_end(); it != end; ++it) {
> +    // Split the argument into <name>-<arg>
> +    StringRef FusedArg = (*it)->getValue();
> +    size_t Boundary = FusedArg.find('-');
> +    if (Boundary == std::string::npos) {
> +      Diags.Report(diag::err_drv_unknown_argument) << FusedArg;
> +      continue;
> +    }
> +    ParsedPluginArgs.push_back(std::make_pair(FusedArg.substr(0, Boundary),
> +                                              FusedArg.substr(Boundary + 1)));
> +  }
> +
> +  // Add the plugin arguments to a ParsedPluginArgs
> +  for (int I = 0, E = AllPlugins.size(); I != E; ++I) {
> +    StringRef PluginPath = AllPlugins[I];
> +    Opts.CompilerPlugins[I].Plugin = PluginPath;
> +    StringRef BaseName = llvm::sys::path::stem(PluginPath);
> +    // Collect all of the plugins that correspond to this plugin.
> +    for (SmallVector<std::pair<std::string, std::string>, 4>::iterator
> +           PluginArg = ParsedPluginArgs.begin(), End = ParsedPluginArgs.end();
> +         PluginArg != End; ++PluginArg) {
> +      if (PluginArg->first == BaseName) {
> +        // ArgAssign is in the form "name=value" or just plain "name", now we
> +        // need to break it up into the form name, value pair.
> +        StringRef ArgAssign = PluginArg->second;
> +        size_t Boundary = ArgAssign.find("=");
> +        // Treat no '=' as an arg with no value
> +        if (Boundary == std::string::npos)
> +          Opts.CompilerPlugins[I].Args.push_back(std::make_pair(ArgAssign, ""));
> +        else
> +          Opts.CompilerPlugins[I].Args.push_back(std::make_pair(
> +            ArgAssign.substr(0, Boundary), ArgAssign.substr(Boundary + 1)));
> +      }
> +    }
> +  }

...any reason not to do this with a map, to avoid N iterations over the arguments?

Jordan

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130503/dd61ef18/attachment.html>


More information about the cfe-commits mailing list