[cfe-dev] [PATCH] Implement a sane plugin API for clang

Tue Mar 19 10:55:38 PDT 2013

On 3/16/2013 12:14 AM, Sean Silva wrote:
>
> On Sat, Mar 16, 2013 at 12:32 AM, Joshua Cranmer <pidgeot18 at gmail.com 
> <mailto:pidgeot18 at gmail.com>> wrote:
>
>     On 3/15/2013 11:01 PM, Sean Silva wrote:
>>     The clang driver already forks at least one process per compiler
>>     invocation. Your comments apply equally to that and I don't see
>>     anybody running to fix them (or even complaining), so I'm not
>>     convinced that this is really as significant of an issue as you
>>     make it out to be.
>
>     If I were to add a command in my Makefiles to spawn a $(shell),
>     the reviewers would throw a hissy fit. I'll also point out that
>     Clang is not a widely-used compilers on Windows systems, where
>     this kind of stuff matters more.
>
>
> I'm not sure how I got derailed, but for the plugins there is no 
> "wrapper script", just `clang-plugin-config myPlugin.so arg1 arg2` 
> once at configure time. (the wrapper script would be for harvesting 
> the compile commands, but that's a separate discussion and I'm not 
> sure how it got brought into this one).
>
>>         Also, you have to find more binaries to run it: if I specify
>>         CXX via a path, how should a build system know where to run
>>         clang-plugin-config from? You could guess by looking up the
>>         dirname of CXX and hoping it's there,
>>
>>
>>     I'm not sure I follow your point here. I image
>>     clang-plugin-config and the wrapper to be installed next to clang
>>     and be looked up/executed as usual.
>>
>>         but you are also advocating using shell scripts to represent
>>         CXX in another email, which renders this approach impossible.
>>
>>
>>     I also don't see the connection with my suggestion in the other
>>     email. In fact, the wrapper script for plugins and the
>>     compile_commands.json harvester could probably be the same
>>     script, and at configure time clang-plugin-config (or, perhaps
>>     better just `clang-config` now that it is going beyond plugins)
>>     would arrange for the wrapper script to perform the requested
>>     actions.
>
>     The logic I currently use to look up llvm-config for building the
>     plugin is as follows:
>     if test -z "$CXX"; then
>       CXX=`which clang++`
>     fi
>     if test -z "$LLVMCONFIG"; then
>       LLVMCONFIG=`which llvm-config`
>     fi
>     if test -z "$LLVMCONFIG"; then
>       LLVMCONFIG=`dirname $CXX`/llvm-config
>     fi
>
>     The ideas is to try to make this "just work" if the compiler to be
>     used is clang. However, if CXX is a shell script and clang is not
>     specifically in PATH (the latter case is not an esoteric
>     situation--it's how our own builders get to clang), then the value
>     returned is wrong. It's also wrong if people start using clang
>     with versioning numbers: consider clang symlinked to a clang-3.2,
>     but you're building with clang-3.3. Looking up llvm-config in the
>     path would find the llvm-config for 3.2 here instead of 3.3, which
>     would be wrong. IMHO, gcc's -print-file-name=plugin is much better
>     (you don't need to guess at the locations of other tools!).
>
>
> Sorry I was confusing myself earlier. As I said before the 
> "clang-plugin-config" runs once at configure time. Let's keep the 
> discussion of the pros/cons of the wrapper script in the other thread.
>
>>>         If you really want to immediately push plugins forward in a
>>>         big way, it would be monumental to set up a buildbot that
>>>         runs a clang plugin that does extra checking that isn't
>>>         really appropriate for being integrated as a diagnostic into
>>>         the compiler proper. For example, a plugin that warns on
>>>         incorrect uses of dyn_cast<>. For maximum effect this should
>>>         be developed in-tree (probably in clang-tools-extra. Even
>>>         though it has "tools" in the name, I don't think anybody
>>>         would be opposed to developing plugins in there). It should
>>>         also have an easy way for people in our community to come up
>>>         with and implement good extra checks and get them integrated
>>>         into that buildbot.
>>
>>         I am working on adding a compiler static checker plugin to
>>         Mozilla that would check the guarantees our old dehydra
>>         plugin used to check: a "must override" annotation (all
>>         subclasses must provide their own implementation of this
>>         method), a "stack class" annotation (this class cannot be
>>         allocated from the heap), and a warning that gets emitted
>>         every time you emit a static initializer.
>>
>>
>>     Awesome. Please keep us up to date with this work. Some of these
>>     checks seem like they could be relevant to llvm/clang too.
>
>     The biggest stumbling block to implementing useful checkers is the
>     inability to add custom annotations... annotate(string) is
>     currently being used as a hack, but what is really needed is the
>     ability to specify custom C++11 attributes. Actually committing
>     the static checker can be found in
>     <https://bugzilla.mozilla.org/show_bug.cgi?id=767563>
>     <https://bugzilla.mozilla.org/show_bug.cgi?id=767563>, but there
>     is a long list of desired analyses here
>     <https://bugzilla.mozilla.org/buglist.cgi?list_id=6034150&resolution=---&query_format=advanced&component=Rewriting%20and%20Analysis&product=Core>
>     <https://bugzilla.mozilla.org/buglist.cgi?list_id=6034150&resolution=---&query_format=advanced&component=Rewriting%20and%20Analysis&product=Core>.
>
>
>>>             The changes in this patch retain almost all of the same
>>>             functionality as the original plugin approach (including
>>>             the ability to do things like add custom compile passes
>>>             via the RegisterPass statics) while wrapping it in a
>>>             much saner wrapper.
>>>
>>>
>>>         My opposition to the current patch is that it does not
>>>         provide enough value to our users to compensate for the
>>>         inconvenience that it will cause them (by breaking their
>>>         code). My opposition is not technical; I don't doubt that
>>>         your approach here is an improvement from a purely technical
>>>         standpoint.
>>
>>         The current plugin approach presumes that it is a pure
>>         consumer of the AST, which isn't a viable option in my
>>         opinion. One thing I would like to do in the future is be
>>         able to map Decls in the AST to functions emitted in the LLVM
>>         IR, which is completely impossible under the current
>>         architecture. Note also that I'm not removing the current
>>         (more or less broken) plugin architecture, so I'm not
>>         compelling people to switch.
>>
>>
>>     You did delete the only code (PrintFunctionNames) in tree that
>>     AFAIK tests the previous functionality, which I interpreted as
>>     meaning that it was dead to you.
>
>     The old API I consider deprecated, but deprecated does not mean
>     imminent removal. Also, the examples directory isn't built by
>     default, so I doubt it's actually really being tested.
>
>>         Rather, this is about enabling future changes that permit
>>         plugins to not take the view that they happen independently
>>         of code generation.
>>
>>     This did not get through to me from the OP. Could you explain how
>>     the design you implement in this patch achieves that? It should
>>     be the emphasis of the review (and IMHO warrants a "does this
>>     direction and implementation approach sound good to everyone"
>>     cfe-dev discussion before proposing code to be committed).
>
>     If you pay careful attention, you'd notice that the plugins are
>     kept around in the CompilerInstance object, which is passed around
>     to all the AST actions, including the CodeGen AST action; the old
>     plugin API stores everything as separate AST actions and instead
>     multiplexes all the AST actions together, so the CodeGen AST
>     action is unaware of the existence of plugins, short of creating
>     Yet Another Static Initializer attachment point.
>
>
> Is there any way we could fuse the old functionality into the new 
> functionality?

The present entry point is a fundamentally different registration 
functionality from the new one, both in terms of how a plugin specifies 
the entry point and in terms of how clang is invoked to load the 
plugins--it's not possible to run them the same way and keep the same 
semantics. It is possible to make #if-guarded wrappers to switch between 
the two APIs for the plugin code in the common case however, but it's 
not something that can automatically be done.

>
>>     Also, the command line parsing stuff should be in a separate
>>     patch, and IMO the -fplugin should be just a driver arg: that
>>     way, the previous commandline args for plugins (directly via cc1)
>>     remains in a live code path.
>
>     That isn't feasible here, as the two plugin loading paths are
>     actually doing rather different things, and I don't think it is
>     desirable to attempt to merge what are conceptually different
>     models of plugins.
>
>
>>     As I said earlier, the compatibilty stuff also deserves a rehash,
>>     since I'm still not convinced that it is really useful.
>     The primary purpose of compatibility checking is to detect a
>     situation that would almost certainly lead to crash instead of
>     crashing. Users deserve to get useful error messages instead of
>     panicking crash dumps.
>
>
> It does nothing against subtle memory corruption, which is the real 
> issue. A crash is a *best-case scenario* and is not a priority to 
> protect against IMO. Blatant crashes are easy to catch and remedy. 
> Regardless, it is deceptive to have something like 
> DECLARE_PLUGIN_COMPATIBILITY() which still permits silent corruption 
> to happen, even though the user thinks they are "protected".
>
> The only satisfactory solution that I can think of is to have a 
> special configure option that generates and embeds a unique ID 
> (sha1sum of all the headers?) for a specific build which guarantees 
> compatibility (it could be called ENABLE_PLUGINS, with the option 
> ENABLE_UNSAFE_PLUGINS available to imitate the current plugin 
> situation). ENABLE_PLUGINS not be on by default (presumably it would 
> have some impact on build time), but we could document that when 
> preparing binary packages of clang that clang should be built with 
> this flag.

That isn't a perfect solution: it does nothing against guaranteeing, 
e.g., that clang and the plugin are using ABI-compatible standard 
libraries. But if we hold new APIs hostage to perfect solutions, then 
nothing will get added. What I want to protect against is what I believe 
would be the most common mistake: someone will use a heuristic to figure 
out which version of clang to compile against, and that heuristic will 
be wrong. In such a scenario, the only case the current compatibility 
would fail is if someone has a revision of clang installed to their path 
but is using a different revision of clang (not in the path) from the 
same branch to build--at which point I am happy to say "you're on the 
bleeding edge, so you should expect to bleed every now and then".

-- 
Joshua Cranmer
News submodule owner
DXR coauthor

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130319/179a8651/attachment.html>