[PATCH] Implement a sane plugin API for clang

Fri Mar 15 22:14:35 PDT 2013

On Sat, Mar 16, 2013 at 12:32 AM, Joshua Cranmer <pidgeot18 at gmail.com>wrote:

>  On 3/15/2013 11:01 PM, Sean Silva wrote:
>
> The clang driver already forks at least one process per compiler
> invocation. Your comments apply equally to that and I don't see anybody
> running to fix them (or even complaining), so I'm not convinced that this
> is really as significant of an issue as you make it out to be.
>
>
> If I were to add a command in my Makefiles to spawn a $(shell), the
> reviewers would throw a hissy fit. I'll also point out that Clang is not a
> widely-used compilers on Windows systems, where this kind of stuff matters
> more.
>
>
I'm not sure how I got derailed, but for the plugins there is no "wrapper
script", just `clang-plugin-config myPlugin.so arg1 arg2` once at configure
time. (the wrapper script would be for harvesting the compile commands, but
that's a separate discussion and I'm not sure how it got brought into this
one).

>
>
>>  Also, you have to find more binaries to run it: if I specify CXX via a
>> path, how should a build system know where to run clang-plugin-config from?
>> You could guess by looking up the dirname of CXX and hoping it's there,
>>
>
>  I'm not sure I follow your point here. I image clang-plugin-config and
> the wrapper to be installed next to clang and be looked up/executed as
> usual.
>
>  but you are also advocating using shell scripts to represent CXX in
>> another email, which renders this approach impossible.
>>
>
>  I also don't see the connection with my suggestion in the other email.
> In fact, the wrapper script for plugins and the compile_commands.json
> harvester could probably be the same script, and at configure time
> clang-plugin-config (or, perhaps better just `clang-config` now that it is
> going beyond plugins) would arrange for the wrapper script to perform the
> requested actions.
>
>
> The logic I currently use to look up llvm-config for building the plugin
> is as follows:
> if test -z "$CXX"; then
>   CXX=`which clang++`
> fi
> if test -z "$LLVMCONFIG"; then
>   LLVMCONFIG=`which llvm-config`
> fi
> if test -z "$LLVMCONFIG"; then
>   LLVMCONFIG=`dirname $CXX`/llvm-config
> fi
>
> The ideas is to try to make this "just work" if the compiler to be used is
> clang. However, if CXX is a shell script and clang is not specifically in
> PATH (the latter case is not an esoteric situation--it's how our own
> builders get to clang), then the value returned is wrong. It's also wrong
> if people start using clang with versioning numbers: consider clang
> symlinked to a clang-3.2, but you're building with clang-3.3. Looking up
> llvm-config in the path would find the llvm-config for 3.2 here instead of
> 3.3, which would be wrong. IMHO, gcc's -print-file-name=plugin is much
> better (you don't need to guess at the locations of other tools!).
>

Sorry I was confusing myself earlier. As I said before the
"clang-plugin-config" runs once at configure time. Let's keep the
discussion of the pros/cons of the wrapper script in the other thread.

     If you really want to immediately push plugins forward in a big way,
>> it would be monumental to set up a buildbot that runs a clang plugin that
>> does extra checking that isn't really appropriate for being integrated as a
>> diagnostic into the compiler proper. For example, a plugin that warns on
>> incorrect uses of dyn_cast<>. For maximum effect this should be developed
>> in-tree (probably in clang-tools-extra. Even though it has "tools" in the
>> name, I don't think anybody would be opposed to developing plugins in
>> there). It should also have an easy way for people in our community to come
>> up with and implement good extra checks and get them integrated into that
>> buildbot.
>>
>>
>>  I am working on adding a compiler static checker plugin to Mozilla that
>> would check the guarantees our old dehydra plugin used to check: a "must
>> override" annotation (all subclasses must provide their own implementation
>> of this method), a "stack class" annotation (this class cannot be allocated
>> from the heap), and a warning that gets emitted every time you emit a
>> static initializer.
>>
>>
>  Awesome. Please keep us up to date with this work. Some of these checks
> seem like they could be relevant to llvm/clang too.
>
>
> The biggest stumbling block to implementing useful checkers is the
> inability to add custom annotations... annotate(string) is currently being
> used as a hack, but what is really needed is the ability to specify custom
> C++11 attributes. Actually committing the static checker can be found in
> <https://bugzilla.mozilla.org/show_bug.cgi?id=767563><https://bugzilla.mozilla.org/show_bug.cgi?id=767563>,
> but there is a long list of desired analyses here
> <https://bugzilla.mozilla.org/buglist.cgi?list_id=6034150&resolution=---&query_format=advanced&component=Rewriting%20and%20Analysis&product=Core><https://bugzilla.mozilla.org/buglist.cgi?list_id=6034150&resolution=---&query_format=advanced&component=Rewriting%20and%20Analysis&product=Core>
> .
>
>
>
>>    The changes in this patch retain almost all of the same functionality
>>> as the original plugin approach (including the ability to do things like
>>> add custom compile passes via the RegisterPass statics) while wrapping it
>>> in a much saner wrapper.
>>>
>>>
>>  My opposition to the current patch is that it does not provide enough
>> value to our users to compensate for the inconvenience that it will cause
>> them (by breaking their code). My opposition is not technical; I don't
>> doubt that your approach here is an improvement from a purely technical
>> standpoint.
>>
>>
>>  The current plugin approach presumes that it is a pure consumer of the
>> AST, which isn't a viable option in my opinion. One thing I would like to
>> do in the future is be able to map Decls in the AST to functions emitted in
>> the LLVM IR, which is completely impossible under the current architecture.
>> Note also that I'm not removing the current (more or less broken) plugin
>> architecture, so I'm not compelling people to switch.
>>
>
>  You did delete the only code (PrintFunctionNames) in tree that AFAIK
> tests the previous functionality, which I interpreted as meaning that it
> was dead to you.
>
>
> The old API I consider deprecated, but deprecated does not mean imminent
> removal. Also, the examples directory isn't built by default, so I doubt
> it's actually really being tested.
>
>
>
>> Rather, this is about enabling future changes that permit plugins to not
>> take the view that they happen independently of code generation.
>>
>
> This did not get through to me from the OP. Could you explain how the
> design you implement in this patch achieves that? It should be the emphasis
> of the review (and IMHO warrants a "does this direction and implementation
> approach sound good to everyone" cfe-dev discussion before proposing code
> to be committed).
>
>
> If you pay careful attention, you'd notice that the plugins are kept
> around in the CompilerInstance object, which is passed around to all the
> AST actions, including the CodeGen AST action; the old plugin API stores
> everything as separate AST actions and instead multiplexes all the AST
> actions together, so the CodeGen AST action is unaware of the existence of
> plugins, short of creating Yet Another Static Initializer attachment point.
>
>
Is there any way we could fuse the old functionality into the new
functionality?

>
>   Also, the command line parsing stuff should be in a separate patch, and
> IMO the -fplugin should be just a driver arg: that way, the previous
> commandline args for plugins (directly via cc1) remains in a live code path.
>
>
> That isn't feasible here, as the two plugin loading paths are actually
> doing rather different things, and I don't think it is desirable to attempt
> to merge what are conceptually different models of plugins.
>
>
>   As I said earlier, the compatibilty stuff also deserves a rehash, since
> I'm still not convinced that it is really useful.
>
> The primary purpose of compatibility checking is to detect a situation
> that would almost certainly lead to crash instead of crashing. Users
> deserve to get useful error messages instead of panicking crash dumps.
>

It does nothing against subtle memory corruption, which is the real issue.
A crash is a *best-case scenario* and is not a priority to protect against
IMO. Blatant crashes are easy to catch and remedy. Regardless, it is
deceptive to have something like DECLARE_PLUGIN_COMPATIBILITY() which still
permits silent corruption to happen, even though the user thinks they are
"protected".

The only satisfactory solution that I can think of is to have a special
configure option that generates and embeds a unique ID (sha1sum of all the
headers?) for a specific build which guarantees compatibility (it could be
called ENABLE_PLUGINS, with the option ENABLE_UNSAFE_PLUGINS available to
imitate the current plugin situation). ENABLE_PLUGINS not be on by default
(presumably it would have some impact on build time), but we could document
that when preparing binary packages of clang that clang should be built
with this flag.

-- Sean Silva
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130316/e6344fd3/attachment.html>