[cfe-dev] Question about custom Clang static analyzer checker registration in clang-tidy without using plugin

Fri Mar 16 14:31:34 PDT 2018

Friendly ping on this.

----- Original Message -----
From: <likan_999.student at sina.com>
To: "Artem Dergachev" <noqnoqneo at gmail.com>, "Alexander Kornienko" <alexfh at google.com>,
Cc: "George Karpenkov" <ekarpenkov at apple.com>, "clang developer list" <cfe-dev at lists.llvm.org>,
Subject: Re: Re: [cfe-dev] Question about custom Clang static analyzer checker registration in clang-tidy without using plugin
Date: 2018-03-15 05:24

For the George's question on how we did it for clang-tidy. We make use of ClangTidyModuleRegistry: https://github.com/llvm-mirror/clang-tools-extra/blob/master/clang-tidy/ClangTidyModuleRegistry.h
Basically we have a separate static library that registers our internal clang-tidy checkers with ClangTidyModuleRegistry (which essentially creates a global variable whose constructor does the registration work). In this way we can simply modify our build rules to link clang-tidy with this library, without modifying the clang source tree.
For Artem's remarks, since there are a bunch of places in Clang that make use the register mechanism, can we say #2 might not be super against Clang's coding style/conventions, if we use the Clang builtin registration mechanism? I feel this is the best solution from our side. We can use the same mechanism that is used for clang-tidy to register checkers for CSA, without modifying the Clang source tree. 
For approach #1, there are two things that needs to be solved before we can go with this approach. First is how to pass in additional checkers (or checker factories) without breaking existing API. By saying existing API I mean ento::CreateAnalysisConsumer, ClangTidyASTConsumerFactory, runClangTidy, etc.  Second, even if we come up with some API that allows passing in checkers, we can't do it without modifying Clang source tree. We still need to modify clang-tidy source files to actually pass in our custom checkers.
For approach #3, I agree this is not the best approach and it doesn't work on all platforms.
What do you think?
----- Original Message -----
From: Artem Dergachev <noqnoqneo at gmail.com>
To: Alexander Kornienko <alexfh at google.com>, likan_999.student at sina.com
Cc: George Karpenkov <ekarpenkov at apple.com>, clang developer list <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] Question about custom Clang static analyzer checker registration in clang-tidy without using plugin
Date: 2018-03-15 03:32
All right, so approach #1 by Kan with passing arguments through 
AnalysisConsumer seems to have no problems at all, as long as there's an 
example somewhere in the /examples/ directory that anybody could use and 
see if it works. As far as i understand, it doesn't cause any globals to 
be introduced whatsoever, and keeps all the logic on your side or in 
clang-tidy. Not sure if this boils down to moving the globals to 
clang-tidy. I also understand that it'd be quite some coding, but 
hopefully it'd be good code :)
Approach #3 with weak functions may also work, and you can totally come 
up with an infrastructure on your private library side to make it 
possible to add checkers from multiple places within the library. At the 
same time it would prevent multiple private libraries from being used 
simultaneously (weak function being the global thingy) and i'm not sure 
if it works on all OSes.
Approach #2 with a global sort-of-vector of checkers to check out when 
filling the checker registry is indeed the scariest - it might "just 
work" now, but nobody knows what happens when the number of singletons 
starts growing, so i kinda understand why they don't like to see this 
sort of stuff in clang, even if it's purely religious i guess we'd 
rather not.
On 14/03/2018 1:19 AM, Alexander Kornienko wrote:
> Let me try to resolve the confusion I might have caused. Kan has come 
> up with a static analyzer checker that addresses incorrect use of an 
> internal API (so it doesn't make sense upstream). We can't use 
> dynamically loaded plugins for a number of reasons, so we'd like to 
> have a way to use statically-linked static analyzer checkers with 
> minimal modification to Clang sources (ideally, one-two lines or just 
> linking in a library that would register the checker in a constructor 
> of a global object). I wasn't aware of something like that being 
> possible in the static analyzer, so I've suggested that Kan asks this 
> here.
>
> On Tue, Mar 13, 2018 at 9:46 PM via cfe-dev <cfe-dev at lists.llvm.org 
> <mailto:cfe-dev at lists.llvm.org>> wrote:
>
>     Thanks Artem for explaining.
>
>     What happened is, I got it work using the plugin system. On the
>     other hand, the clang-tidy team in our company (not Clang's
>     clang-tidy team) wants me to integrate into clang-tidy, and they
>     are not happy about the plugin approach (maybe based on reasons
>     George listed). They want a mechanism registering checkers that
>     are out of Clang source tree, but still statically linked, instead
>     of being dynamically loaded as plugin. I can try asking them to
>     reply to this thread.
>
>     ----- Original Message -----
>     From: Artem Dergachev <noqnoqneo at gmail.com
>     <mailto:noqnoqneo at gmail.com>>
>     To: likan_999.student at sina.com
>     <mailto:likan_999.student at sina.com>, George Karpenkov
>     <ekarpenkov at apple.com <mailto:ekarpenkov at apple.com>>
>     Cc: cfe-dev <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>>
>     Subject: Re: [cfe-dev] Question about custom Clang static analyzer
>     checker registration in clang-tidy without using plugin
>     Date: 2018-03-14 03:07
>
>
>     I still don't understand your approach. There's the current plugin
>     system that auto-registers checkers (the trick with "-cc1 -load"
>     demonstrated by examples/analyzer-plugin/MainCallChecker.cpp) and
>     there's the normal way of writing a checker in clang source tree
>     (like
>     all default checkers are written) that semi-automatically (through
>     the
>     registerChecker<>() call) puts the checker into the checker
>     registry and
>     makes it automatically accessible to clang-tidy when it links to the
>     updated checkers library.
>     Because none of these approaches seem to have any problems with
>     checker
>     registration, could you describe what exactly are you doing that
>     causes
>     problems with checker registration?
>     On 13/03/2018 11:11 AM, via cfe-dev wrote:
>     > Hi George,
>     >
>     > Thanks for the fast response. I totally agree with your point of
>     the disadvantages of using plugins. My question was exactly how to
>     avoid using plugins. Maybe I had some misunderstanding, does the
>     "plugin" you talked about not only include dynamic libraries, but
>     also include registering checkers in programs that uses clang as a
>     library?
>     >
>     > For the 2 solutions you propsed, they are good suggestions, but
>     unfortunately they won't work in our specific situation. The
>     checker I am adding is very company specific, and involves
>     specific class in company's code base; I believe it is impossible
>     to make it public, because it will reveal company's internal
>     codebase; also it is useless to public because the situation it
>     tries to analyze is very ad-hoc, company specific, thus won't
>     apply to other people's code at all.
>     >
>     > As for rebasing against upstream, that's not feasible as well
>     because this involves maintenance cost. I am at a large company
>     where I don't have control or any influence on the process it sync
>     third-party libraries with upstream, neither does the clang-tidy
>     team of our company. That's why they suggest me doing it upstream.
>     >
>     > The 3 approaches in my original question was in fact about
>     modifying upstream to expose the checker registration process, so
>     that not only us, but other people could also benefit from it. If
>     there is such an API, then in a program that uses clang as a
>     library, they can call the exposed API to register their own
>     checker, before using AnalysisConsumer to do a static analysis.
>     >
>     > Neither of the 3 approaches I proposed uses plugin architecture.
>     It is just caller program who instantiates checker (or some form
>     of checker factory), then hands it over to Clang, therefore there
>     is no need to worry about how Clang can find the libraries.
>     >
>     > I am more then happy if there are other ideas and I can
>     contribute my manpower to coming up with some patches.
>     >
>     > Best,
>     > Kan Li
>     >
>     > ----- Original Message -----
>     > From: George Karpenkov <ekarpenkov at apple.com
>     <mailto:ekarpenkov at apple.com>>
>     > To: Li Kan <likan_999.student at sina.com
>     <mailto:likan_999.student at sina.com>>
>     > Cc: cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     > Subject: Re: [cfe-dev] Question about custom Clang static
>     analyzer checker registration in clang-tidy without using plugin
>     > Date: 2018-03-14 01:05
>     >
>     >
>     > Hi Li,
>     > I haven’t used plugins myself, but I can see two solutions
>     side-stepping the problem.
>     > 1) Could you upstream your checker and put it into “alpha”, and
>     then turn it on via flag?
>     > 2) If (1) is impossible [though upstreaming into “alpha” should
>     not be too hard], you could fork clang and add your changes there,
>     > and then (hopefully) regularly rebase vs. trunk.
>     > The problem with the plugin infrastructure is that Clang does
>     not guarantee a stable API.
>     > Even though things would probably work, there’s no guarantee
>     they won’t break with a new release,
>     > and if you use plugins you would get all the associated
>     disadvantages (no straightforward injection, have to care about
>     library being found, etc)
>     > without the usual advantages (plugin might unexpectedly stop
>     working).
>     > Though if you absolutely must use plugins, I think it would be
>     possible to come up with an API for doing so.
>     > George
>     >> On Mar 12, 2018, at 6:20 PM, Li Kan via cfe-dev
>     <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>     >>
>     >> Hi,
>     >>
>     >> I have a question about how to register custom checkers without
>     using a plugin. The background is, I have been working on a
>     checker and I was able to come up with a solution to integrate
>     into our code review system by using AnalysisConsumer and load
>     custom checkers using plugins. However, the team of our company
>     who works on clang-tidy integration requests me to do that in
>     clang-tidy, instead of a separate set of things.
>     >>
>     >> The way clang-tidy is used in the code review system is to
>     create ClangTidyASTConsumerFactory and then create a
>     ClangTidyASTConsumer, which is later used to parse the codebase.
>     Therefore the goal here is to find a way to let the users of clang
>     libraries, especially ClangTidyASTConsumer{,Factory}, to let
>     AnalysisConsumer pickup custom checkers statically linked into the
>     program;
>     >>
>     >> Currently the checker registration is very deep in the stack,
>     the only places it looks for checkers are a set of builtin
>     checkers and plugins. There are a few directions I can think of
>     >> 1) Add some arguments to AnalysisConsumer constructor and pass
>     it down to createCheckerManager, and add logic there to look for
>     checkers passed in; however, this is problematic because all the
>     users of AnalysisConsumer, such as ClangTidyASTConsumer, needs to
>     expose the arguments as well;
>     >> 2) Have some static variable that stores the list of checkers
>     factories, and expose a function that allows user to add checkers
>     factories to it. Later the ClangCheckerRegistry picks up the
>     checkers in the list. This is the ordinary way, but the existence
>     of a static variable looks ugly and not thread-safe.
>     >> 3) Like registerBuiltinCheckers, add a weak function
>     registerCustomCheckers, initially empty, and client code is
>     allowed to override it, filled in with their custom checkers. In
>     ClangCheckerRegistry it invokes the function, just after it
>     invokes the registerBuiltinCheckers. This is the least intrusive
>     way but the drawback is if there are multiple places in the system
>     that needs to add a checker, it is not very convenient.
>     >>
>     >> What's the best approach, or is there other better approach? I
>     am more than happy to help and send out some patch to add some
>     mechanism. I want to hear the opinions from Clang developers and
>     get blessed by the direction, before I actually start on it, to
>     avoid any arguments.
>     >>
>     >>
>     >> _______________________________________________
>     >> cfe-dev mailing list
>     >> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>     > _______________________________________________
>     > cfe-dev mailing list
>     > cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>     _______________________________________________
>     cfe-dev mailing list
>     cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>