[cfe-dev] [analyzer][tooling] Analyzer architecture

Wed Apr 29 01:15:22 PDT 2020

Hi!

In order not to overburden the previous discussion about Analyzer and Tooling, I would like to ask for your opinions on a related but slightly orthogonal matter.
Gabor and I had a brainstorming session about the issues that CTU analysis and compilation command handling (the previous topic) brought up recently.
Note that these points are to be regarded as cursory expeditions into the hypothetical (at best).

The train of thought regarding CTU analysis had the following outline:

  *   We need a tool that takes a `FunctionDecl` (the function which we would like to inline) and returns the AST of the TU that defines it.
     *   the fitting abstraction level of the result seems to be the TU level
     *   `externalDefMapping.txt` is just an implementation detail; we don't actually need it.
  *   Let's call this tool `ASTServer`.
  *   ASTServer has some resemblance to `clangd`.
     *   Works on the whole project
     *   Uses compilation DB
     *   Persists already parsed ASTs in its memory (up to a limit)
        *   (Cache eviction strategies? LRU?)
  *   The AST would be returned over a socket, in a serialized form (ASTReader/Writer).
     *   could also work over the network, promoting distribution
  *   We need another tool: `clang-analyzer` !!!
     *   Actually we should have done this earlier
     *   Utilizes clang for analysis purposes
     *   Handles communication with `ASTServer`
        *   Caches ASTs from the server
  *   An external orchestrator tool (CodeChecker) would launch ASTServer and then call the clang-analyzer tool for each TU, thus conducting the analysis.
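
To make the ASTServer idea a bit more concrete, here is a minimal Python sketch of the lookup and caching behaviour outlined above. Everything in it is an assumption made for illustration: the names `ASTCache`, `ASTServer`, `ast_for_function` and the `load_ast` callback are hypothetical, the function-to-TU index is a plain dictionary, a "serialized AST" is just a byte string, and LRU is only one of the possible eviction strategies. The socket transport is left out entirely.

```python
from collections import OrderedDict
from typing import Callable, Dict, List


class ASTCache:
    """LRU cache of serialized ASTs keyed by TU path (hypothetical sketch)."""

    def __init__(self, load_ast: Callable[[str], bytes], max_entries: int) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        # Hypothetical callback: parses the TU and returns its serialized AST.
        self._load_ast = load_ast
        self._max_entries = max_entries
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, tu_path: str) -> bytes:
        if tu_path in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(tu_path)
            return self._entries[tu_path]
        # Cache miss: parse the TU, then evict the least recently used entry
        # if the limit is exceeded.
        ast = self._load_ast(tu_path)
        self._entries[tu_path] = ast
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
        return ast

    def cached_paths(self) -> List[str]:
        """TU paths currently held, least recently used first."""
        return list(self._entries)


class ASTServer:
    """Answers 'which AST defines this function?' for the whole project."""

    def __init__(self, definitions: Dict[str, str], cache: ASTCache) -> None:
        # Assumed index: function identifier -> path of the defining TU.
        self._definitions = definitions
        self._cache = cache

    def ast_for_function(self, function_id: str) -> bytes:
        # Raises KeyError if no definition is known for the function.
        tu_path = self._definitions[function_id]
        return self._cache.get(tu_path)
```

In a real tool the loader would parse the TU using its compilation DB entry, and the limit would more likely be expressed in memory used than in number of entries.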

The reasoning behind the separation:
The analyzer is a complex subsystem of Clang. The valid concern about the clang binary growing out of proportion, and the increasing need for
tooling dependencies surfacing due to CTU analysis, indicate the need to reorganize these facilities.
The point is further backed by the argument that complex functionality such as interprocess communication (over sockets in our example)
is even less desirable inside the clang binary than binary size bloat.
Also, the complexity of the whole solution could be distributed, and the concerns of build system management and build configuration formats
could be separated from the analyzer itself (while still allowing a wide variety of build-system vs. analysis cooperation schemes to be implemented).
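
As a rough illustration of how the build-configuration handling could live in the orchestrator rather than in the analyzer, the following Python sketch only plans the command lines an orchestrator might run. It assumes the usual JSON compilation database layout (a list of entries with a `file` field). The function name `plan_analysis` and every command-line flag shown (`--compilation-database`, `--listen`, `--ast-server`) are made up for this example; neither tool exists with such an interface.

```python
import json
from typing import List


def plan_analysis(compile_commands_json: str, db_path: str, address: str) -> List[List[str]]:
    """Return the command lines an orchestrator would run, in order.

    All tool names and flags below are hypothetical placeholders.
    """
    entries = json.loads(compile_commands_json)
    # A TU may appear more than once in the compilation DB; analyze it once,
    # keeping the order in which the files first appear.
    tu_files = list(dict.fromkeys(entry["file"] for entry in entries))

    # First launch the server for the whole project...
    commands = [["ASTServer", "--compilation-database", db_path, "--listen", address]]
    # ...then one analyzer invocation per TU, each pointed at the server.
    for tu in tu_files:
        commands.append(["clang-analyzer", "--ast-server", address, tu])
    return commands
```

Only the orchestrator reads the compilation DB here; the analyzer invocation receives just a TU and the server address, which is the separation argued for above.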

Again, the scope of these ideas is not trivial to assess, and implementing them would probably require a considerable amount of effort,
but I hope an open discussion would outline a solution that benefits the structure of the whole project.

Cheers,
Endre