[cfe-dev] [analyzer][RFC] Get info from the LLVM IR for precision

Gábor Márton via cfe-dev cfe-dev at lists.llvm.org
Fri Aug 14 04:19:21 PDT 2020


John, thank you for your reply.

> Is this really the most reasonable way to get the information you want?
Here is a list of information we would like to have access to (the list is
not exhaustive; Artem could probably extend it):
1) Is a function pure?
2) Does a function read/write only the memory pointed to by its arguments?
3) Does a callee make any copies of the pointer argument that outlive the
callee itself?
4) Value ranges.
5) Is a loop dead?
6) Is a parameter or return pointer dereferenceable?
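
To make items 1-3 concrete, here is a minimal sketch. All function and
variable names are made up for illustration; none of them come from the
patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical examples for items 1-3; the names are illustrative only.

// (1) Pure: the result depends only on the argument and no memory is
// read or written.
int square(int x) { return x * x; }

// (2) Accesses only the memory pointed to by its pointer argument.
void fill(int *buf, std::size_t n, int value) {
  assert(buf != nullptr || n == 0);
  for (std::size_t i = 0; i < n; ++i)
    buf[i] = value;
}

// (3) Makes a copy of the pointer argument that outlives the call.
int *saved = nullptr;
void remember(int *p) { saved = p; }

int counter = 0;

int caller() {
  counter = 1;
  int local[2] = {0, 0};
  fill(local, 2, square(3));
  // If the analysis knows that fill() touches only memory reachable
  // through 'buf', it can keep the fact 'counter == 1' after the call
  // instead of invalidating every global.
  return counter + local[0];
}
```

Without this knowledge, a conservatively evaluated call to fill() would
force the analyzer to forget the value of 'counter' as well.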

How could we use this information?
With 1-3 we could make the analysis more precise by improving the
over-approximation done by invalidation during conservative evaluation.
Using the info from 1-4 we could create "summaries" for functions and
skip the inlining-based evaluation of them. This would be really
beneficial in the case of cross-translation-unit analysis, where the
inlining stack can grow really deep.
With 5, we could skip the analysis of dead loops and thus save the
budget for the symbolic execution in CSA.
By using 6, we could eliminate some false-positive reports and thus
improve correctness.
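
How 1 and 2 could narrow invalidation can be sketched with a toy model.
This is not the analyzer's real data structure or API; every type and
function name below is hypothetical, and item 3 (captured pointers) is not
modeled.

```cpp
#include <map>
#include <string>

// Toy model only: these types are hypothetical, not CSA or LLVM classes.
struct FunctionSummary {
  bool WritesMemory = true;           // false: the call cannot change memory
  bool OnlyAccessesArgMemory = false; // true: only argument pointees touched
};

enum class Region { Global, PointeeOfArg };

struct Binding {
  Region Kind;
  int Value;
};

// Known facts before the call: region name -> binding.
using Store = std::map<std::string, Binding>;

// Returns the facts that survive a call that is not inlined. Without any
// summary information (the defaults above) everything is dropped.
Store invalidate(const Store &Before, const FunctionSummary &S) {
  Store After;
  for (const auto &Entry : Before) {
    bool MayBeModified;
    if (!S.WritesMemory)
      MayBeModified = false;
    else if (S.OnlyAccessesArgMemory)
      MayBeModified = Entry.second.Kind == Region::PointeeOfArg;
    else
      MayBeModified = true;
    if (!MayBeModified)
      After.emplace(Entry.first, Entry.second);
  }
  return After;
}
```

With the default summary nothing survives the call; with
OnlyAccessesArgMemory set, bindings for globals are kept; with
WritesMemory cleared, the whole store is kept.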

Some of the analyses that provide the needed information, for example
value range propagation, can be implemented properly only on SSA form.
We could write our own lowering to SSA, or our own implementation of
alias analysis for the purity info, but that would repeat work that has
already been done and is well tested in LLVM.

> It’s also pretty expensive.
I completely agree that we should not pay for optimization passes whose
results we cannot use in the CSA. In the first version of the patch I
used the whole O2 pipeline, but I have since updated it to run only the
passes that are needed to get the purity information (GlobalsAA and
PostOrderFunctionAttrs).
Also, static analysis is generally considered to be slower than
compilation, even with optimizations enabled. We even state this on our
official webpage (here <https://clang-analyzer.llvm.org/>). And this
extension will never be more expensive than a regular O2/O3 compilation.
Since it adds at most the cost of one such compilation, a 2-4x slowdown of
CSA relative to an O2 compilation could become a 3-5x slowdown. In CTU
mode the analysis is currently even slower, so the additional CodeGen
would be less noticeable. The slowdown may not be affordable for some
clients, so users must explicitly request CodeGen in CSA via a
command-line switch. I plan to measure the slowdown on open-source
projects and report precise results. On top of that, it would be
interesting to see how often we can actually get the desired information,
as a fraction of all functions (all call sites, all loops).

Gabor.



On Fri, Aug 14, 2020 at 6:46 AM John McCall <rjmccall at apple.com> wrote:

> On 13 Aug 2020, at 10:15, Gábor Márton wrote:
> > Artem, John,
> >
> > How should we proceed with this?
> >
> > John, you mention in the patch that this is a huge architectural
> > change. Could you please elaborate? Are you concerned about the
> > additional libs that are being linked to the static analyzer
> > libraries? The clang binary is already dependent on LLVM libs and on
> > the CodeGen and CSA is builtin to the clang binary. Are you concerned
> > about having a MultiplexConsumer as an ASTConsumer? ... I am open to
> > any suggestions, but I need more input from you.
>
> Well, it’s adding a major new dependency to the static analyzer and a
> major new client to IRGen.  In both cases, the dependency/client happens
> to be another part of Clang, but still, it seems like a huge deal for
> static analysis to start depending on potentially arbitrary details of
> code generation and LLVM optimization.  It’s also pretty expensive.
> Is this really the most reasonable way to get the information you want?
>
> John.
>