[cfe-dev] Two pass analysis framework: AST merging approach
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Wed May 4 23:27:50 PDT 2016
> Can you explain what the "two pass analysis" does?
> Ex: Does it loop through each of the TU second time and
> “inline” every call from other TUs? In which order are the other
> TUs loaded? In which order the call sites are processed? Do you
> repeat until no change? Did you measure coverage in some way?
> Did you perform path-sensitive checks? (The time of analysis of 4X
> seems much lower than what I would expect, given that we now
> explore much deeper paths and the analyzer has exponential
> running time.)
On first pass we get a bunch of -emit-ast dumps, on second pass we go
ahead and analyze each translation unit old-style, but whenever we find
an inter-unit CallEvent during path-sensitive analysis, we import the
section of the AST dump containing the function body and all dependent
sections, and inline the call. The inlined call may later trigger more
imports if there are inter-unit calls we'd end up wanting to model.
Yeah, benchmarking is a bit more difficult than that. I think Alexey has
some complicated numbers. I guess the slowdown is only-4x on
path-sensitive checks simply because there are too many drops due to
-analyzer-config max-nodes= limit. The most practical measurement would
probably be to increase limits until the number of reports stops
growing. It's also possible to count number of limit drops, number of
exploded nodes constructed, number of bug reports with and without
unification, we did some of this but not all.
________
My best idea on reducing AST loads is to relax "typedness" requirements
on SVal hierarchy. For example, if an inter-unit function references an
inter-unit static global variable, this variable can probably be
represented as some kind of "untyped VarRegion" (let's call this class
"XTUVarRegion", and inherit it from SubRegion directly, rather than from
TypedValueRegion), and then its type (which may be a complicated class
or template-instantiation declaration) doesn't need to be imported. The
XTUVarRegion should still be uniquely determined by the variable - we
need to know that two different functions imported from that translation
unit refer to the same variable. Not sure - maybe some MemSpace magic
may be employed to control invalidation, maybe we could use separate
memory spaces for different translation units.
More information about the cfe-dev
mailing list