[cfe-dev] Static Analyzer: LiveVariables and SymbolReaper

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Mon Oct 30 02:56:32 PDT 2017


Wanted to add to that, just in case, that our liveness analysis is 
sensitive to the current path. Like, the syntactic variable dies when 
it's no longer referenced anywhere below this point in the CFG (which is 
not path-sensitive), but the *symbol* that was representing the value of 
this variable (such as *the* file descriptor in your example) may still 
live in other variables on some paths, and it wouldn't be reported dead 
until all these variables die.

By definition, *symbols* die when they cannot be extracted from the 
current ProgramState through any events that happen on subsequent 
analysis. Most commonly, this means that dead symbols aren't mentioned 
in the ProgramState at all, but there may be exceptions when the 
information in the ProgramState is redundant (eg. LazyCompoundVal may 
mention a bunch of dead symbols which cannot be extracted from it 
through any program code, simply because we were too lazy to clean this up).

Additionally, dead symbol doesn't automatically mean a leak, because the 
checker needs to additionally track "escape" events, when a symbol 
disappears from the analyzer's point of view, but may still be tracked 
by other parts of the program that the analyzer doesn't know anything 
about (eg. passed as an argument to a function we know nothing about).

On 10/30/17 10:31 AM, Alexander Shaposhnikov wrote:
> I see,
> many thanks for the explanation,
> yeah, it makes a lot of sense.
>
> On Sun, Oct 29, 2017 at 9:56 PM, Anna Zaks <ganna at apple.com 
> <mailto:ganna at apple.com>> wrote:
>
>
>
>     Sent from my iPhone
>
>     On Oct 29, 2017, at 5:19 PM, Alexander Shaposhnikov
>     <alexander.v.shaposhnikov at gmail.com
>     <mailto:alexander.v.shaposhnikov at gmail.com>> wrote:
>
>>     ohh, typo above: i meant
>>          bool isLive(const Stmt *S, const VarDecl *D)
>>     (instead of LiveVariables::isLive(const Stmt *Loc, const Stmt *S))
>>
>>     On Sun, Oct 29, 2017 at 5:08 PM, Alexander Shaposhnikov
>>     <alexander.v.shaposhnikov at gmail.com
>>     <mailto:alexander.v.shaposhnikov at gmail.com>> wrote:
>>
>>         Hi,
>>         (long read, sorry in advance)
>>         i'm looking into the internals of SymbolReaper and
>>         LiveVariables and have
>>         several questions on my mind. Since I have not built a
>>         consistent understanding of what's going on there yet +
>>         assume I might be missing smth, I'd like to start with some
>>         very basic example to show how the (potential) issue
>>         manifests itself there. Any explanations / suggestions /
>>         comments would be greatly appreciated.
>>
>>         Example s.c:
>>         typedef struct _IO_FILE FILE;
>>         extern FILE *fopen(const char *path, const char *mode);
>>
>>         void g() {};
>>         void h() {};
>>
>>         void f(int c) {
>>           FILE *p = fopen("foo.c", "r");
>>           g();
>>           h();
>>           return;
>>         }
>>         clang -cc1 -analyze -analyzer-checker=debug.ViewCFG s.c
>>         says that there 3 blocks: B2(entry) -> B1 -> B0(exit), B2 and
>>         B0 are essentially empty and all the interesting things are
>>         inside B1.
>>         Now let's try to run clang -c --analyze -Xanalyzer
>>         -analyzer-checker=alpha.unix.Stream s.c
>>         <https://maps.google.com/?q=s.c+s.c:8:9&entry=gmail&source=g>
>>
>>         s.c:8:9
>>         <https://maps.google.com/?q=s.c+s.c:8:9&entry=gmail&source=g>:
>>         warning: Value stored to 'p' during its initialization is
>>         never read
>>           FILE *p = fopen("foo.c", "r");
>>                 ^   ~~~~~~~~~~~~~~~~~~~
>>         s.c:9:3: warning: Opened File never closed. Potential
>>         Resource leakv
>>           g();
>>           ^~~
>>         2 warnings generated.
>>
>>         *What looks suspicious here - the location of the warning*
>>         "s.c:9:3: warning: Opened File never closed. Potential
>>         Resource leak"**
>>         *- it points to g() **instead of the end of function or
>>         return statement.*
>>
>
>     The analyzer is trying to report the leak at the *earliest* point.
>     It uses live variable analysis for that:
>     https://en.m.wikipedia.org/wiki/Live_variable_analysis
>     <https://en.m.wikipedia.org/wiki/Live_variable_analysis>
>
>
>>         Now let's run the analyzer under lldb and set the breakpoint at
>>         LiveVariables::isLive(const Stmt *Loc, const Stmt *S)
>>
>>            178 bool LiveVariables::isLive(const Stmt *Loc, const Stmt
>>         *S) {
>>
>>         (lldb) p D->dump()
>>         VarDecl 0x11286c178 <s.c:8:3, col:31> col:9 p 'FILE *' cinit
>>         `-CallExpr 0x11286c350 <col:13, col:31> 'FILE *'
>>           |-ImplicitCastExpr 0x11286c338 <col:13> 'FILE *(*)(const
>>         char *, const char *)' <FunctionToPointerDecay>
>>           | `-DeclRefExpr 0x11286c1d8 <col:13> 'FILE *(const char *,
>>         const char *)' Function 0x11286bce0 'fopen' 'FILE *(const
>>         char *, const char *)'
>>           |-ImplicitCastExpr 0x11286c3a0 <col:19> 'const char *'
>>         <BitCast>
>>           | `-ImplicitCastExpr 0x11286c388 <col:19> 'char *'
>>         <ArrayToPointerDecay>
>>           |   `-StringLiteral 0x11286c238 <col:19> 'char [6]' lvalue
>>         "foo.c"
>>           `-ImplicitCastExpr 0x11286c3d0 <col:28> 'const char *'
>>         <BitCast>
>>             `-ImplicitCastExpr 0x11286c3b8 <col:28> 'char *'
>>         <ArrayToPointerDecay>
>>               `-StringLiteral 0x11286c2a8 <col:28> 'char [2]' lvalue "r"
>>
>>         (lldb) p getImpl(impl).stmtsToLiveness[S].isLive(D)
>>         (bool) $0 = false
>>
>>         (lldb) p S->dump()
>>         CallExpr 0x11286c470 'void'
>>         `-ImplicitCastExpr 0x11286c458 'void (*)()'
>>         <FunctionToPointerDecay>
>>           `-DeclRefExpr 0x11286c400 'void ()' Function 0x11286be18
>>         'g' 'void ()'
>>
>>         (lldb) fr sel 7
>>         frame #7: 0x0000000106ed4397
>>         clang-6.0`clang::ento::ExprEngine::ProcessStmt(this=0x00007fff5fbf97c0,
>>         S=const clang::CFGStmt @ 0x00007fff5fbf8f50,
>>         Pred=0x0000000114004a30) at ExprEngine.cpp:497
>>
>>         Here getImpl(impl).stmtsToLiveness[S].isLive(D) returns false and
>>         Static Analyzer thinks that the variable "p" is dead and will
>>         remove the binding (from the store). Since this variable is
>>         marked as dead,
>>         the corresponding checkers are called
>>         (in ExprEngine::removeDead
>>         getCheckerManager().runCheckersForDeadSymbols ...) and the
>>         stream checker reports a leak.
>>         In fact, if i'm not mistaken, at this point the variable "p"
>>         is not dead yet and the leak should be reported much later at
>>         the end of the function.
>>
>>         The problem does not appear to be specific to StreamChecker,
>>         after playing with several other simple examples, it looks
>>         like LiveVariables::isLive returns false in most cases
>>         (and so does SymbolReaper::isLive(const VarRegion *VR, bool
>>         includeStoreBindings) ).
>>         This results in loosing information about variables or
>>         emitting diagnostics at the wrong locations, for example, see
>>         FIXMEs at
>>         llvm/tools/clang/test/Analysis/unions.cpp +96,
>>         llvm/tools/clang/test/Analysis/self-assign.cpp +41,
>>         llvm/tools/clang/test/Analysis/pr22954.c in "int f19(int i)
>>         {...}", llvm/tools/clang/test/Analysis/pr22954.c in "int
>>         f31() {...}" etc.
>>
>>         So the question here - if CSA works as expected for this example
>>         (in particular, I'm wondering if
>>         LiveVariables::computeLiveness
>>         <https://clang.llvm.org/doxygen/classclang_1_1LiveVariables.html#a00e0b2d4b0298e098cab0f88cbd91a93>
>>
>>         does the right thing or, alternatively, if it's used properly).
>>
>>         P.S. After playing a little bit more with different toy code
>>         snippets and dumping the full state
>>         of LiveVariables (everything: stmtsToLiveness,
>>         blocksEndToLiveness etc) it looks like in most cases it's
>>         almost empty.
>>         P.P.S. The following experiment might be of some interest: if
>>         one switches the method
>>         SymbolReaper::isLive to
>>         (VarContext == CurrentContext)  ||
>>         VarContext->isParentOf(CurrentContext) || isAlwaysAlive(....)
>>         the behavior for some very simple tests (i was looking at the
>>         toy tests with 3-4 blocks in CFG) becomes more predictable
>>         (and a bunch of FIXMEs get corrected),
>>         but this, of course, would not be the correct implementation
>>         in the general case.
>>
>>         Kind regards,
>>         Alexander Shaposhnikov
>>
>>
>




More information about the cfe-dev mailing list