<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><br><br><div id="AppleMailSignature">Sent from my iPhone</div><div><br>On Oct 29, 2017, at 5:19 PM, Alexander Shaposhnikov <<a href="mailto:alexander.v.shaposhnikov@gmail.com">alexander.v.shaposhnikov@gmail.com</a>> wrote:<br><br></div><blockquote type="cite"><div><div dir="ltr">ohh, typo above: i meant <div> bool isLive(const Stmt *S, const VarDecl *D)</div><div>(instead of <span style="font-size:12.8px">LiveVariables::isLive(const Stmt *Loc, const Stmt *S))</span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Oct 29, 2017 at 5:08 PM, Alexander Shaposhnikov <span dir="ltr"><<a href="mailto:alexander.v.shaposhnikov@gmail.com" target="_blank">alexander.v.shaposhnikov@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi, <div>(long read, sorry in advance)</div><div>i'm looking into the internals of SymbolReaper and LiveVariables and have </div><div>several questions on my mind. Since I have not built a consistent understanding of what's going on there yet + assume I might be missing smth, I'd like to start with some very basic example to show how the (potential) issue manifests itself there. Any explanations / suggestions / comments would be greatly appreciated.</div><div><br></div><div>Example s.c:</div><div><div>typedef struct _IO_FILE FILE;</div><div>extern FILE *fopen(const char *path, const char *mode);</div><div><br></div><div>void g() {};</div><div>void h() {};</div><div><br></div><div>void f(int c) {</div><div> FILE *p = fopen("foo.c", "r");</div><div> g();</div><div> h();</div><div> return;</div><div>}</div></div><div>clang -cc1 -analyze -analyzer-checker=debug.<wbr>ViewCFG s.c<br></div><div>says that there 3 blocks: B2(entry) -> B1 -> B0(exit), B2 and B0 are essentially empty and all the interesting things are inside B1. </div><div>Now let's try to run clang -c --analyze -Xanalyzer -analyzer-checker=alpha.unix.<wbr>Stream s.c</div><div><div><br></div><div>s.c:8:9: warning: Value stored to 'p' during its initialization is never read</div><div> FILE *p = fopen("foo.c", "r");</div><div> ^ ~~~~~~~~~~~~~~~~~~~</div><div>s.c:9:3: warning: Opened File never closed. Potential Resource leakv</div><div> g();</div><div> ^~~</div><div>2 warnings generated.</div></div><div><div><br></div><div><b>What looks suspicious here - the location of the warning</b></div></div><div>"s.c:9:3: warning: Opened File never closed. Potential Resource leak"<b> </b></div><div><b>- it points to g() </b><b>instead of the end of function or return statement.</b></div><div><br></div></div></blockquote></div></div></div></blockquote><br>The analyzer is trying to report the leak at the <b>earliest</b> point. It uses live variable analysis for that:<br><a href="https://en.m.wikipedia.org/wiki/Live_variable_analysis">https://en.m.wikipedia.org/wiki/Live_variable_analysis</a><div><br></div><div><br><blockquote type="cite"><div><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Now let's run the analyzer under lldb and set the breakpoint at </div><div>LiveVariables::isLive(const Stmt *Loc, const Stmt *S)</div><div><br></div><div><div> 178 <span style="white-space:pre-wrap"> </span>bool LiveVariables::isLive(const Stmt *Loc, const Stmt *S) {</div><div><br></div><div>(lldb) p D->dump()</div><div>VarDecl 0x11286c178 <s.c:8:3, col:31> col:9 p 'FILE *' cinit</div><div>`-CallExpr 0x11286c350 <col:13, col:31> 'FILE *'</div><div> |-ImplicitCastExpr 0x11286c338 <col:13> 'FILE *(*)(const char *, const char *)' <FunctionToPointerDecay></div><div> | `-DeclRefExpr 0x11286c1d8 <col:13> 'FILE *(const char *, const char *)' Function 0x11286bce0 'fopen' 'FILE *(const char *, const char *)'</div><div> |-ImplicitCastExpr 0x11286c3a0 <col:19> 'const char *' <BitCast></div><div> | `-ImplicitCastExpr 0x11286c388 <col:19> 'char *' <ArrayToPointerDecay></div><div> | `-StringLiteral 0x11286c238 <col:19> 'char [6]' lvalue "foo.c"</div><div> `-ImplicitCastExpr 0x11286c3d0 <col:28> 'const char *' <BitCast></div><div> `-ImplicitCastExpr 0x11286c3b8 <col:28> 'char *' <ArrayToPointerDecay></div><div> `-StringLiteral 0x11286c2a8 <col:28> 'char [2]' lvalue "r"</div><div><br></div><div>(lldb) p getImpl(impl).stmtsToLiveness[<wbr>S].isLive(D)</div><div>(bool) $0 = false</div><div><br></div><div>(lldb) p S->dump()</div><div>CallExpr 0x11286c470 'void'</div><div>`-ImplicitCastExpr 0x11286c458 'void (*)()' <FunctionToPointerDecay></div><div> `-DeclRefExpr 0x11286c400 'void ()' Function 0x11286be18 'g' 'void ()'</div><div><br></div></div><div><div>(lldb) fr sel 7</div><div>frame #7: 0x0000000106ed4397 clang-6.0`clang::ento::<wbr>ExprEngine::ProcessStmt(this=<wbr>0x00007fff5fbf97c0, S=const clang::CFGStmt @ 0x00007fff5fbf8f50, Pred=0x0000000114004a30) at ExprEngine.cpp:497</div></div><div><br></div><div>Here getImpl(impl).stmtsToLiveness[<wbr>S].isLive(D) returns false and</div><div>Static Analyzer thinks that the variable "p" is dead and will remove the binding (from the store). Since this variable is marked as dead, </div><div>the corresponding checkers are called (in ExprEngine::removeDead getCheckerManager().<wbr>runCheckersForDeadSymbols ...) and the stream checker reports a leak. </div><div>In fact, if i'm not mistaken, at this point the variable "p" is not dead yet and the leak should be reported much later at the end of the function.</div><div><br></div><div>The problem does not appear to be specific to StreamChecker, after playing with several other simple examples, it looks like LiveVariables::isLive returns false in most cases</div><div>(and so does SymbolReaper::isLive(<wbr>const VarRegion *VR, bool includeStoreBindings) ).</div><div>This results in loosing information about variables or emitting diagnostics at the wrong locations, for example, see FIXMEs at </div><div>llvm/tools/clang/test/<wbr>Analysis/unions.cpp +96, </div><div>llvm/tools/clang/test/<wbr>Analysis/self-assign.cpp +41, </div><div>llvm/tools/clang/test/<wbr>Analysis/pr22954.c in "int f19(int i) {...}", llvm/tools/clang/test/<wbr>Analysis/pr22954.c in "int f31() {...}" etc.</div><div><br></div><div>So the question here - if CSA works as expected for this example </div><div>(in particular, I'm wondering if <a class="m_4845423922169719039gmail-code" href="https://clang.llvm.org/doxygen/classclang_1_1LiveVariables.html#a00e0b2d4b0298e098cab0f88cbd91a93" style="font-size:13px;color:rgb(70,101,162);font-family:monospace,fixed;white-space:pre-wrap;background-color:rgb(251,252,253)" target="_blank">LiveVariables::<wbr>computeLiveness</a> </div><div>does the right thing or, alternatively, if it's used properly). </div><div><br></div><div>P.S. After playing a little bit more with different toy code snippets and dumping the full state</div><div>of LiveVariables (everything: stmtsToLiveness, <wbr>blocksEndToLiveness etc) it looks like in most cases it's almost empty. </div><div>P.P.S. The following experiment might be of some interest: if one switches the method</div><div>SymbolReaper::isLive to </div><div>(VarContext == CurrentContext) || VarContext->isParentOf(<wbr>CurrentContext) || isAlwaysAlive(....) </div><div>the behavior for some very simple tests (i was looking at the toy tests with 3-4 blocks in CFG) becomes more predictable (and a bunch of FIXMEs get corrected), </div><div>but this, of course, would not be the correct implementation in the general case. </div><div><br></div><div>Kind regards,</div><div>Alexander Shaposhnikov</div></div>
</blockquote></div><br></div>
</div></blockquote></div></body></html>