<div dir="auto"><div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 31 May 2019, 02:55 Artem Dergachev, <<a href="mailto:noqnoqneo@gmail.com">noqnoqneo@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
Interesting.<br>
<br>
Condition on line 12 would not be our control dependency - neither
in example 1 nor in example 2, simply because there's nothing
interesting within either branches of that if-statement. The 'new
int' assignment would have been interesting if we solved the
should-not-have-happened problem of "x should not have been
overwritten", but for now we don't do that.</div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">Thats my game plan exactly :)</div><div dir="auto"></div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><br>
At the same time, condition on line 16 would still be a control
dependency of our report in both examples, because this is where the
actual crash happens. So we will have a note on line 05 in both
cases, even if it's completely unnecessary in the second example.<br>
</div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">Why would it be? We'd never come across the nullptr dereference if it wasn't for flag having its value reset on line 14.</div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><br>
We would indeed never have a note on line 10. This is, in fact, a
dead store, this value is never used, so we wouldn't be able to
track anything back to it.<br>
</div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">I messed up the numbering, I did in fact mean 11 when I wrote 10 :)</div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><br>
<div class="m_-4130761129068717718moz-cite-prefix">On 5/30/19 4:48 PM, Kristóf Umann
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Allow me to elaborate.</div>
<div><span><br>
</span></div>
<div><span><b><font size="4">Example 1.:</font></b></span></div>
<div>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">01 int flag;</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">02 bool coin();</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">03 </span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">04 void foo() {</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">05 flag = coin(); // no note</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">06 }</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">07 </span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">08 int main() {</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">09 int *x = 0; // x initialized to 0</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">10 flag = 1;</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">11 foo();</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">12 if (flag) // assumed false</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">13 x = new int;</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">14 foo();</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">15 </span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">16 if (flag) // assumed true</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">17 *x = 5; // warn</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">18 }</span></p>
</div>
<div><span><br>
</span></div>
<div><span><b><font size="4">Example 2.: (note the change on
line 15)</font></b></span></div>
<div>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">01 int flag;</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">02 bool coin();</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">03 </span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">04 void foo() {</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">05 flag = coin(); // no note</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">06 }</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">07 </span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">08 int main() {</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">09 int *x = 0;</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">10 flag = 1;</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">11 foo();</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">12 if (flag) // assumed false</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">13 x = new int;</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">14 foo();</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">15 x = 0; // x set to 0</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">16 if (flag) // assumed true</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">17 *x = 5; // warn</span></p>
<p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">18 }</span></p>
</div>
<div><span id="m_-4130761129068717718gmail-docs-internal-guid-4afc448c-7fff-3e2d-c87d-f55781fa8e89">
<div dir="ltr" style="margin-left:0pt"><br>
</div>
<div style="margin-left:0pt">By tweaking <font face="courier new, monospace">trackExpressionValue</font>
a bit:</div>
<div dir="ltr" style="margin-left:0pt"><br>
</div>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">bool trackExpressionValue(const ExplodedNode *N,</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> const Stmt *S, BugReport &R,</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> bool IsArg=false,</span></p>
<p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> bool EnableNullFPSuppression=true);</span></p>
<br>
<span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Should a block </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">B</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> in the bug path be a control dependency of </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">N</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">, track the expression value in the condition of </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">B</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">. With that, I believe we should get a note for statement 14. For statement 10, we still wouldn’t get a note saying that flag’s value was invalidated (I fear), but it’s more of a should-not-have-happened case. For Example 2 however, we’d get a pretty much perfect bug report.</span></span><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, 31 May 2019 at
00:16, Kristóf Umann <<a href="mailto:dkszelethus@gmail.com" target="_blank" rel="noreferrer">dkszelethus@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, 27 May 2019 at
19:48, Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com" target="_blank" rel="noreferrer">noqnoqneo@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
<br>
On 5/27/19 10:12 AM, Gábor Horváth wrote:<br>
> Hi!<br>
><br>
> I wanted to share some of my points as well but
had little time to do so.<br>
><br>
> Artem identified two kinds of issues. One of
which has all the <br>
> information available in the bug path (in his
terminology: <br>
> must-have-happened) and the other which has not (
<br>
> should-not-have-happened). The goal is the same
in both of the cases, <br>
> identify all dependencies including control and
data dependencies. The <br>
> main difference is that in one case we could use
all the information <br>
> available in the bug path while in the other we
cannot.<br>
><br>
> 1. must-have-happened:<br>
> Since the bug-path contains all the information
we need, I do agree <br>
> with Artem using bug path visitors might be
sufficient. I do not know <br>
> how reliable `trackExpressionValue` is in
identifying data <br>
> dependencies, but revising it could be one step
in GSoC. For example, <br>
> after quickly skimming through the implementation
I am not sure if it <br>
> is able to track the source of a constant value
that was constant <br>
> folded during the symbolic execution from
different values.<br>
<br>
It should be able to, because that's how inlined
defensive check <br>
suppressions work.<br>
</blockquote>
<div><br>
</div>
<div>I recently finished (not yet published) my work on
implementing control dependencies for clang's CFG via
LLVM's <font face="courier new, monospace">IDFCalculator</font>.
Theoretically speaking, by simply tracking the
variables in the block that is a control dependency of
the block that is being tracked, we should be able to
get a proper bug report for Example 1? From afar, it
seems reasonable.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> Only if we had a way to test this functionality
in isolation quickly <br>
> :) I think making that possible would also be a
valuable addition.<br>
><br>
> For control dependencies, it is great that LLVM
already has some <br>
> algorithms to calculate dominator sets and it
might be possible to <br>
> make that work for the Clang CFG.<br>
><br>
> In my opinion, if you only tackle these problems
by the end of the <br>
> summer your project already brought great value
to the community.<br>
><br>
> 2. should-not-have-happened<br>
> This category cannot be solved only using
visitors for sure. So we <br>
> might end up with an implementation for this
problem that is <br>
> completely separate from the previous one (maybe
except for getting <br>
> control dependencies?). As Artem pointed out, we
might query some <br>
> information from the analyzer e.g. which
functions were inlined. One <br>
> option is to use a slicing algorithm here.
However, we should make <br>
> sure that the way a function is conservatively
evaluated by the <br>
> slicing algorithm and the analyzer is compatible.
E.g.: if the <br>
> analyzer does not invalidate a memory region
during evaluating the <br>
> call, the statement should not be considered as
relevant for the <br>
> variable in question. We have several challenges
here, the analyzer <br>
> reasons about memory regions while the slicing
algorithm will probably <br>
> not. Also, we need to do something with pointers,
since we do not want <br>
> to implement a full-fledged pointer analysis. We
also need to decide <br>
> how to represent arrays, structures etc.<br>
><br>
> I skimmed through the algorithm in the google
docs, and I think there <br>
> are some other details we need to work out. For
example, you never <br>
> remove anything from the set of relevant
variables. Consider the <br>
> following example:<br>
> x = 5;<br>
> y = x + 2;<br>
> y = 2;<br>
> z = y + 3;<br>
><br>
> Whit the slicing criterion (z = y + 3). Here,
once y is overwritten <br>
> with 2, it is no longer relevant, sox should not
end up in the slice. <br>
> The algorithm you proposed will not handle this
correctly (yet). Also <br>
> I am not sure if we actually need
therelevant_variable_map or having a <br>
> set will be sufficient.<br>
><br>
> Regards,<br>
> Gabor<br>
><br>
> On Mon, 27 May 2019 at 12:52, Kristóf Umann <<a href="mailto:dkszelethus@gmail.com" target="_blank" rel="noreferrer">dkszelethus@gmail.com</a> <br>
> <mailto:<a href="mailto:dkszelethus@gmail.com" target="_blank" rel="noreferrer">dkszelethus@gmail.com</a>>>
wrote:<br>
><br>
> + Ádám Balogh<br>
><br>
> On Fri, 24 May 2019 at 23:31, Artem Dergachev
<<a href="mailto:noqnoqneo@gmail.com" target="_blank" rel="noreferrer">noqnoqneo@gmail.com</a><br>
> <mailto:<a href="mailto:noqnoqneo@gmail.com" target="_blank" rel="noreferrer">noqnoqneo@gmail.com</a>>>
wrote:<br>
><br>
> Ok, i think i have a little bit of
clarity.<br>
><br>
> Let's first talk about must-have-happened
problems. For instance,<br>
><br>
> int *foo() {<br>
> return coin() ? new int : nullptr;
// needs note<br>
> }<br>
> void bar() {<br>
> int *x = foo();<br>
> *x = 1; // warning: null dereference<br>
> }<br>
><br>
> In this case trackExpressionValue solves
the<br>
> "must-have-happened" problem of "x
must-have been assigned a<br>
> null pointer value" by displaying a note
when foo() returns<br>
> nullptr. This currently more or less
works correctly - there<br>
> are a lot of bugs but the overall idea
behind<br>
> trackExpressionValue is correct.<br>
><br>
> Example 1 in your document is another
example of a<br>
> must-have-happened problem: in order for
the report to be<br>
> valid, we need to show that 'flag'
must-have-changed between<br>
> line 17 and line 13. That's something
that the Static Analyzer<br>
> currently cannot handle and you plan to
improve upon it.<br>
><br>
> Hʏᴘᴏᴛʜᴇsɪs1. All "must-have-happened"
problems should be<br>
> solved with a bug visitor. We don't need
any new analysis<br>
> algorithm for that.<br>
><br>
> In particular, i believe that Example 1
can be solved by<br>
> extending trackExpressionValue() with a
bug visitor that<br>
> detects control-flow-dependencies via
this approach of yours:<br>
><br>
> > "Check whether the statement is a
control dependency of<br>
> Statement 18 with dominator sets: 18
doesn't post dominate 17,<br>
> but every statement in between them does,
meaning that 17 is a<br>
> control dependency of 18."<br>
><br>
> I.e., while ascending through
ExplodedGraph, we pick<br>
> interesting statements and perform this
domination check on<br>
> their predecessor nodes until we reach
the control dependency,<br>
> then we track the control dependency
recursively by adding<br>
> more visitors.<br>
><br>
> I think this is the first thing we should
try on our GSoC.<br>
><br>
> The reason why i believe that Hypothesis
1 is true is that all<br>
> the information that we need is fully
contained in the bug<br>
> path. If something must have happened on
the path, then it's<br>
> probably in the path and we can figure it
out by looking at<br>
> the path. If something must have happened
on the path and it's<br>
> not in the path, then why did we emit a
warning in the first<br>
> place?<br>
><br>
> ==============<br>
><br>
> Now, to should-not-have-happened
problems. The main motivating<br>
> example is:<br>
><br>
> void foo(int *y) {<br>
> if (coin()) // needs note<br>
> *y = 1;<br>
> }<br>
> void bar() {<br>
> int x;<br>
> foo(&x);<br>
> use(x); // warning: use of
uninitialized value<br>
> }<br>
><br>
> In order to make the warning
comprehensible, we have to<br>
> explain to the user why do we think that
'x' is uninitialized,<br>
> as it clearly "should-not-have-been"
initialized for the<br>
> warning to be correct. For that purpose
it is absolutely<br>
> essential to display the execution path
within the inlined<br>
> call to foo().<br>
><br>
> One way to achieve that would be to
display *all* execution<br>
> path and leave it up to the user to see
that the event that<br>
> should-not-have-happened has indeed never
happened.<br>
><br>
> That, however, would be overwhelming, so
we use "path pruning"<br>
> that cuts away execution paths in inlined
calls that are known<br>
> to have no interesting events happening.
Which, in turn, makes<br>
> us incapable of solving
should-not-have-happened problems, as<br>
> it's the whole point of the problem to
display execution path<br>
> in inlined functions in which nothing has
happened.<br>
><br>
> In order to figure this out, we need to
learn how to<br>
> discriminate between inlined calls in
which nothing<br>
> interesting *could* have happened in the
first place and<br>
> inlined calls in which something
interesting could have<br>
> happened but *didn't*.<br>
><br>
> NoStoreFuncVisitor solves this problem
for the example above.<br>
> The visitor notices that the local
variable is passed by<br>
> non-const pointer into the inlined call
and the call<br>
> syntactically contains assignments
through this pointer,<br>
> therefore this call *could* have
initialized the value. The<br>
> visitor then suppresses pruning of this
call by emitting an<br>
> interesting note within the call
("returning without<br>
> initializing '*y'"...).<br>
><br>
> However, AST-based analysis in
NoStoreFuncVisitor is very<br>
> primitive and easy to trick. A more
sophisticated analysis is<br>
> clearly necessary.<br>
><br>
> The second thing we can try in our GSoC
is to come up with a<br>
> better analysis specifically for the
uninitialized value<br>
> checker, because we already have a place
to stick it into and<br>
> we know how exactly do we want to consume
the result of such<br>
> analysis and evaluate it.<br>
><br>
> The third thing we can try in our GSoC is
to come up with more<br>
> such analysis for other checkers and
other<br>
> should-have-happened problems and see if
a reusable framework<br>
> for creating such analyses can be
implemented.<br>
><br>
><br>
> On 5/23/19 1:25 PM, Kristóf Umann wrote:<br>
>><br>
>><br>
>> On Thu, 23 May 2019 at 21:58, Kristóf
Umann<br>
>> <<a href="mailto:dkszelethus@gmail.com" target="_blank" rel="noreferrer">dkszelethus@gmail.com</a>
<mailto:<a href="mailto:dkszelethus@gmail.com" target="_blank" rel="noreferrer">dkszelethus@gmail.com</a>>>
wrote:<br>
>><br>
>> Please let me know if I didn't
address something you<br>
>> mentioned!<br>
>><br>
>> On Thu, 23 May 2019 at 02:29,
Artem Dergachev<br>
>> <<a href="mailto:noqnoqneo@gmail.com" target="_blank" rel="noreferrer">noqnoqneo@gmail.com</a>
<mailto:<a href="mailto:noqnoqneo@gmail.com" target="_blank" rel="noreferrer">noqnoqneo@gmail.com</a>>>
wrote:<br>
>><br>
>> My primary question is, how
do you plan to use the<br>
>> data that you obtain via your
analysis?<br>
>><br>
>><br>
>> Gather relevant statements. Keep
ExplodedNodes in the<br>
>> bugpath for which
PathDiagnosticLocation::getStmt() is in<br>
>> the set (or, something like that
at least). But honestly,<br>
>> I didn't think too much about
that yet :^) I guess we<br>
>> could just gather ExplodedNodes
rather than statements,<br>
>> we'll see.<br>
>><br>
>> Like, do you plan to
backtrack the value of all<br>
>> relevant variables and/or
expressions in the final<br>
>> bug report that were also
encountered along the bug<br>
>> path? If yes, then is it
possible that the slicing<br>
>> criterion gets updated in the
process and you'll have<br>
>> to re-run the CFG-based
analysis to take it into account?<br>
>><br>
>><br>
>> No, the slicing criterion is what
it is, and will not<br>
>> change. The set of relevant
statements to the slicing<br>
>> criterion define the actual
program slice, which is<br>
>> basically the thing we're going
for in this project. When<br>
>> it turns out that a value of a
variable directly affects<br>
>> a variable in the slicing
criterion (either through data<br>
>> or control dependency), we just
add it to the set of<br>
>> relevant variables, and then if
something in the set of<br>
>> relevant variables is affected
(some statements or<br>
>> variables turn out to be a
control/data dependencies to<br>
>> them), then it's added to the
respective relevancy sets.<br>
>> Should we only traverse the basic
blocks the analyzer<br>
>> visited, I think we can pull off
the slicing with a<br>
>> single pass.<br>
>><br>
>><br>
>> We'll see about that single pass
thing though. I'm gathering<br>
>> some tricky examples in the document
I shared and working on<br>
>> a neat algorithm.<br>
>><br>
>> > What would also be
really great is to assist this<br>
>> traversal with the
information the analyzer already<br>
>> has, essentially only
inspecting basic blocks the<br>
>> analyzer actually visits.<br>
>><br>
>> You mean visits on the
current bug path or visits on<br>
>> the equivalence class of bugs
or visits in general<br>
>> during analysis?<br>
>><br>
>> Regardless, for many problems
ideally we should<br>
>> traverse basic blocks that
the analyzer *didn't*<br>
>> visit (eg., if the function
didn't initialize the<br>
>> variable on the current path,
we're interested in<br>
>> parts of the code in which it
*did* initialize the<br>
>> variable, even though we
didn't necessarily have time<br>
>> to visit them).<br>
>><br>
>><br>
>> Well I mean slicing by definition
isn't intended to do<br>
>> that. The entire point of it is
to gather the smallest<br>
>> slice related to the slicing
criterion, and my project<br>
>> aims to fill in the gaps where we
don't provide enough<br>
>> information.<br>
>><br>
>> It actually sounds as if all
problems that we're<br>
>> trying to solve here can be
classified into<br>
>> "must-have-happened" problems
(eg., "variable<br>
>> must-have been declared
without initialization",<br>
>> "variable must-have been set
to nullptr", "memory<br>
>> must-have been allocated")
and<br>
>> "should-not-have-happened"
problems (eg., "variable<br>
>> should-not-have been
initialized", "null value<br>
>> should-not-have been
overwritten", "pointer<br>
>> should-not-have escaped").<br>
>><br>
>> For must-have-happened
problems, i'm recently under<br>
>> an impression that we should
suppress the bug report<br>
>> entirely when we fail to
solve them (i.e., if fail to<br>
>> explain to the user where
exactly does this happen,<br>
>> then the report is impossible
to understand anyway).<br>
>> This is currently a very big
problem for our null<br>
>> dereference checker on some
codebases, especially<br>
>> because it uses this tracking
for suppressions as<br>
>> well (aka inlined defensive
checks), so when it fails<br>
>> to track the variable it's
likely a legit false<br>
>> positive as well, not simply
a hard-to-understand report.<br>
>><br>
>><br>
>> I think this set calculating
approach inherently gathers<br>
>> far more information, allowing us
to make better<br>
>> judgement on whether we should
suppress the report.<br>
>><br>
>> For should-not-have-happened
problems i'm much more<br>
>> confused. We're talking about
looking for places<br>
>> where it *could* have
happened and then trying to<br>
>> explain to the user why none
of them have actually<br>
>> happened. I'm not sure what
are the consequences of<br>
>> failing to explain to the
user why didn't a<br>
>> particular piece of code do
something, because i've<br>
>> no idea how do users
intuitively figure out which<br>
>> code *could* have done these
things and which clearly<br>
>> couldn't.<br>
>><br>
>><br>
>> Im really at a loss here :) Can
you provide some example<br>
>> as to what a problematic
"must-have-happened" bug report<br>
>> would look like, and how would
you like to see it<br>
>> improved? Same for
"should-not-have-happened". Because as<br>
>> I understand it, and I might be
wrong, you have problems<br>
>> with this:<br>
>><br>
>> int a; // declaring a without
initializing<br>
>><br>
>> if (rand()) // assuming that the
condition is false<br>
>> a = 5;<br>
>><br>
>> print(a); // a is undef<br>
>><br>
>> And prefer to see this:<br>
>><br>
>> int a; // declaring a without
initializing<br>
>><br>
>> if (rand()) // should this
condition be false, a's value<br>
>> will be indeterministic<br>
>> a = 5; // writing to a skipped<br>
>><br>
>> print(a); // a is undef<br>
>><br>
>> and I just don't see how this
would be possible with<br>
>> slicing at all. Also, I can't see
how this would scale to<br>
>> real-world production code.<br>
>><br>
>><br>
>> On 5/22/19 4:11 PM, Kristóf
Umann wrote:<br>
>>><br>
>>> Hi!<br>
>>><br>
>>><br>
>>> I'd like to share some
thoughts about my GSoC<br>
>>> project, "Enhancing bug
reporting with static<br>
>>> backward program
slicing"[1].<br>
>>><br>
>>><br>
>>> My proposal is very
specific about whatI'm aiming to<br>
>>> improve on, but vague on
the howpart of it. This<br>
>>> mail aims to add clarify
some of this.<br>
>>><br>
>>><br>
>>> Program slicing is
essentially a technique of<br>
>>> discovering data and
control dependencies to the<br>
>>> slicing criterion, which
is a (statement, {set of<br>
>>> variables}) pair.
Fortunately, tools for control<br>
>>> dependencies, namely,
dominator set calculations are<br>
>>> already implemented, but
seems to be unstable with<br>
>>> clang's CFG. It would be
a great tool if I were able<br>
>>> to fix it.<br>
>>><br>
>>><br>
>>> While my proposal states
that I plan to implement an<br>
>>> AST-based solution, I'm
actually not sure that this<br>
>>> is the optimal approach.
We could instead inspect<br>
>>> CFG blocks in a backwards
manner (coupled with the<br>
>>> analyzer's call stack),
and gather relevant variable<br>
>>> in the meanwhile.<br>
>>><br>
>>><br>
>>> What would also be really
great is to assist this<br>
>>> traversal with the
information the analyzer already<br>
>>> has, essentially only
inspecting basic blocks the<br>
>>> analyzer actually visits.<br>
>>><br>
>>><br>
>>> Here’s my idea for an
algorithm (from the point of<br>
>>> the slicing criterion
already being constructed):<br>
>>><br>
>>> <a href="https://docs.google.com/document/d/1Lx867o3meyQsj0WKOSWMdosSBdw2MUq1aIxyPJM6iLU/edit?usp=sharing" rel="noreferrer noreferrer" target="_blank">https://docs.google.com/document/d/1Lx867o3meyQsj0WKOSWMdosSBdw2MUq1aIxyPJM6iLU/edit?usp=sharing</a><br>
>>><br>
>>><br>
>>><br>
>>> Please note that this is
a veeery rough sketch, I<br>
>>> didn't think about every
edge case that exists, but<br>
>>> I think it's an okay
start.<br>
>>><br>
>>><br>
>>> [1]<br>
>>> <a href="https://docs.google.com/document/d/1ci1BCAKojPlqIxIw1J_K2dnATA3z01PuwR_vHJS55TI/edit" rel="noreferrer noreferrer" target="_blank">https://docs.google.com/document/d/1ci1BCAKojPlqIxIw1J_K2dnATA3z01PuwR_vHJS55TI/edit</a><br>
>>><br>
>><br>
><br>
<br>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote></div></div></div>