<div dir="auto"><div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 31 May 2019, 02:55 Artem Dergachev, <<a href="mailto:noqnoqneo@gmail.com">noqnoqneo@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    Interesting.<br>
    <br>
    Condition on line 12 would not be our control dependency - neither
    in example 1 nor in example 2, simply because there's nothing
    interesting within either branches of that if-statement. The 'new
    int' assignment would have been interesting if we solved the
    should-not-have-happened problem of "x should not have been
    overwritten", but for now we don't do that.</div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">Thats my game plan exactly :)</div><div dir="auto"></div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><br>
    At the same time, condition on line 16 would still be a control
    dependency of our report in both examples, because this is where the
    actual crash happens. So we will have a note on line 05 in both
    cases, even if it's completely unnecessary in the second example.<br>
    </div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">Why would it be? We'd never come across the nullptr dereference if it wasn't for flag having its value reset on line 14.</div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><br>
    We would indeed never have a note on line 10. This is, in fact, a
    dead store, this value is never used, so we wouldn't be able to
    track anything back to it.<br>
    </div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">I messed up the numbering, I did in fact mean 11 when I wrote 10 :)</div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><br>
    <div class="m_-4130761129068717718moz-cite-prefix">On 5/30/19 4:48 PM, Kristóf Umann
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div>Allow me to elaborate.</div>
        <div><span><br>
          </span></div>
        <div><span><b><font size="4">Example 1.:</font></b></span></div>
        <div>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">01 int flag;</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">02 bool coin();</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">03 </span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">04 void foo() {</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">05   flag = coin(); // no note</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">06 }</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">07 </span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">08 int main() {</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">09   int *x = 0; // x initialized to 0</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">10   flag = 1;</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">11   foo();</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">12   if (flag) // assumed false</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">13     x = new int;</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">14   foo();</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">15 </span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">16   if (flag) // assumed true</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">17     *x = 5; // warn</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">18 }</span></p>
        </div>
        <div><span><br>
          </span></div>
        <div><span><b><font size="4">Example 2.: (note the change on
                line 15)</font></b></span></div>
        <div>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">01 int flag;</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">02 bool coin();</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">03 </span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">04 void foo() {</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">05   flag = coin(); // no note</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">06 }</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">07 </span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">08 int main() {</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">09   int *x = 0;</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">10   flag = 1;</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">11   foo();</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">12   if (flag) // assumed false</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">13     x = new int;</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">14   foo();</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">15   x = 0; // x set to 0</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">16   if (flag) // assumed true</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">17     *x = 5; // warn</span></p>
          <p dir="ltr" style="font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">18 }</span></p>
        </div>
        <div><span id="m_-4130761129068717718gmail-docs-internal-guid-4afc448c-7fff-3e2d-c87d-f55781fa8e89">
            <div dir="ltr" style="margin-left:0pt"><br>
            </div>
            <div style="margin-left:0pt">By tweaking <font face="courier new, monospace">trackExpressionValue</font>
              a bit:</div>
            <div dir="ltr" style="margin-left:0pt"><br>
            </div>
            <p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">bool trackExpressionValue(const ExplodedNode *N,</span></p>
            <p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">                          const Stmt *S, BugReport &R,</span></p>
            <p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">                          bool IsArg=false,</span></p>
            <p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:9pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">                          bool EnableNullFPSuppression=true);</span></p>
            <br>
            <span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Should a block </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">B</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> in the bug path be a control dependency of </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">N</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">, track the expression value in the condition of </span><span style="font-size:11pt;font-family:"Courier New";color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">B</span><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">. With that, I believe we should get a note for statement 14. For statement 10, we still wouldn’t get a note saying that flag’s value was invalidated (I fear), but it’s more of a should-not-have-happened case. For Example 2 however, we’d get a pretty much perfect bug report.</span></span><br>
        </div>
        <br>
        <div class="gmail_quote">
          <div dir="ltr" class="gmail_attr">On Fri, 31 May 2019 at
            00:16, Kristóf Umann <<a href="mailto:dkszelethus@gmail.com" target="_blank" rel="noreferrer">dkszelethus@gmail.com</a>>
            wrote:<br>
          </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
            <div dir="ltr">
              <div dir="ltr"><br>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Mon, 27 May 2019 at
                  19:48, Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com" target="_blank" rel="noreferrer">noqnoqneo@gmail.com</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
                  <br>
                  On 5/27/19 10:12 AM, Gábor Horváth wrote:<br>
                  > Hi!<br>
                  ><br>
                  > I wanted to share some of my points as well but
                  had little time to do so.<br>
                  ><br>
                  > Artem identified two kinds of issues. One of
                  which has all the <br>
                  > information available in the bug path (in his
                  terminology: <br>
                  > must-have-happened) and the other which has not (
                  <br>
                  > should-not-have-happened). The goal is the same
                  in both of the cases, <br>
                  > identify all dependencies including control and
                  data dependencies. The <br>
                  > main difference is that in one case we could use
                  all the information <br>
                  > available in the bug path while in the other we
                  cannot.<br>
                  ><br>
                  > 1. must-have-happened:<br>
                  > Since the bug-path contains all the information
                  we need, I do agree <br>
                  > with Artem using bug path visitors might be
                  sufficient. I do not know <br>
                  > how reliable `trackExpressionValue` is in
                  identifying data <br>
                  > dependencies, but revising it could be one step
                  in GSoC. For example, <br>
                  > after quickly skimming through the implementation
                  I am not sure if it <br>
                  > is able to track the source of a constant value
                  that was constant <br>
                  > folded during the symbolic execution from
                  different values.<br>
                  <br>
                  It should be able to, because that's how inlined
                  defensive check <br>
                  suppressions work.<br>
                </blockquote>
                <div><br>
                </div>
                <div>I recently finished (not yet published) my work on
                  implementing control dependencies for clang's CFG via
                  LLVM's <font face="courier new, monospace">IDFCalculator</font>.
                  Theoretically speaking, by simply tracking the
                  variables in the block that is a control dependency of
                  the block that is being tracked, we should be able to
                  get a proper bug report for Example 1? From afar, it
                  seems reasonable.</div>
                <div> </div>
                <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                  > Only if we had a way to test this functionality
                  in isolation quickly <br>
                  > :) I think making that possible would also be a
                  valuable addition.<br>
                  ><br>
                  > For control dependencies, it is great that LLVM
                  already has some <br>
                  > algorithms to calculate dominator sets and it
                  might be possible to <br>
                  > make that work for the Clang CFG.<br>
                  ><br>
                  > In my opinion, if you only tackle these problems
                  by the end of the <br>
                  > summer your project already brought great value
                  to the community.<br>
                  ><br>
                  > 2. should-not-have-happened<br>
                  > This category cannot be solved only using
                  visitors for sure. So we <br>
                  > might end up with an implementation for this
                  problem that is <br>
                  > completely separate from the previous one (maybe
                  except for getting <br>
                  > control dependencies?). As Artem pointed out, we
                  might query some <br>
                  > information from the analyzer e.g. which
                  functions were inlined. One <br>
                  > option is to use a slicing algorithm here.
                  However, we should make <br>
                  > sure that the way a function is conservatively
                  evaluated by the <br>
                  > slicing algorithm and the analyzer is compatible.
                  E.g.: if the <br>
                  > analyzer does not invalidate a memory region
                  during evaluating the <br>
                  > call, the statement should not be considered as
                  relevant for the <br>
                  > variable in question. We have several challenges
                  here, the analyzer <br>
                  > reasons about memory regions while the slicing
                  algorithm will probably <br>
                  > not. Also, we need to do something with pointers,
                  since we do not want <br>
                  > to implement a full-fledged pointer analysis. We
                  also need to decide <br>
                  > how to represent arrays, structures etc.<br>
                  ><br>
                  > I skimmed through the algorithm in the google
                  docs, and I think there <br>
                  > are some other details we need to work out. For
                  example, you never <br>
                  > remove anything from the set of relevant
                  variables. Consider the <br>
                  > following example:<br>
                  > x = 5;<br>
                  > y = x + 2;<br>
                  > y = 2;<br>
                  > z = y + 3;<br>
                  ><br>
                  > Whit the slicing criterion (z = y + 3). Here,
                  once y is overwritten <br>
                  > with 2, it is no longer relevant, sox should not
                  end up in the slice. <br>
                  > The algorithm you proposed will not handle this
                  correctly (yet). Also <br>
                  > I am not sure if we actually need
                  therelevant_variable_map or having a <br>
                  > set will be sufficient.<br>
                  ><br>
                  > Regards,<br>
                  > Gabor<br>
                  ><br>
                  > On Mon, 27 May 2019 at 12:52, Kristóf Umann <<a href="mailto:dkszelethus@gmail.com" target="_blank" rel="noreferrer">dkszelethus@gmail.com</a> <br>
                  > <mailto:<a href="mailto:dkszelethus@gmail.com" target="_blank" rel="noreferrer">dkszelethus@gmail.com</a>>>
                  wrote:<br>
                  ><br>
                  >     + Ádám Balogh<br>
                  ><br>
                  >     On Fri, 24 May 2019 at 23:31, Artem Dergachev
                  <<a href="mailto:noqnoqneo@gmail.com" target="_blank" rel="noreferrer">noqnoqneo@gmail.com</a><br>
                  >     <mailto:<a href="mailto:noqnoqneo@gmail.com" target="_blank" rel="noreferrer">noqnoqneo@gmail.com</a>>>
                  wrote:<br>
                  ><br>
                  >         Ok, i think i have a little bit of
                  clarity.<br>
                  ><br>
                  >         Let's first talk about must-have-happened
                  problems. For instance,<br>
                  ><br>
                  >           int *foo() {<br>
                  >               return coin() ? new int : nullptr;
                  // needs note<br>
                  >           }<br>
                  >           void bar() {<br>
                  >             int *x = foo();<br>
                  >             *x = 1; // warning: null dereference<br>
                  >           }<br>
                  ><br>
                  >         In this case trackExpressionValue solves
                  the<br>
                  >         "must-have-happened" problem of "x
                  must-have been assigned a<br>
                  >         null pointer value" by displaying a note
                  when foo() returns<br>
                  >         nullptr. This currently more or less
                  works correctly - there<br>
                  >         are a lot of bugs but the overall idea
                  behind<br>
                  >         trackExpressionValue is correct.<br>
                  ><br>
                  >         Example 1 in your document is another
                  example of a<br>
                  >         must-have-happened problem: in order for
                  the report to be<br>
                  >         valid, we need to show that 'flag'
                  must-have-changed between<br>
                  >         line 17 and line 13. That's something
                  that the Static Analyzer<br>
                  >         currently cannot handle and you plan to
                  improve upon it.<br>
                  ><br>
                  >         Hʏᴘᴏᴛʜᴇsɪs1. All "must-have-happened"
                  problems should be<br>
                  >         solved with a bug visitor. We don't need
                  any new analysis<br>
                  >         algorithm for that.<br>
                  ><br>
                  >         In particular, i believe that Example 1
                  can be solved by<br>
                  >         extending trackExpressionValue() with a
                  bug visitor that<br>
                  >         detects control-flow-dependencies via
                  this approach of yours:<br>
                  ><br>
                  >         > "Check whether the statement is a
                  control dependency of<br>
                  >         Statement 18 with dominator sets: 18
                  doesn't post dominate 17,<br>
                  >         but every statement in between them does,
                  meaning that 17 is a<br>
                  >         control dependency of 18."<br>
                  ><br>
                  >         I.e., while ascending through
                  ExplodedGraph, we pick<br>
                  >         interesting statements and perform this
                  domination check on<br>
                  >         their predecessor nodes until we reach
                  the control dependency,<br>
                  >         then we track the control dependency
                  recursively by adding<br>
                  >         more visitors.<br>
                  ><br>
                  >         I think this is the first thing we should
                  try on our GSoC.<br>
                  ><br>
                  >         The reason why i believe that Hypothesis
                  1 is true is that all<br>
                  >         the information that we need is fully
                  contained in the bug<br>
                  >         path. If something must have happened on
                  the path, then it's<br>
                  >         probably in the path and we can figure it
                  out by looking at<br>
                  >         the path. If something must have happened
                  on the path and it's<br>
                  >         not in the path, then why did we emit a
                  warning in the first<br>
                  >         place?<br>
                  ><br>
                  >         ==============<br>
                  ><br>
                  >         Now, to should-not-have-happened
                  problems. The main motivating<br>
                  >         example is:<br>
                  ><br>
                  >           void foo(int *y) {<br>
                  >               if (coin()) // needs note<br>
                  >                 *y = 1;<br>
                  >           }<br>
                  >           void bar() {<br>
                  >             int x;<br>
                  >             foo(&x);<br>
                  >             use(x); // warning: use of
                  uninitialized value<br>
                  >           }<br>
                  ><br>
                  >         In order to make the warning
                  comprehensible, we have to<br>
                  >         explain to the user why do we think that
                  'x' is uninitialized,<br>
                  >         as it clearly "should-not-have-been"
                  initialized for the<br>
                  >         warning to be correct. For that purpose
                  it is absolutely<br>
                  >         essential to display the execution path
                  within the inlined<br>
                  >         call to foo().<br>
                  ><br>
                  >         One way to achieve that would be to
                  display *all* execution<br>
                  >         path and leave it up to the user to see
                  that the event that<br>
                  >         should-not-have-happened has indeed never
                  happened.<br>
                  ><br>
                  >         That, however, would be overwhelming, so
                  we use "path pruning"<br>
                  >         that cuts away execution paths in inlined
                  calls that are known<br>
                  >         to have no interesting events happening.
                  Which, in turn, makes<br>
                  >         us incapable of solving
                  should-not-have-happened problems, as<br>
                  >         it's the whole point of the problem to
                  display execution path<br>
                  >         in inlined functions in which nothing has
                  happened.<br>
                  ><br>
                  >         In order to figure this out, we need to
                  learn how to<br>
                  >         discriminate between inlined calls in
                  which nothing<br>
                  >         interesting *could* have happened in the
                  first place and<br>
                  >         inlined calls in which something
                  interesting could have<br>
                  >         happened but *didn't*.<br>
                  ><br>
                  >         NoStoreFuncVisitor solves this problem
                  for the example above.<br>
                  >         The visitor notices that the local
                  variable is passed by<br>
                  >         non-const pointer into the inlined call
                  and the call<br>
                  >         syntactically contains assignments
                  through this pointer,<br>
                  >         therefore this call *could* have
                  initialized the value. The<br>
                  >         visitor then suppresses pruning of this
                  call by emitting an<br>
                  >         interesting note within the call
                  ("returning without<br>
                  >         initializing '*y'"...).<br>
                  ><br>
                  >         However, AST-based analysis in
                  NoStoreFuncVisitor is very<br>
                  >         primitive and easy to trick. A more
                  sophisticated analysis is<br>
                  >         clearly necessary.<br>
                  ><br>
                  >         The second thing we can try in our GSoC
                  is to come up with a<br>
                  >         better analysis specifically for the
                  uninitialized value<br>
                  >         checker, because we already have a place
                  to stick it into and<br>
                  >         we know how exactly do we want to consume
                  the result of such<br>
                  >         analysis and evaluate it.<br>
                  ><br>
                  >         The third thing we can try in our GSoC is
                  to come up with more<br>
                  >         such analysis for other checkers and
                  other<br>
                  >         should-have-happened problems and see if
                  a reusable framework<br>
                  >         for creating such analyses can be
                  implemented.<br>
                  ><br>
                  ><br>
                  >         On 5/23/19 1:25 PM, Kristóf Umann wrote:<br>
                  >><br>
                  >><br>
                  >>         On Thu, 23 May 2019 at 21:58, Kristóf
                  Umann<br>
                  >>         <<a href="mailto:dkszelethus@gmail.com" target="_blank" rel="noreferrer">dkszelethus@gmail.com</a>
                  <mailto:<a href="mailto:dkszelethus@gmail.com" target="_blank" rel="noreferrer">dkszelethus@gmail.com</a>>>
                  wrote:<br>
                  >><br>
                  >>             Please let me know if I didn't
                  address something you<br>
                  >>             mentioned!<br>
                  >><br>
                  >>             On Thu, 23 May 2019 at 02:29,
                  Artem Dergachev<br>
                  >>             <<a href="mailto:noqnoqneo@gmail.com" target="_blank" rel="noreferrer">noqnoqneo@gmail.com</a>
                  <mailto:<a href="mailto:noqnoqneo@gmail.com" target="_blank" rel="noreferrer">noqnoqneo@gmail.com</a>>>
                  wrote:<br>
                  >><br>
                  >>                 My primary question is, how
                  do you plan to use the<br>
                  >>                 data that you obtain via your
                  analysis?<br>
                  >><br>
                  >><br>
                  >>              Gather relevant statements. Keep
                  ExplodedNodes in the<br>
                  >>             bugpath for which
                  PathDiagnosticLocation::getStmt() is in<br>
                  >>             the set (or, something like that
                  at least). But honestly,<br>
                  >>             I didn't think too much about
                  that yet :^) I guess we<br>
                  >>             could just gather ExplodedNodes
                  rather than statements,<br>
                  >>             we'll see.<br>
                  >><br>
                  >>                 Like, do you plan to
                  backtrack the value of all<br>
                  >>                 relevant variables and/or
                  expressions in the final<br>
                  >>                 bug report that were also
                  encountered along the bug<br>
                  >>                 path? If yes, then is it
                  possible that the slicing<br>
                  >>                 criterion gets updated in the
                  process and you'll have<br>
                  >>                 to re-run the CFG-based
                  analysis to take it into account?<br>
                  >><br>
                  >><br>
                  >>             No, the slicing criterion is what
                  it is, and will not<br>
                  >>             change. The set of relevant
                  statements to the slicing<br>
                  >>             criterion define the actual
                  program slice, which is<br>
                  >>             basically the thing we're going
                  for in this project. When<br>
                  >>             it turns out that a value of a
                  variable directly affects<br>
                  >>             a variable in the slicing
                  criterion (either through data<br>
                  >>             or control dependency), we just
                  add it to the set of<br>
                  >>             relevant variables, and then if
                  something in the set of<br>
                  >>             relevant variables is affected
                  (some statements or<br>
                  >>             variables turn out to be a
                  control/data dependencies to<br>
                  >>             them), then it's added to the
                  respective relevancy sets.<br>
                  >>             Should we only traverse the basic
                  blocks the analyzer<br>
                  >>             visited, I think we can pull off
                  the slicing with a<br>
                  >>             single pass.<br>
                  >><br>
                  >><br>
                  >>         We'll see about that single pass
                  thing though. I'm gathering<br>
                  >>         some tricky examples in the document
                  I shared and working on<br>
                  >>         a neat algorithm.<br>
                  >><br>
                  >>                 > What would also be
                  really great is to assist this<br>
                  >>                 traversal with the
                  information the analyzer already<br>
                  >>                 has, essentially only
                  inspecting basic blocks the<br>
                  >>                 analyzer actually visits.<br>
                  >><br>
                  >>                 You mean visits on the
                  current bug path or visits on<br>
                  >>                 the equivalence class of bugs
                  or visits in general<br>
                  >>                 during analysis?<br>
                  >><br>
                  >>                 Regardless, for many problems
                  ideally we should<br>
                  >>                 traverse basic blocks that
                  the analyzer *didn't*<br>
                  >>                 visit (eg., if the function
                  didn't initialize the<br>
                  >>                 variable on the current path,
                  we're interested in<br>
                  >>                 parts of the code in which it
                  *did* initialize the<br>
                  >>                 variable, even though we
                  didn't necessarily have time<br>
                  >>                 to visit them).<br>
                  >><br>
                  >><br>
                  >>             Well I mean slicing by definition
                  isn't intended to do<br>
                  >>             that. The entire point of it is
                  to gather the smallest<br>
                  >>             slice related to the slicing
                  criterion, and my project<br>
                  >>             aims to fill in the gaps where we
                  don't provide enough<br>
                  >>             information.<br>
                  >><br>
                  >>                 It actually sounds as if all
                  problems that we're<br>
                  >>                 trying to solve here can be
                  classified into<br>
                  >>                 "must-have-happened" problems
                  (eg., "variable<br>
                  >>                 must-have been declared
                  without initialization",<br>
                  >>                 "variable must-have been set
                  to nullptr", "memory<br>
                  >>                 must-have been allocated")
                  and<br>
                  >>                 "should-not-have-happened"
                  problems (eg., "variable<br>
                  >>                 should-not-have been
                  initialized", "null value<br>
                  >>                 should-not-have been
                  overwritten", "pointer<br>
                  >>                 should-not-have escaped").<br>
                  >><br>
                  >>                 For must-have-happened
                  problems, i'm recently under<br>
                  >>                 an impression that we should
                  suppress the bug report<br>
                  >>                 entirely when we fail to
                  solve them (i.e., if fail to<br>
                  >>                 explain to the user where
                  exactly does this happen,<br>
                  >>                 then the report is impossible
                  to understand anyway).<br>
                  >>                 This is currently a very big
                  problem for our null<br>
                  >>                 dereference checker on some
                  codebases, especially<br>
                  >>                 because it uses this tracking
                  for suppressions as<br>
                  >>                 well (aka inlined defensive
                  checks), so when it fails<br>
                  >>                 to track the variable it's
                  likely a legit false<br>
                  >>                 positive as well, not simply
                  a hard-to-understand report.<br>
                  >><br>
                  >><br>
                  >>             I think this set calculating
                  approach inherently gathers<br>
                  >>             far more information, allowing us
                  to make better<br>
                  >>             judgement on whether we should
                  suppress the report.<br>
                  >><br>
                  >>                 For should-not-have-happened
                  problems i'm much more<br>
                  >>                 confused. We're talking about
                  looking for places<br>
                  >>                 where it *could* have
                  happened and then trying to<br>
                  >>                 explain to the user why none
                  of them have actually<br>
                  >>                 happened. I'm not sure what
                  are the consequences of<br>
                  >>                 failing to explain to the
                  user why didn't a<br>
                  >>                 particular piece of code do
                  something, because i've<br>
                  >>                 no idea how do users
                  intuitively figure out which<br>
                  >>                 code *could* have done these
                  things and which clearly<br>
                  >>                 couldn't.<br>
                  >><br>
                  >><br>
                  >>             Im really at a loss here :) Can
                  you provide some example<br>
                  >>             as to what a problematic
                  "must-have-happened" bug report<br>
                  >>             would look like, and how would
                  you like to see it<br>
                  >>             improved? Same for
                  "should-not-have-happened". Because as<br>
                  >>             I understand it, and I might be
                  wrong, you have problems<br>
                  >>             with this:<br>
                  >><br>
                  >>             int a; // declaring a without
                  initializing<br>
                  >><br>
                  >>             if (rand()) // assuming that the
                  condition is false<br>
                  >>             a = 5;<br>
                  >><br>
                  >>             print(a); // a is undef<br>
                  >><br>
                  >>             And prefer to see this:<br>
                  >><br>
                  >>             int a; // declaring a without
                  initializing<br>
                  >><br>
                  >>             if (rand()) // should this
                  condition be false, a's value<br>
                  >>             will be indeterministic<br>
                  >>             a = 5; // writing to a skipped<br>
                  >><br>
                  >>             print(a); // a is undef<br>
                  >><br>
                  >>             and I just don't see how this
                  would be possible with<br>
                  >>             slicing at all. Also, I can't see
                  how this would scale to<br>
                  >>             real-world production code.<br>
                  >><br>
                  >><br>
                  >>                 On 5/22/19 4:11 PM, Kristóf
                  Umann wrote:<br>
                  >>><br>
                  >>>                 Hi!<br>
                  >>><br>
                  >>><br>
                  >>>                 I'd like to share some
                  thoughts about my GSoC<br>
                  >>>                 project, "Enhancing bug
                  reporting with static<br>
                  >>>                 backward program
                  slicing"[1].<br>
                  >>><br>
                  >>><br>
                  >>>                 My proposal is very
                  specific about whatI'm aiming to<br>
                  >>>                 improve on, but vague on
                  the howpart of it. This<br>
                  >>>                 mail aims to add clarify
                  some of this.<br>
                  >>><br>
                  >>><br>
                  >>>                 Program slicing is
                  essentially a technique of<br>
                  >>>                 discovering data and
                  control dependencies to the<br>
                  >>>                 slicing criterion, which
                  is a (statement, {set of<br>
                  >>>                 variables}) pair.
                  Fortunately, tools for control<br>
                  >>>                 dependencies, namely,
                  dominator set calculations are<br>
                  >>>                 already implemented, but
                  seems to be unstable with<br>
                  >>>                 clang's CFG. It would be
                  a great tool if I were able<br>
                  >>>                 to fix it.<br>
                  >>><br>
                  >>><br>
                  >>>                 While my proposal states
                  that I plan to implement an<br>
                  >>>                 AST-based solution, I'm
                  actually not sure that this<br>
                  >>>                 is the optimal approach.
                  We could instead inspect<br>
                  >>>                 CFG blocks in a backwards
                  manner (coupled with the<br>
                  >>>                 analyzer's call stack),
                  and gather relevant variable<br>
                  >>>                 in the meanwhile.<br>
                  >>><br>
                  >>><br>
                  >>>                 What would also be really
                  great is to assist this<br>
                  >>>                 traversal with the
                  information the analyzer already<br>
                  >>>                 has, essentially only
                  inspecting basic blocks the<br>
                  >>>                 analyzer actually visits.<br>
                  >>><br>
                  >>><br>
                  >>>                 Here’s my idea for an
                  algorithm (from the point of<br>
                  >>>                 the slicing criterion
                  already being constructed):<br>
                  >>><br>
                  >>>                 <a href="https://docs.google.com/document/d/1Lx867o3meyQsj0WKOSWMdosSBdw2MUq1aIxyPJM6iLU/edit?usp=sharing" rel="noreferrer noreferrer" target="_blank">https://docs.google.com/document/d/1Lx867o3meyQsj0WKOSWMdosSBdw2MUq1aIxyPJM6iLU/edit?usp=sharing</a><br>
                  >>><br>
                  >>><br>
                  >>><br>
                  >>>                 Please note that this is
                  a veeery rough sketch, I<br>
                  >>>                 didn't think about every
                  edge case that exists, but<br>
                  >>>                 I think it's an okay
                  start.<br>
                  >>><br>
                  >>><br>
                  >>>                 [1]<br>
                  >>>                 <a href="https://docs.google.com/document/d/1ci1BCAKojPlqIxIw1J_K2dnATA3z01PuwR_vHJS55TI/edit" rel="noreferrer noreferrer" target="_blank">https://docs.google.com/document/d/1ci1BCAKojPlqIxIw1J_K2dnATA3z01PuwR_vHJS55TI/edit</a><br>
                  >>><br>
                  >><br>
                  ><br>
                  <br>
                </blockquote>
              </div>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
    <br>
  </div>

</blockquote></div></div></div>