<div dir="ltr"><div>Thank you for giving this this much thought!!! I reply somewhat slowly because I'm trying to keep things fairly formal and precise, an area in which I still have to improve on.</div><div dir="ltr"><div class="gmail_quote"><div><div><br></div><div>So, in short, Example 1. was a poor choice. You're right, the <i>value </i>of <font face="monospace, monospace"><dynamically allocated int object></font> isn't affected by what points to it -- using the traditional Weiser algorithm wouldn't contain statement 5 and 6. In a sense, your mutex code is similar to Example 1, we're not interested, at least in this case, what the actual value of the mutex is.</div></div><div><br></div><div>For now, I'm only planning to use this technique to reason about <i>values </i>of certain variables. Example 1. wanted to demonstrate how BugReporter is prone to make a too short of a bugpath -- however, the following example can show this as well:<br><br></div><div><b>asd.cpp</b></div><div><div><font face="monospace, monospace">1  void useInt(int);</font></div><div><font face="monospace, monospace">2  </font></div><div><font face="monospace, monospace">3  int getInt(int x) {</font></div><div><font face="monospace, monospace">4    int a;</font></div><div><font face="monospace, monospace">5</font></div><div><font face="monospace, monospace">6    if (x > 0)</font></div><div><font face="monospace, monospace">7      a = 3;</font></div><div><font face="monospace, monospace">8    else</font></div><div><font face="monospace, monospace">9      a = 2;</font></div><div><font face="monospace, monospace">10</font></div><div><font face="monospace, monospace">11   return a;</font></div><div><font face="monospace, monospace">12 }</font></div><div><font face="monospace, monospace">13</font></div><div><font face="monospace, monospace">14 int g();</font></div><div><font face="monospace, monospace">15</font></div><div><font face="monospace, monospace">16 int main() {</font></div><div><font 
face="monospace, monospace">17   int arr[10];</font></div><div><font face="monospace, monospace">18</font></div><div><font face="monospace, monospace">19   for (int i = 0; i < 3; ++i)</font></div><div><font face="monospace, monospace">20     arr[i] = 0;</font></div><div><font face="monospace, monospace">21</font></div><div><font face="monospace, monospace">22   int x = g();</font></div><div><font face="monospace, monospace">23   int n = getInt(x);</font></div><div><font face="monospace, monospace">24   useInt(arr[n]);</font></div><div><font face="monospace, monospace">25 }</font></div></div><div><font face="monospace, monospace"><br></font></div><div><div><img src="cid:ii_ju1o03pw0" alt="image.png" width="560" height="555"><br></div></div><div><br></div><div>The attached ExplodedGraph demonstrates that the analyzer assumed that<font face="monospace, monospace"> n == 3</font>, but the bugreport doesn't mention that.</div><div><br></div><div>Alright, with my <i>original point </i>being made properly, let's address the raised questions.</div><div><div><br></div></div><div><b>Q: </b>My primary question is how much do you think this will put stress on the checkers. Like, how much more difficult would it be to write checkers when we require them to include the slicing criterion with their bug reports? How much of it would we (i.e., BugReporter) be able to infer ourselves?</div><div><b><br></b></div><div><b>A: </b>As stated in this response, this approach would only consider the value of variables. This could essentially be regarded as a more general, better approach to <font face="monospace, monospace">bugreporter::trackNullOrUndefValue()</font> and similar functions. It would be great however to make this general enough to eventually also handle whatever properties (e.g. lockedness) regions might have. 
My worry is that it would offer little gain for a lot of work: the current <font face="monospace, monospace">BugReporterVisitor</font> system (especially with your easier-to-implement <font face="monospace, monospace">NoteTag</font> system) gets the job done just fine. Most of it, as far as I know, boils down to "Let's find the specific <font face="monospace, monospace">ExplodedNode</font> where the property of this region was changed", and a simple traversal of the bugpath is all you need for that.</div><div><br></div><div>That being said, I plan to construct the set of variables from the interesting symbol set, and this API is already present in the code. We'll see how that goes though :^).</div><div><br></div><div>However, liveness of variables is a very interesting topic. I really need to do some research on this before elaborating more, but I'll try to see how we could tackle this with backward slicing.</div><div><br><b>Q:</b> Instead we can also have a look at the execution path on the ExplodedGraph that goes through the true-branch and see if the value remains null when it exits the if-branch. Kristof, you're saying that you plan to do the slicing over the ExplodedGraph - is that what you mean? This could work as long as the other branch is actually *explored* by the analyzer. If it isn't, it'll be almost impossible to detect, and i don't know how often would this happen, but there's a good chance it's going to be much more rare than us having problems with such highlighting right now.</div><div><div><br></div><div>We can also come up with a completely separate CFG-based analysis for this purpose. This is probably the most precise thing to do, because the problem is essentially an all-paths problem (which is why loss of coverage in the ExplodedGraph would bite us). 
I cannot estimate how difficult such analysis would be to implement (probably pretty scary), but i think Kristof wasn't looking in this direction.<br></div><br class="gmail-Apple-interchange-newline"></div><div><b>A: </b>Well, I originally imagined implementing this on the ExplodedGraph, but if achievable, a CFG-based solution would indeed be the ideal thing. This is some great feedback -- again, I'll follow up on this after some more research.<br><br></div><div>-----------------</div><div><br></div><div>In the meantime (pretty much until I'm done with the proposal), I won't have the time to interact with Phabricator much though :^)<br></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 3 Apr 2019 at 23:40, Gábor Horváth <<a href="mailto:xazax.hun@gmail.com">xazax.hun@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>I just realized after reading your answer how ambiguous my email was. When I said it is possible to trick the NoStoreFuncVisitor, what I meant was that one can prevent it from emitting notes, so that interesting parts of the bug path get pruned. But I am glad that we are actually on the same page regarding the value of detecting control dependencies. Example 4, which you provided, is a very compelling one :)</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 3 Apr 2019 at 23:16, Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com" target="_blank">noqnoqneo@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF">
    The thing in NoStoreFuncVisitor for IVars looks like it's just
    saying "let's see if there's a data dependency on at least one
    statement in this call". Given that the statement found by such AST
    matcher was not executed on the bug's execution path, there must be
    at least one control flow dependency in this function as well.<br>
    <br>
    I think this pattern-matching may only have false negatives, i.e.
    it's possible to write into a variable/field/ivar without producing
    an assignment operator on a DeclRefExpr for that
    variable/field/ivar, but it's not possible to write an assignment
    into a DeclRefExpr for that variable/field/ivar without modifying
    the variable/field/ivar in the process. A more sophisticated
    analysis may find situations where such a write is always done via an
    aliasing pointer, but i cannot think of anything else. Well, there's
    also the self-assignment corner case, i.e. `x = x`, but that's
    esoteric.<br>
    <br>
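    A small illustration of that false-negative case, assuming a matcher
    that only looks for an assignment whose left-hand side names the
    variable directly. The variable and function names below are made up
    for this sketch:<br>
    <br>
<pre>
```cpp
#include <cassert>

int flag = 0;

// A syntactic search for "flag = ..." finds this write.
void writeDirectly() { flag = 1; }

// The same variable is modified here, but no assignment has a direct
// reference to `flag` on its left-hand side, so such a search misses it.
void writeThroughAlias() {
  int *p = &flag;
  *p = 2;
}

// Both functions really do modify `flag`.
void demo() {
  writeDirectly();
  assert(flag == 1);
  writeThroughAlias();
  assert(flag == 2);
}
```
</pre>
    <br>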
    ------------------------<br>
    <br>
    Ok, i think i understand: one of the great goals for this project
    would be to correctly highlight control flow dependencies, which we
    currently don't do. I.e.:<br>
    <br>
    <b>Example 4.</b><br>
    <br>
    01  int flag;<br>
    02  <br>
    03  bool coin();<br>
    04  <br>
    05  void foo() {<br>
    06    flag = coin();<br>
    07  }<br>
    08  <br>
    09  void bar() {<br>
    10    int *x = 0;<br>
    11    // Set the flag to true.<br>
    12    flag = true;<br>
    13    foo();<br>
    14    if (flag) { // Taking false branch... wait, what?<br>
    15      x = new int;<br>
    16    }<br>
    17    foo();<br>
    18    if (flag) { // Now it's taking true branch again?!<br>
    19      *x = 1; // Null dereference.<br>
    20    }<br>
    21  }<br>
    <br>
    Because we wouldn't highlight that the control flow dependencies on
    lines 14 and 18 in turn have a data dependency on line 06, this true
    positive may be incomprehensible.<br>
    <br>
    This one's tricky to do with a visitor because it's hard to figure
    out that the true-branch of the if-statement on line 14 would have
    updated `x`, given that we didn't execute it.<br>
    <br>
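    For reference, the dependencies in Example 4 can be sketched as
    backward reachability over a dependence graph. The graph below is
    written out by hand for this one example (an assumption of the
    sketch, not something the analyzer computes today), and DepGraph,
    backwardSlice and example4Deps are hypothetical names:<br>
    <br>
<pre>
```cpp
#include <map>
#include <set>
#include <vector>

// Maps a source line to the lines it depends on (data or control).
using DepGraph = std::map<int, std::vector<int>>;

// Collects every line reachable backwards from the criterion line.
std::set<int> backwardSlice(const DepGraph &Deps, int Criterion) {
  std::set<int> Slice;
  std::vector<int> Worklist{Criterion};
  while (!Worklist.empty()) {
    int Line = Worklist.back();
    Worklist.pop_back();
    if (!Slice.insert(Line).second)
      continue;  // already visited
    auto It = Deps.find(Line);
    if (It == Deps.end())
      continue;  // no recorded dependencies
    for (int Dep : It->second)
      Worklist.push_back(Dep);
  }
  return Slice;
}

// Hand-written dependencies for Example 4, criterion (19, x).
DepGraph example4Deps() {
  DepGraph Deps;
  Deps[19] = {18, 10, 15};  // controlled by 18; x is defined on 10 and 15
  Deps[18] = {17};          // reads flag, last written by the call on 17
  Deps[17] = {6};           // foo() writes flag on line 06
  Deps[15] = {14};          // controlled by the branch on 14
  Deps[14] = {13};          // reads flag, last written by the call on 13
  Deps[13] = {6};           // foo() writes flag on line 06
  return Deps;              // line 12 is overwritten by 13, so nothing depends on it
}
```
</pre>
    <br>
    Under these assumptions the slice for line 19 contains line 06 and
    both branch conditions, but not the assignment on line 12.<br>
    <br>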
    (1) Again, we can do a syntactic match over the true-branch, like we
    did in NoStoreFuncVisitor, to see if there are any data dependencies
    within it (there are, line 15). Again, this may fail to cover the
    tricky cases (what if this branch instead calls a function that
    initializes `x`?). <br>
    <br>
    (2) Instead we can also have a look at the execution path on the
    ExplodedGraph that goes through the true-branch and see if the value
    remains null when it exits the if-branch. Kristof, you're saying
    that you plan to do the slicing over the ExplodedGraph - is that
    what you mean? This could work as long as the other branch is
    actually *explored* by the analyzer. If it isn't, it'll be almost
    impossible to detect, and i don't know how often this would happen,
    but there's a good chance it's going to be much rarer than us
    having problems with such highlighting right now.<br>
    <br>
    (3) We can also come up with a completely separate CFG-based
    analysis for this purpose. This is probably the most precise thing
    to do, because the problem is essentially an all-paths problem
    (which is why loss of coverage in the ExplodedGraph would bite us).
    I cannot estimate how difficult such analysis would be to implement
    (probably pretty scary), but i think Kristof wasn't looking in this
    direction.<br>
    <br>
    Yay, i finally understand how to solve these problems we have!<br>
    <br>
    I'd suggest first starting with a syntax-based approach (1) because
    it sounds to me that the approach that we choose for identifying
    dependencies on paths that aren't part of the bug path is orthogonal
    to actually making use of that information to produce notes. Once we
    learn how to make use of this information, we'll have a taste of
    this medicine and see if it works. Then we'll see if we need to
    transition to (2) or (3) depending on such evaluation.<br>
    <br>
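    To make approach (1) concrete, here is a minimal sketch of such a
    syntactic match over a toy statement tree. It does not use Clang's
    AST or its matcher API; Stmt, StmtKind, mayAssign and the two sample
    branches are hypothetical names invented for this illustration:<br>
    <br>
<pre>
```cpp
#include <string>
#include <vector>

// Toy statement tree, standing in for a real AST.
enum class StmtKind { Assign, Call, Compound };

struct Stmt {
  StmtKind Kind;
  std::string Name;            // assigned variable or called function
  std::vector<Stmt> Children;  // only used by Compound
};

// Returns true if the subtree contains a direct assignment to Var.
// Calls are not inspected, which is exactly the tricky case from (1).
bool mayAssign(const Stmt &S, const std::string &Var) {
  if (S.Kind == StmtKind::Assign && S.Name == Var)
    return true;
  for (const Stmt &Child : S.Children)
    if (mayAssign(Child, Var))
      return true;
  return false;
}

// The not-taken true-branch on line 14 of Example 4: { x = new int; }
Stmt exampleBranch() {
  return Stmt{StmtKind::Compound, "", {Stmt{StmtKind::Assign, "x", {}}}};
}

// A branch that would initialize x only through a call: { init(); }
Stmt callOnlyBranch() {
  return Stmt{StmtKind::Compound, "", {Stmt{StmtKind::Call, "init", {}}}};
}
```
</pre>
    <br>
    The match succeeds for the branch from Example 4 and fails for the
    call-only branch, which is the gap that (2) or (3) would have to
    close.<br>
    <br>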
    What i just described sounds like a fairly realistic plan to me. Of
    course, the question still remains about how checker-specific the
    analysis would be. If i learned anything, it's "don't underestimate the
    power of the esoteric checker contracts". But the idea that we can
    often get away with just tracking interesting regions and symbols
    and consuming trackExpressionValue() hints sounds like a good thing
    to try.<br>
    <br>
    <br>
    <br>
    <div class="gmail-m_1040859510329759042gmail-m_3651988806800932803moz-cite-prefix">On 4/2/19 11:23 PM, Gábor Horváth
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div>Yeah, this particular example would be fixed, but we do not
          know how noisy it would be in general. One way to make it a
          bit less noisy could be something like what the
          NoStoreFuncVisitor already does for IVars. We could also add
          some syntactic checks to see if the method/function actually
          has the possibility to escape the region. <br>
        </div>
        <div><br>
        </div>
        <div>I am not familiar with Obj-C, but I suspect it would be
          possible to trick NoStoreFuncVisitor, or at least the
          syntactic checks in `<span class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-pl-en">potentiallyWritesIntoIvar</span>
          `.</div>
        <div>I think the main value of slicing here could be to get rid
          of those fully syntactic heuristics and replace them by
          something smarter. Does this make sense?<br>
        </div>
        <br>
        <div class="gmail_quote">
          <div dir="ltr" class="gmail_attr">On Wed, 3 Apr 2019 at 02:25,
            Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com" target="_blank">noqnoqneo@gmail.com</a>> wrote:<br>
          </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
            <div bgcolor="#FFFFFF"> One more question about Example 1.
              Do you think this particular example could be solved by
              developing some sort of NoStoreFuncVisitor but for
              MallocChecker? I.e., in addition to emitting notes like
              "returning without initializing variable `x`", it could
              emit notes like "returning without escaping pointer `p`".
              If such a visitor is developed, what would be the
              differences between your proposal and whatever behavior
              we'll get with such visitor? In other words, is it
              possible to construct an example with the "use of
              uninitialized variable" checker that, like Example 1, has
              vital pieces of information missing, given that this
              checker already uses NoStoreFuncVisitor?<br>
              <br>
              I mean, it's most likely a terrible idea to develop such a
              visitor for MallocChecker, because the only reason this
              visitor works more or less reliably is the existence of
              const-qualifiers. For MallocChecker they don't exist, so
              it's going to be hard to figure out if a function was
              *supposed* to escape the pointer.<br>
              <br>
              <div class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-m_444863209060437944moz-cite-prefix">On
                4/2/19 5:14 PM, Artem Dergachev wrote:<br>
              </div>
              <blockquote type="cite"> Ahaaaaa, ooooookkkkk, iiiiiiii
                seeeeeeeeee!<br>
                <br>
                Sounds fascinating, must-have and hard.<br>
                <br>
                My primary question is how much do you think this will
                put stress on the checkers. Like, how much more
                difficult would it be to write checkers when we require
                them to include the slicing criterion with their bug
                reports? How much of it would we (i.e., BugReporter) be
                able to infer ourselves?<br>
                <br>
                ------------------------<br>
                <br>
                Say, let's take Example 1. You describe the slicing
                criterion as:<br>
                <br>
                    (13, <dynamically allocated int object>)<br>
                <br>
                The value of the dynamically allocated int object
                remains the same regardless of whether the object is
                stored in global_ptr or not, so the slice doesn't need
                to include line 5 or 6. Therefore i think that the
                slicing criterion you proposed is not what we're looking
                for, and the real slicing criterion we're looking for
                is:<br>
                <br>
                    (13, <liveness of <dynamically allocated int
                object>>)<br>
                <br>
                Like, we're mapping every dynamically allocated object
                to an imaginary heap location that represents its
                current liveness, and include that imaginary variable,
                rather than the object itself, in the slicing criterion.
                Line 6 would affect liveness of the object on line 13
                (if it's stored into a global, it's alive; otherwise
                it's not), therefore it'll be part of the slice. So, am
                i understanding correctly that you're proposing a
                checker API that'll look like this:<br>
                <br>
                  BugReport *R = new BugReport(Node, ...);<br>
                  R->addLivenessBasedSlicingCriterion(HeapSym);<br>
                <br>
                ?<br>
                <br>
                I guess we can infer the statement of the slicing
                criterion from the Node, but i'm not entirely sure; see
                also below.<br>
                <br>
                I'd actually love it if you elaborate this example
                further because it's fairly interesting. Like, we know
                that the assignment affects the liveness information,
                but how would the slicing algorithm figure this out? Do
                you have a step-by-step description of how the algorithm
                behaves in this case?<br>
                <br>
                ------------------------<br>
                <br>
                Let's take another example i just came up with. Consider
                alpha.unix.PthreadLock - a checker that finds various
                bugs with mutexes, such as double locks or double
                unlocks. Consider code:<br>
                <br>
                1  pthread_mutex_t mtx1, mtx2;<br>
                2<br>
                3  void foo(bool x1, bool x2) {<br>
                4    if (x1)<br>
                5      pthread_mutex_lock(&mtx1);<br>
                6    if (x2)<br>
                7      pthread_mutex_lock(&mtx1);<br>
                8    // ...<br>
                9  }<br>
                <br>
                In this example we'll report a double lock bug due to a
                copy-paste error: line 7 should probably lock &mtx2
                rather than &mtx1.<br>
                <br>
                The whole program is relevant to the report. Without
                line 5, there would only be a single lock. It
                transitions the mutex object from state "unknown" to
                state "locked".<br>
                <br>
                I don't think the Analyzer would currently do a bad job
                at explaining the bug; it's pretty straightforward.<br>
                <br>
                What would be the slicing criterion in this case? As far
                as i understand, it'll be<br>
                <br>
                    (7, <locked-ness of mtx1>)<br>
                <br>
                In this case "lockedness" is, again, an "imaginary heap
                location" that contains metadata for the mutex. More
                specifically, it is a GDM map value. How would a checker
                API look in this case? How would it even describe to the
                BugReporter what to look at, given that currently only
                the checker understands how to read its GDM values?<br>
                <br>
                My random guess is:<br>
                <br>
                  BugReport *R = new BugReport(Node, ...);<br>
                 
                R->addGDMBasedSlicingCriterion<MutexState>(MutexMR);<br>
                <br>
                ------------------------<br>
                <br>
                So it sounds to me that your project, like many other
                awesome projects, brings in, as a dependency, replacing
                GDM with a better, smarter data map that the analyzer
                core *can* introspect and manipulate without the direct
                help from the checkers. It might be that your project
                actually requires relatively little of such
                introspection - i.e., you might be able to get away with
                "it's some GDM map and the value for this key has
                changed, therefore this statement is a relevant data
                dependency" for quite a lot of checkers.<br>
                <br>
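                A minimal sketch of that "the value for this key has
                changed" check, assuming a toy model in which each node
                of the bug path carries a plain key-value map. PathNode,
                lookup, findLastChange and the string-typed map are
                hypothetical stand-ins for this illustration, not the
                analyzer's ExplodedNode or GDM types:<br>
                <br>
<pre>
```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One step of a bug path: a source line plus the checker state after it.
struct PathNode {
  int Line;
  std::map<std::string, std::string> State;  // key -> opaque value
};

// Returns the value stored for Key, or "unknown" if the key is absent.
std::string lookup(const PathNode &N, const std::string &Key) {
  auto It = N.State.find(Key);
  return It == N.State.end() ? std::string("unknown") : It->second;
}

// Walks the path backwards from the error node and returns the line of
// the most recent node whose value for Key differs from its
// predecessor's, or -1 if the value never changed.
int findLastChange(const std::vector<PathNode> &Path, const std::string &Key) {
  for (std::size_t I = Path.size(); I-- > 1;)
    if (lookup(Path[I], Key) != lookup(Path[I - 1], Key))
      return Path[I].Line;
  return -1;
}

// Path for the double-lock example: mtx1 becomes "locked" on line 5.
std::vector<PathNode> doubleLockPath() {
  std::vector<PathNode> Path;
  Path.push_back(PathNode{4, {}});
  Path.push_back(PathNode{5, {{"mtx1", "locked"}}});
  Path.push_back(PathNode{6, {{"mtx1", "locked"}}});
  Path.push_back(PathNode{7, {{"mtx1", "locked"}}});
  return Path;
}
```
</pre>
                <br>
                The comparison never interprets the stored value; it
                only needs to tell whether two values are equal, which
                is the limited kind of introspection described above.<br>
                <br>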
                That's the subject i was also bringing up as part of <a class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-m_444863209060437944moz-txt-link-freetext" href="https://reviews.llvm.org/D59861" target="_blank">https://reviews.llvm.org/D59861</a>
                - "this was the plan that is part of the
                even-more-long-term plan of completely deprecating the
                void*-based Generic Data Map...". This was also
                something that became a blocker on my old attempt to
                introduce summary-based analysis. It required stuffing
                fairly large chunks of code into every checker that
                would teach the checker how to collect summary
                information and apply it when the summary is applied,
                and this new boilerplate was fairly brain-damaging to
                implement and hard to get right or even to explain. This
                problem also blocks other projects, such as state
                merging / loop widening (on one branch the mutex is
                locked, on the other branch it is unlocked, hey checker
                please teach me how to merge this).<br>
                <br>
                The variety of slicing criteria (that is a consequence
                of the variety of checkers that we have) sounds pretty
                scary to me. Some of them are really complicated, e.g.
                the liveness criterion. Making sure that the generic
                algorithm works correctly with at least a significant
                chunk of them is going to be fun. Making sure it can
                actually handle arbitrary criteria required by the
                checkers without having every checker come with its own
                boilerplate for slicing also sounds hard. And it'll be
                very bad if we have to tell our checker developers "hey,
                by the way, you also need to know what backward slicing
                is and write down part of the algorithm in order to ever
                get your checker enabled by default" - i'm already
                feeling pretty bad explaining dead symbols and pointer
                escapes (which are pretty much must-have in most
                checkers); these are definitely examples of
                boilerplate that a smarter version of GDM could handle
                automatically. In order to conquer the world, i think we
                should stick to our "writing a checker in 24 hours"
                utopia: writing a checker should be as easy as writing
                down the transfer functions for relevant statements. In
                my opinion, we should take it as an important
                constraint.<br>
                <br>
                <br>
                <br>
                <br>
                <div class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-m_444863209060437944moz-cite-prefix">On
                  4/2/19 12:21 PM, Kristóf Umann wrote:<br>
                </div>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div dir="ltr">Somehow the images and the attached
                      files were left out, please find them here:</div>
                    <br>
                    <div class="gmail_quote">
                      <div dir="ltr" class="gmail_attr">On Tue, 2 Apr
                        2019 at 21:16, Kristóf Umann <<a href="mailto:dkszelethus@gmail.com" target="_blank">dkszelethus@gmail.com</a>>
                        wrote:<br>
                      </div>
                      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                        <div dir="ltr">
                          <div class="gmail_quote">
                            <div dir="ltr">
                              <div dir="ltr">
                                <div dir="ltr">
                                  <div dir="ltr"><span style="background-color:rgb(255,255,255)"><font color="#000000">Hi!<br>
                                        <br>
                                      </font></span>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">In this
                                          letter, I'd like to describe a
                                          particular problem in the
                                          Clang StaticAnalyzer's <font face="monospace, monospace">BugReporter
                                          </font>class that I'd like to
                                          tackle in a Google Summer of
                                          Code project this summer. I'll
                                          show real-world examples of
                                          some of its shortcomings, and
                                          propose a potential solution
                                          using static backward program
                                          slicing. Finally, I'll ask
                                          some specific questions. I'd
                                          love to hear any and all
                                          feedback on this!</font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">This is a <i>problem
                                            statement, </i>not a <i>proposal</i>.
                                          I plan to send out a formal
                                          proposal by Friday (but not
                                          later than Saturday) that
                                          will contain more details on
                                          both the problem and the
                                          solution. I don't introduce
                                          myself or my previous work
                                          within the project here; that will
                                          also be detailed in my
                                          upcoming letter. I also plan
                                          to incorporate the feedback
                                          I'll receive to this letter.</font></span></div>
                                  </div>
                                  <div dir="ltr"><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                      </font></span></div>
                                  <div><b><font style="background-color:rgb(255,255,255)" size="4" color="#000000">---===
                                        BugReporter constructs bad
                                        reports ===---</font></b></div>
                                  <div><b><font style="background-color:rgb(255,255,255)" size="4" color="#000000"><br>
                                      </font></b></div>
                                  <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><b>What does the
                                          BugReporter do?</b><br>
                                        <br>
                                        After the Static Analyzer finds
                                        an error, the <font face="monospace, monospace">BugReporter
                                        </font>receives an <font face="monospace, monospace">ExplodedNode</font>,
                                        which, accompanied by its
                                        predecessors, contains all the
                                        information needed to reproduce
                                        that error. This <i>bugpath </i>is
                                        then shortened with a variety of
                                        heuristics, such as removing
                                        unrelated function calls,
                                        unrelated branches and so on. <font face="monospace, monospace">BugReporter
                                        </font>by the end of this
                                        process will construct a <font face="monospace, monospace">PathDiagnostic
                                        </font>object for each report,
                                        that is, ideally, minimal.</font></span></div>
                                  <div><b style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                      </font></b></div>
                                  <div><b style="background-color:rgb(255,255,255)"><font color="#000000">Example 1.</font></b></div>
                                  <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                      </font></span></div>
                                  <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Consider the
                                        following code example:<br>
                                        <br>
                                      </font></span>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">1  // leak.cpp</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">2  int
                                        *global_ptr;</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">3  </font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">4  void
                                        save_ext(int storeIt, int *ptr)
                                        {</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">5    if
                                        (storeIt)</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">6     
                                        global_ptr = ptr;</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">7  }</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">8</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">9  void test(int
                                        b) {</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">10   int *myptr
                                        = new int;</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">11   save_ext(b,
                                        myptr);</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">12   delete
                                        global_ptr;</font></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">13 }</font></div>
                                  </div>
                                  <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                      </font></span></div>
                                  <div dir="ltr">
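                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">It's clear
                                          that if <font face="monospace, monospace">test</font> is invoked with <font face="monospace, monospace">b</font>'s
                                          value set to true, there is no
                                          error in the program. However,
                                          should <font face="monospace, monospace">b</font> be false, <font face="monospace, monospace">myptr</font> is never
                                          stored in <font face="monospace, monospace">global_ptr</font>, so we'll leak
                                          memory.</font></span></div>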
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <span style="font-family:monospace,monospace;background-color:rgb(255,255,255)"><font color="#000000">$ clang -cc1
                                        -analyze
                                        -analyzer-checker=core,cplusplus
                                        leak.cpp</font></span></div>
                                  <div dir="ltr">
                                    <div><img src="cid:169e5205a23cb971f161" alt="image.png" width="472" height="168"><br>
                                    </div>
                                  </div>
                                  <div dir="ltr"><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000"><br>
                                    </font>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">The Static
                                          Analyzer is able to catch this
                                          error, but fails to mention
                                          the call to <font face="monospace, monospace">save_ext</font> entirely,
                                          despite the error only
                                          occurring because the analyzer
                                          assumed that <font face="monospace, monospace">storeIt</font> is
                                          false. I've also attached the
                                          exploded graph <font face="monospace, monospace">leak.svg </font><font face="arial, helvetica,
                                            sans-serif">that
                                            demonstrates this.</font></font></span></div>
                                    <div><font style="background-color:rgb(255,255,255)" face="arial, helvetica,
                                        sans-serif" color="#000000"><br>
                                      </font></div>
                                    <div>
                                      <div><b style="background-color:rgb(255,255,255)"><font color="#000000">Example 2.</font></b></div>
                                    </div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Consider the
                                          following code example:<br>
                                          <br>
                                          <font face="monospace,
                                            monospace">1  //
                                            divbyzero.cpp</font><br>
                                        </font></span>
                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">2  void f() {</font></div>
                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">3    int i =
                                          0;</font></div>
                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">4    (void)
                                          (10 / i);</font></div>
                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">5  }</font></div>
                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">6</font></div>
                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">7  void g() {
                                          f(); }</font></div>
                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">8  void h() {
                                          g(); }</font></div>
                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">9  void j() {
                                          h(); }</font></div>
                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">10 void k() {
                                          j(); }</font></div>
                                      <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-m_444863209060437944m_6421736142923902491gmail-m_-3454961580602141710gmail-m_-6979117027997739172gmail-m_4426039388547674777gmail-m_5439174890474816058gmail-m_6038659214782379119gmail-Apple-interchange-newline">
                                            It's clear that a call to <font face="monospace, monospace">f</font>
                                            will result in a division by
                                            zero error.</font></span></div>
                                      <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                          </font></span></div>
                                      <span style="background-color:rgb(255,255,255)"><font color="#000000"><span style="font-family:monospace,monospace">$
                                            clang -cc1 -analyze
                                            -analyzer-checker=core </span><span style="font-family:monospace,monospace">divbyzero.cpp</span></font></span></div>
                                    <div><img src="cid:169e5205a24cb971f162" alt="image.png" width="334" height="472"><br>
                                    </div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Again, the
                                          Static Analyzer is smart
                                          enough to catch this,
                                          but the constructed bug report
                                          is littered with a lot of
                                          useless information. It
                                          would be enough to show only
                                          the body of <font face="monospace, monospace">f</font> and, optionally,
                                          where <font face="monospace, monospace">f</font> itself was called. For
                                          the sake of completeness, I've
                                          attached <font face="monospace, monospace">divbyzero.svg</font>, which
                                          contains the exploded graph
                                          for the above code snippet.<br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">The above
                                          examples demonstrate that <font face="monospace, monospace">BugReporter
                                          </font>sometimes reduces the
                                          bugpath too much or too
                                          little.</font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><b><font style="background-color:rgb(255,255,255)" size="4" color="#000000">---===
                                          Solving the above problem with
                                          static backward program
                                          slicing ===---</font></b></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><b style="background-color:rgb(255,255,255)"><font color="#000000">What is static
                                          backward program slicing?</font></b></div>
                                    <div>
                                      <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                          </font></span></div>
                                      <div><span style="background-color:rgb(255,255,255)"><font color="#000000">A <i>program
                                              slice</i> consists of the
                                            parts of a program that
                                            (potentially) affect the
                                            values computed at some
                                            point of interest, called
                                            the slicing criterion. <i><b>Program
                                                slicing</b></i> is a
                                            decomposition technique that
                                            elides program components
                                            not relevant to the slicing
                                            criterion (which is a pair
                                            of (statement, set of
                                            variables)), creating a
                                            program slice[1][2]. <b>Static </b>slicing
                                            preserves the meaning of the
                                            variable(s) in the slicing
                                            criterion for all possible
                                            inputs[1]. <b>Backward </b>slices
                                            answer the question “what
                                            program components might
                                            affect a selected
                                            computation?”[1]</font></span></div>
                                    </div>
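                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                          To make the definition concrete, here is a small illustration that is not taken from the analyzer: the function names are hypothetical, and the comments mark which statements a static backward slice would keep for the criterion (the return statement, {product}). The second function is what remains after the irrelevant statements are elided.</font></span></div>

```cpp
// Hypothetical example program: computes both the sum and the product of 1..n.
int sumAndProduct(int n, int &sum) {
  sum = 0;                       // not in the slice: never affects product
  int product = 1;               // in the slice
  for (int i = 1; i <= n; ++i) { // in the slice: controls the update of product
    sum += i;                    // not in the slice
    product *= i;                // in the slice
  }
  return product;                // slicing criterion: (this statement, {product})
}

// The statements kept by the backward slice, written out as a function.
// For every input n it returns the same product as sumAndProduct does.
int productSlice(int n) {
  int product = 1;
  for (int i = 1; i <= n; ++i)
    product *= i;
  return product;
}
```

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">This slice happens to be executable; as far as bug reports are concerned, only the set of kept statements matters.</font></span></div>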
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">While
                                          statement-minimal slices are
                                          not necessarily unique[3],
                                          Weiser developed a popular
                                          algorithm that constructs a slice.
                                          In essence, his fix-point
                                          algorithm computes the set of <i>relevant
                                            variables</i> for
                                          each edge between node <i>i</i> and
                                          node <i>j</i> in the CFG,
                                          from which it derives the set of <i>relevant
                                            statements</i>. The
                                          fix-point of the relevant
                                          statements set is the slice
                                          itself.</font></span></div>
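                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                          A minimal sketch of this fix-point iteration, in the spirit of Weiser's intraprocedural algorithm, could look like the following. Everything here is an assumption made for illustration: the <font face="monospace, monospace">Stmt</font> structure, the precomputed <font face="monospace, monospace">Infl</font> sets (the statements a branch controls) and the function name are hypothetical, and none of it uses analyzer or Clang CFG types.</font></span></div>

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified CFG node: one statement per node.
struct Stmt {
  std::set<std::string> Def;      // variables written by the statement
  std::set<std::string> Ref;      // variables read by the statement
  std::vector<std::size_t> Succs; // indices of CFG successors
  std::set<std::size_t> Infl;     // statements controlled by this branch
                                  // (empty for non-branch statements)
};

// Returns the indices of the statements in the backward slice for the
// criterion (Crit, Vars). Assumes Crit is a valid index into CFG.
std::set<std::size_t> sliceBackward(const std::vector<Stmt> &CFG,
                                    std::size_t Crit,
                                    const std::set<std::string> &Vars) {
  // Relevant[I]: variables whose value just before statement I may
  // affect the criterion.
  std::vector<std::set<std::string>> Relevant(CFG.size());
  std::set<std::size_t> Slice;
  Relevant[Crit] = Vars;

  bool Changed = true;
  while (Changed) {
    Changed = false;
    // Propagate relevant variables backwards over each edge I -> J.
    for (std::size_t I = 0; I < CFG.size(); ++I) {
      const Stmt &S = CFG[I];
      for (std::size_t J : S.Succs) {
        bool DefinesRelevant = false;
        for (const std::string &V : Relevant[J]) {
          if (S.Def.count(V))
            DefinesRelevant = true; // I writes something relevant at J
          else
            Changed |= Relevant[I].insert(V).second; // passes through I
        }
        if (DefinesRelevant) {
          // I is a relevant statement; what it reads becomes relevant.
          Changed |= Slice.insert(I).second;
          for (const std::string &V : S.Ref)
            Changed |= Relevant[I].insert(V).second;
        }
      }
    }
    // A branch that controls a statement already in the slice is
    // relevant too, together with the variables in its condition.
    for (std::size_t B = 0; B < CFG.size(); ++B) {
      bool ControlsSliced = false;
      for (std::size_t I : CFG[B].Infl)
        if (Slice.count(I))
          ControlsSliced = true;
      if (ControlsSliced) {
        Changed |= Slice.insert(B).second;
        for (const std::string &V : CFG[B].Ref)
          Changed |= Relevant[B].insert(V).second;
      }
    }
  }
  return Slice;
}
```

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">For a five-statement function of the shape "if (x &gt; 0) a = 3; else a = 2; unused = 7; return a;" and the criterion (the return statement, {a}), this sketch keeps the branch and both assignments to a, and drops the assignment to unused. Note that it only collects statements that define relevant variables or control such definitions; the criterion statement itself is not added.</font></span></div>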
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">One of the
                                          characteristics of his
                                          algorithm is that the
                                          resulting program slice will
                                          be executable. However, our
                                          problem doesn't require the
                                          code to be executable, so we
                                          could use a more "aggressive"
                                          approach that creates a
                                          smaller slice. An improvement
                                          to his algorithm is presented
                                          in [4].</font></span></div>
                                    <div>
                                      <div><b style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                          </font></b></div>
                                      <div><b style="background-color:rgb(255,255,255)"><font color="#000000">How does
                                            this relate to BugReporter?</font></b></div>
                                    </div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">We can show
                                          that using Weiser's algorithm,
                                          issues raised in Example 1.
                                          and Example 2. can be improved
                                          upon.</font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">For Example
                                          1., the statement-minimal
                                          program slice with the
                                          criterion (13, <font face="monospace, monospace">&lt;dynamically
                                            allocated int object&gt;</font>)
                                          will contain statements 5
                                          and 6, and for Example 2., the
                                          statement-minimal program
                                          slice with the criterion (4,<font face="monospace, monospace">
                                            i</font>) won't contain
                                          anything but statements 3 and
                                          4. For the latter, we can even
                                          extend the algorithm so that the
                                          slice also contains statement 7, where the
                                          call to <font face="monospace, monospace">f</font> is made.</font></span></div>
                                    <span style="background-color:rgb(255,255,255)"><font color="#000000"><br class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-m_444863209060437944m_6421736142923902491gmail-m_-3454961580602141710gmail-m_-6979117027997739172gmail-m_4426039388547674777gmail-Apple-interchange-newline">
                                        The analyzer, as stated earlier,
                                        gives <font face="monospace,
                                          monospace">BugReporter </font>an
                                        <font face="monospace,
                                          monospace">ExplodedNode</font>,
                                        from which the slicing criterion
                                        must be constructed. The
                                        statement corresponding to this
                                        node, coupled with the
                                        interesting regions the checker
                                        itself marked could be used to
                                        construct this slicing
                                        criterion.</font></span>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><b>Challenges</b><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Although
                                          Weiser's algorithm and the
                                          improvements made by others
                                          are interprocedural, I
                                          imagine that in practice,
                                          getting from an
                                          intraprocedural prototype to an
                                          interprocedural implementation
                                          would be a challenging step.</font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Several
                                          articles also describe
                                          pointers, references, and
                                          dynamically allocated regions,
                                          as well as gotos and other
                                          tricky parts of the language,
                                          but I still expect to see some
                                          skeletons falling out of the
                                          closet when implementing this
                                          for C++, not only C.</font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><b style="background-color:rgb(255,255,255)"><font color="#000000">Drawbacks</font></b></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Static
                                          slicing, as an algorithm that
                                          doesn't know anything about
                                          input values, suffers from the
                                          same issues that all static
                                          techniques do: without
                                          heuristics, it has to make
                                          very rough guesses,
                                          possibly leaving a program
                                          slice that is far too big. However, with
                                          the symbolic values the
                                          analyzer holds, this could be
                                          improved, turning this into <i>conditioned
                                            slicing</i>, as described in
                                          [1]. This, however, is only
                                          planned as follow-up work
                                          after GSoC.</font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">For this
                                          reason, this project would be
                                          developed as an alternative
                                          approach to some of the
                                          techniques used in <font face="monospace, monospace">BugReporter</font>,
                                          as an optional off-by-default
                                          analyzer feature.</font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><font size="4" color="#000000"><b style="background-color:rgb(255,255,255)">---=== Questions ===---</b></font></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">What do you
                                          think of this approach?</font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Do you think
                                          that implementing this
                                          algorithm is an achievable, but
                                          tough enough task for GSoC?</font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Would you
                                          prefer to see a general
                                          program slicing library, or an
                                          analyzer-specific
                                          implementation? Traversing the
                                          <font face="monospace,
                                            monospace">ExplodedGraph </font>would
                                          be far easier in terms of what
                                          I want to achieve. A more
                                          general approach that
                                          traverses the CFG (like <font face="monospace, monospace">llvm::DominatorTree</font><font face="arial, helvetica,
                                            sans-serif">[5]) could be
                                            beneficial to more
                                            developers, but possibly at
                                            the cost of not being able
                                            to improve the prototype
                                            with the symbolic value
                                            information the analyzer
                                            holds.</font></font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><b><i style="background-color:rgb(255,255,255)"><font color="#000000">References,
                                            links</font></i></b></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">[1] <span style="font-size:13px;font-family:Arial,sans-serif">Gallagher,
                                            Keith, and David Binkley.
                                            "Program slicing." </span><i style="font-size:13px;font-family:Arial,sans-serif">2008 Frontiers of
                                            Software Maintenance</i><span style="font-size:13px;font-family:Arial,sans-serif">.</span></font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><span style="font-size:13px;font-family:Arial,sans-serif">[2] </span><span style="font-size:13px;font-family:Arial,sans-serif">Tip, Frank. </span><i style="font-size:13px;font-family:Arial,sans-serif">A survey of program
                                            slicing techniques</i><span style="font-size:13px;font-family:Arial,sans-serif">. Centrum voor
                                            Wiskunde en Informatica,
                                            1994.</span></font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><span style="font-size:13px;font-family:Arial,sans-serif">[3] </span><span style="font-size:13px;font-family:Arial,sans-serif">Weiser, Mark.
                                            "Program slicing." </span><i style="font-size:13px;font-family:Arial,sans-serif">Proceedings of the
                                            5th international conference
                                            on Software engineering</i><span style="font-size:13px;font-family:Arial,sans-serif">. IEEE Press, 1981.</span></font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><span style="font-size:13px;font-family:Arial,sans-serif">[4] </span><span style="font-size:13px;font-family:Arial,sans-serif">Binkley, David.
                                            "Precise executable
                                            interprocedural slices." </span><i style="font-size:13px;font-family:Arial,sans-serif">ACM Letters on
                                            Programming Languages and
                                            Systems (LOPLAS)</i><span style="font-size:13px;font-family:Arial,sans-serif"> 2.1-4
                                            (1993): 31-45.</span></font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><span style="font-size:13px;font-family:Arial,sans-serif">[5] </span><font face="Arial, sans-serif"><a href="http://llvm.org/doxygen/classllvm_1_1DominatorTree.html" target="_blank">http://llvm.org/doxygen/classllvm_1_1DominatorTree.html</a></font></font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Link to
                                          the previous GSoC-related letter I
                                          sent: <a href="http://lists.llvm.org/pipermail/cfe-dev/2019-February/061464.html" target="_blank">http://lists.llvm.org/pipermail/cfe-dev/2019-February/061464.html</a></font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>
                                        </font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Cheers,</font></span></div>
                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Kristóf Umann</font></span></div>
                                  </div>
                                </div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </blockquote>
                    </div>
                  </div>
                </blockquote>
                <br>
              </blockquote>
              <br>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
    <br>
  </div>

</blockquote></div></div>
</blockquote></div>