<div dir="ltr"><div>Thank you for giving this this much thought!!! I reply somewhat slowly because I'm trying to keep things fairly formal and precise, an area in which I still have to improve on.</div><div dir="ltr"><div class="gmail_quote"><div><div><br></div><div>So, in short, Example 1. was a poor choice. You're right, the <i>value </i>of <font face="monospace, monospace"><dynamically allocated int object></font> isn't affected by what points to it -- using the traditional Weiser algorithm wouldn't contain statement 5 and 6. In a sense, your mutex code is similar to Example 1, we're not interested, at least in this case, what the actual value of the mutex is.</div></div><div><br></div><div>For now, I'm only planning to use this technique to reason about <i>values </i>of certain variables. Example 1. wanted to demonstrate how BugReporter is prone to make a too short of a bugpath -- however, the following example can show this as well:<br><br></div><div><b>asd.cpp</b></div><div><div><font face="monospace, monospace">1  void useInt(int);</font></div><div><font face="monospace, monospace">2  </font></div><div><font face="monospace, monospace">3  int getInt(int x) {</font></div><div><font face="monospace, monospace">4    int a;</font></div><div><font face="monospace, monospace">5</font></div><div><font face="monospace, monospace">6    if (x > 0)</font></div><div><font face="monospace, monospace">7      a = 3;</font></div><div><font face="monospace, monospace">8    else</font></div><div><font face="monospace, monospace">9      a = 2;</font></div><div><font face="monospace, monospace">10</font></div><div><font face="monospace, monospace">11   return a;</font></div><div><font face="monospace, monospace">12 }</font></div><div><font face="monospace, monospace">13</font></div><div><font face="monospace, monospace">14 int g();</font></div><div><font face="monospace, monospace">15</font></div><div><font face="monospace, monospace">16 int main() {</font></div><div><font face="monospace, monospace">17   int arr[10];</font></div><div><font face="monospace, monospace">18</font></div><div><font face="monospace, monospace">19   for (int i = 0; i < 3; ++i)</font></div><div><font face="monospace, monospace">20     arr[i] = 0;</font></div><div><font face="monospace, monospace">21</font></div><div><font face="monospace, monospace">22   int x = g();</font></div><div><font face="monospace, monospace">23   int n = getInt(x);</font></div><div><font face="monospace, monospace">24   useInt(arr[n]);</font></div><div><font face="monospace, monospace">25 }</font></div></div><div><font face="monospace, monospace"><br></font></div><div><div><img src="cid:ii_ju1o03pw0" alt="image.png" width="560" height="555"><br></div></div><div><br></div><div>The attached ExplodedGraph demonstrates that the analyzer assumed that<font face="monospace, monospace"> n == 3</font>, but the bugreport doesn't mention that.</div><div><br></div><div>Alright, with my <i>original point </i>being made properly, let's address the raised questions.</div><div><div><br></div></div><div><b>Q: </b>My primary question is how much do you think this will put stress on the checkers. Like, how much more difficult would it be to write checkers when we require them to include the slicing criterion with their bug reports? How much of it would we (i.e., BugReporter) be able to infer ourselves?</div><div><b><br></b></div><div><b>A: </b>As stated in this response, this approach would only consider the value of variables. This could essentially be regarded as a more general, better approach to <font face="monospace, monospace">bugreporter::trackNullOrUndefValue()</font> and similar functions. It would be great however to make this general enough to eventually also handle whatever properties (e.g. lockedness) regions might have. My worry is that it would offer little gain for a lot of chore, the current <font face="monospace, monospace">BugReporterVisitor</font> system (especially with your easier-to-implement <font face="monospace, monospace">NoteTag</font> system) gets the job done just fine. Most of it, as I know, boils down to "Let's find the specific <font face="monospace, monospace">ExplodedNode</font> where the property of this region was changed", and a simple traversal of the bugpath is all you need for that.</div><div><br></div><div>With that being said, I plan to construct the set of variables from the interesting symbol set, and this API is already present in the code. We'll see how that goes though :^).</div><div><br></div><div>However, liveness of variables is a very interesting topic. I really need to do some research on this before elaborating more, but I'll try to see how we could tackle this with backward slicing.</div><div><br><b>Q:</b> Instead we can also have a look at the execution path on the ExplodedGraph that goes through the true-branch and see if the value remains null when it exits the if-branch. Kristof, you're saying that you plan to do the slicing over the ExplodedGraph - is that what you mean? This could work as long as the other branch is actually *explored* by the analyzer. If it isn't, it'll be almost impossible to detect, and i don't know how often would this happen, but there's a good chance it's going to be much more rare than us having problems with such highlighting right now.</div><div><div><br></div><div>We can also come up with a completely separate CFG-based analysis for this purpose. This is probably the most precise thing to do, because the problem is essentially an all-paths problem (which is why loss of coverage in the ExplodedGraph would bite us). I cannot estimate how difficult such analysis would be to implement (probably pretty scary), but i think Kristof wasn't looking in this direction.<br></div><br class="gmail-Apple-interchange-newline"></div><div><b>A: </b>Well, I originally imagined it to implement this on the ExplodedGraph, but if achievable, a CFG based solution would indeed be the ideal thing. This is some great feedback -- Again, I'll follow up with this after some more research.<br><br></div><div>-----------------</div><div><br></div><div>In the meanwhile (pretty much before I'm done with the proposal), I won't have the time to interact with Phabricator much though :^)<br></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 3 Apr 2019 at 23:40, Gábor Horváth <<a href="mailto:xazax.hun@gmail.com">xazax.hun@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>I just realized after reading your answer how ambiguous my email was. When I said it is possible to trick the NoStoreFuncVisitor what I meant was to prevent it from emitting notes and thus pruning interesting parts of the bug path. But I am glad that we are actually on the same page regarding the value of detecting control dependencies. The example 4 you provided is a very compelling one :)</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 3 Apr 2019 at 23:16, Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com" target="_blank">noqnoqneo@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div bgcolor="#FFFFFF">

    The thing in NoStoreFuncVisitor for IVars looks like it's just

    saying "let's see if there's a data dependency on at least one

    statement in this call". Given that the statement found by such AST

    matcher was not executed on the bug's execution path, there must be

    at least one control flow dependency in this function as well.<br>

    <br>

    I think this pattern-matching may only have false negatives, i.e.

    it's possible to write into a variable/field/ivar without producing

    an assignement-operator to a DeclRefExpr for that

    variable/field/ivar, but it's not possible to write an assignment

    into a DeclRefExpr for that variable/field/ivar without modifying

    the variable/field/ivar in the process. A more sophisticated

    analysis may find situations when such write is always done via an

    aliasing pointer, but i cannot think of anything else. Well, there's

    also the self-assignment cornercase, i.e. `x = x`, but that's

    esoteric.<br>

    <br>

    ------------------------<br>

    <br>

    Ok, i think i understand: one of the great goals for this project

    would be to correctly highlight control flow dependencies, which we

    currently don't do. I.e.:<br>

    <br>

    <b>Example 4.</b><br>

    <br>

    01  int flag;<br>

    02  <br>

    03  bool coin();<br>

    04  <br>

    05  void foo() {<br>

    06    flag = coin();<br>

    07  }<br>

    08  <br>

    09  void bar() {<br>

    10    int *x = 0;<br>

    11    // Set the flag to true.<br>

    12    flag = true;<br>

    13    foo();<br>

    14    if (flag) { // Taking false branch... wait, what?<br>

    15      x = new int;<br>

    16    }<br>

    17    foo();<br>

    18    if (flag) { // Now it's taking true branch again?!<br>

    19      *x = 1; // Null dereference.<br>

    20    }<br>

    21  }<br>

    <br>

    Because we wouldn't highlight the data dependency on line 06 of the

    control flow dependency on lines 14 and 18, the positive may be

    incomprehensible.<br>

    <br>

    This one's tricky to do with a visitor because it's hard to figure

    out that the true-branch of the if-statement on line 14 would have

    updated `x`, given that we didn't execute it.<br>

    <br>

    (1) Again, we can do a syntactic match over the true-branch, like we

    did in NoStoreFuncVisitor, to see if there are any data depenencies

    within it (there are, line 15). Again, this may fail to cover the

    tricky cases (what if this branch instead calls a function that

    initializes `x`?). <br>

    <br>

    (2) Instead we can also have a look at the execution path on the

    ExplodedGraph that goes through the true-branch and see if the value

    remains null when it exits the if-branch. Kristof, you're saying

    that you plan to do the slicing over the ExplodedGraph - is that

    what you mean? This could work as long as the other branch is

    actually *explored* by the analyzer. If it isn't, it'll be almost

    impossible to detect, and i don't know how often would this happen,

    but there's a good chance it's going to be much more rare than us

    having problems with such highlighting right now.<br>

    <br>

    (3) We can also come up with a completely separate CFG-based

    analysis for this purpose. This is probably the most precise thing

    to do, because the problem is essentially an all-paths problem

    (which is why loss of coverage in the ExplodedGraph would bite us).

    I cannot estimate how difficult such analysis would be to implement

    (probably pretty scary), but i think Kristof wasn't looking in this

    direction.<br>

    <br>

    Yay, i finally understand how to solve these problems we have!<br>

    <br>

    I'd suggest first starting with a syntax-based approach (1) because

    it sounds to me that the approach that we chose for identifying

    dependencies on paths that aren't part of the bug path is orthogonal

    to actually making use of that information to produce notes. Once we

    learn how to make use of this information, we'll have a taste of

    this medicine and see if it works. Then we'll see if we need to

    transition to (2) or (3) depending on such evaluation.<br>

    <br>

    What i just described sounds like a fairly realistic plan to me. Of

    course, the question still remains about how checker-specific would

    the analysis be. If i learned anything, it's "don't estimate the

    power of the esoteric checker contracts". But the idea that we can

    often get away with just tracking interesting regions and symbols

    and consuming trackExpressionValue() hints sounds like a good thing

    to try.<br>

    <br>

    <br>

    <br>

    <div class="gmail-m_1040859510329759042gmail-m_3651988806800932803moz-cite-prefix">On 4/2/19 11:23 PM, Gábor Horváth

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">

        <div>Yeah, this particular example would be fixed, but we do not

          know how noisy it would be in general. One way to make it a

          bit less noisy could be something like what the

          NoStoreFuncVisitor already does for IVars. We could also add

          some syntactic checks to see if the method/function actually

          has the possibility to escape the region. <br>

        </div>

        <div><br>

        </div>

        <div>I am not familiar with Obj-C, but I suspect it would be

          possible to trick NoStoreFuncVisitor, or at least the

          syntactic checks in `<span class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-pl-en">potentiallyWritesIntoIvar</span>

          `.</div>

        <div>I think the main value of slicing here could be to get rid

          of those fully syntactic heuristics and replace them by

          something smarter. Does this make sense?<br>

        </div>

        <br>

        <div class="gmail_quote">

          <div dir="ltr" class="gmail_attr">On Wed, 3 Apr 2019 at 02:25,

            Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com" target="_blank">noqnoqneo@gmail.com</a>> wrote:<br>

          </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

            <div bgcolor="#FFFFFF"> One more question about Example 1.

              Do you think this particular example could be solved by

              developing some sort of NoStoreFuncVisitor but for

              MallocChecker? I.e., in addition to emitting notes like

              "returning without initializing variable `x`", it could

              emit notes like "returning without escaping pointer `p`".

              If such visitor is developed, what would be the

              differences between your proposal and whatever behavior

              we'll get with such visitor? In other words, is it

              possible to construct an example with the "use of

              uninitialized variable" checker that, like Example 1, has

              vital pieces of information missing, given that this

              checker already uses NoStoreFuncVisitor?<br>

              <br>

              I mean, it's most likely a terrible idea to develop such

              visitor for MallocChecker, because the only reason this

              visitor works more or less reliably is the existence of

              const-qualifiers. For MallocChecker they don't exist, so

              it's going to be hard to figure out if a function was

              *supposed* to escape the pointer.<br>

              <br>

              <div class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-m_444863209060437944moz-cite-prefix">On

                4/2/19 5:14 PM, Artem Dergachev wrote:<br>

              </div>

              <blockquote type="cite"> Ahaaaaa, ooooookkkkk, iiiiiiii

                seeeeeeeeee!<br>

                <br>

                Sounds fascinating, must-have and hard.<br>

                <br>

                My primary question is how much do you think this will

                put stress on the checkers. Like, how much more

                difficult would it be to write checkers when we require

                them to include the slicing criterion with their bug

                reports? How much of it would we (i.e., BugReporter) be

                able to infer ourselves?<br>

                <br>

                ------------------------<br>

                <br>

                Say, let's take Example 1. You describe the slicing

                criterion as:<br>

                <br>

                    (13, <dynamically allocated int object>)<br>

                <br>

                The value of the dynamically allocated into object

                remains the same regardless of whether the object is

                stored in global_ptr or not, so the slice doesn't need

                to include line 5 or 6. Therefore i think that the

                slicing criterion you proposed is not what we're looking

                for, and the real slicing criterion we're looking for

                is:<br>

                <br>

                    (13, <liveness of <dynamically allocated int

                object>>)<br>

                <br>

                Like, we're mapping every dynamically allocated object

                to an imaginary heap location that represents its

                current liveness, and include that imaginary variable,

                rather than the object itself, in the slicing criterion.

                Line 6 would affect liveness of the object on line 13

                (if it's stored into a global, it's alive; otherwise

                it's not), therefore it'll be part of the slice. So, am

                i understanding correctly that you're proposing a

                checker API that'll look like this:<br>

                <br>

                  BugReport *R = new BugReport(Node, ...);<br>

                  R->addLivenessBasedSlicingCriterion(HeapSym);<br>

                <br>

                ?<br>

                <br>

                I guess we can infer the statement of the slicing

                criterion from the Node, but i'm not entirely sure; see

                also below.<br>

                <br>

                I'd actually love it if you elaborate this example

                further because it's fairly interesting. Like, we know

                that the assignment affects the liveness information,

                but how would the slicing algorithm figure this out? Do

                you have a step-by-step description of how the algorithm

                behaves in this case?<br>

                <br>

                ------------------------<br>

                <br>

                Let's take another example i just came up with. Consider

                alpha.unix.PthreadLock - a checker that finds various

                bugs with mutexes, such as double locks or double

                unlocks. Consider code:<br>

                <br>

                1  pthread_mutex_t mtx1, mtx2;<br>

                2<br>

                3  void foo(bool x1, bool x2) {<br>

                4    if (x1)<br>

                5      pthread_mutex_lock(&mtx1);<br>

                6    if (x2)<br>

                7      pthread_mutex_lock(&mtx1);<br>

                8    // ...<br>

                9  }<br>

                <br>

                In this example we'll report a double lock bug due to a

                copy-paste error: line 7 should probably lock &mtx2

                rather than &mtx1.<br>

                <br>

                The whole program is relevant to the report. Without

                line 5, there would only be a single lock. It

                transitions the mutex object from state "unknown" to

                state "locked".<br>

                <br>

                I don't think the Analyzer would currently do a bad job

                at explaining the bug; it's pretty straightforward.<br>

                <br>

                What would be the slicing criterion in this case? As far

                as i understand, it'll be<br>

                <br>

                    (7, <locked-ness of mtx1>)<br>

                <br>

                In this case "lockedness" is, again, an "imaginary heap

                location" that contains metadata for the mutex. More

                specifically, it is a GDM map value. How would a checker

                API look in this case? How would it even describe to the

                BugReporter what to look at, given that currently only

                the checker understands how to read its GDM values?<br>

                <br>

                My random guess is:<br>

                <br>

                  BugReport *R = new BugReport(Node, ...);<br>

                R->addGDMBasedSlicingCriterion<MutexState>(MutexMR);<br>

                <br>

                ------------------------<br>

                <br>

                So it sounds to me that your project is, like many other

                awesome projects, brings in, as a dependency, replacing

                GDM with a better, smarter data map that the analyzer

                core *can* introspect and manipulate without the direct

                help from the checkers. It might be that your project

                actually requires relatively little amounts of such

                introspection - i.e., you might be able to get away with

                "it's some GDM map and the value for this key has

                changed, therefore this statement is a relevant data

                dependency" for quite a lot of checkers.<br>

                <br>

                That's the subject i was also bringing up as part of <a class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-m_444863209060437944moz-txt-link-freetext" href="https://reviews.llvm.org/D59861" target="_blank">https://reviews.llvm.org/D59861</a>

                - "this was the plan that is part of the

                even-more-long-term plan of completely deprecating the

                void*-based Generic Data Map...". This was also

                something that became a blocker on my old attempt to

                introduce summary-based analysis. It required stuffing

                fairly large chunks of code into every checker that

                would teach the checker how to collect summary

                information and apply it when the summary is applied,

                and this new boilerplate was fairly brain-damaging to

                implement and hard to get right or even to explain. This

                problem also blocks other projects, such as state

                merging / loop widening (on one branch the mutex is

                locked, on the other branch it is unlocked, hey checker

                please teach me how to merge this).<br>

                <br>

                The variety of slicing criteria (that is a consequence

                of the variety of checkers that we have) sounds pretty

                scary to me. Some of them are really complicated, eg.

                the liveness criterion. Making sure that the generic

                algorithm works correctly with at least a significant

                chunk of them is going to be fun. Making sure it can

                actually handle arbitrary criterions required by the

                checkers without having every checker come with its own

                boilerplate for slicing also sounds hard. And it'll be

                very bad if we have to tell our checker developers "hey,

                by the way, you also need to know what backward slicing

                is and write down part of the algorithm in order to ever

                get your checker enabled by default" - i'm already

                feeling pretty bad explaining dead symbols and pointer

                escapes (which are pretty much must-have in most

                checkers), these are definitely examples of a

                boilerplate that a smarter version of GDM could handle

                automatically. In order to conquer the world, i think we

                should stick to our "writing a checker in 24 hours"

                utopia: writing a checker should be as easy as writing

                down the transfer functions for relevant statements. In

                my opinion, we should take it as an important

                constraint.<br>

                <br>

                <br>

                <br>

                <br>

                <div class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-m_444863209060437944moz-cite-prefix">On

                  4/2/19 12:21 PM, Kristóf Umann wrote:<br>

                </div>

                <blockquote type="cite">

                  <div dir="ltr">

                    <div dir="ltr">Somehow the images and the attached

                      files were left out, please find them here:</div>

                    <br>

                    <div class="gmail_quote">

                      <div dir="ltr" class="gmail_attr">On Tue, 2 Apr

                        2019 at 21:16, Kristóf Umann <<a href="mailto:dkszelethus@gmail.com" target="_blank">dkszelethus@gmail.com</a>>

                        wrote:<br>

                      </div>

                      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                        <div dir="ltr">

                          <div class="gmail_quote">

                            <div dir="ltr">

                              <div dir="ltr">

                                <div dir="ltr">

                                  <div dir="ltr"><span style="background-color:rgb(255,255,255)"><font color="#000000">Hi!<br>

                                        <br>

                                      </font></span>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">In this

                                          letter, I'd like to describe a

                                          particular problem in the

                                          Clang StaticAnalyzer's <font face="monospace, monospace">BugReporter

                                          </font>class that I'd like to

                                          tackle in a Google Summer of

                                          Code project this summer. I'll

                                          show real-world examples on

                                          some of its shortcomings, and

                                          propose a potential solution

                                          using static backward program

                                          slicing. At last, I'll ask

                                          some specific questions. I'd

                                          love to hear any and all

                                          feedback on this!</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">This is a <i>problem

                                            statement, </i>not a <i>proposal</i>.

                                          I plan to send out a formal

                                          proposal by Friday (but not

                                          later then Saturday), that

                                          will contain more details on

                                          both the problem and the

                                          solution. I don't introduce

                                          myself or my previous works

                                          within the project, that will

                                          also be detailed in my

                                          upcoming letter. I also plan

                                          to incorporate the feedback

                                          I'll receive to this letter.</font></span></div>

                                  </div>

                                  <div dir="ltr"><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                      </font></span></div>

                                  <div><b><font style="background-color:rgb(255,255,255)" size="4" color="#000000">---===

                                        BugReporter constructs bad

                                        reports ===---</font></b></div>

                                  <div><b><font style="background-color:rgb(255,255,255)" size="4" color="#000000"><br>

                                      </font></b></div>

                                  <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><b>What does the

                                          BugReporter do?</b><br>

                                        <br>

                                        After the Static Analyzer found

                                        an error, the <font face="monospace, monospace">BugReporter

                                        </font>receives an <font face="monospace, monospace">ExplodedNode</font>,

                                        which, accompanied by its

                                        predecessors, contains all the

                                        information needed to reproduce

                                        that error. This <i>bugpath </i>is

                                        then shortened with a variety of

                                        heuristics, such as removing

                                        unrelated function calls,

                                        unrelated branches and so on. <font face="monospace, monospace">BugReporter

                                        </font>by the end of this

                                        process will construct a <font face="monospace, monospace">PathDiagnostic

                                        </font>object for each report,

                                        that is, ideally, minimal.</font></span></div>

                                  <div><b style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                      </font></b></div>

                                  <div><b style="background-color:rgb(255,255,255)"><font color="#000000">Example 1.</font></b></div>

                                  <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                      </font></span></div>

                                  <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Consider the

                                        following code example:<br>

                                        <br>

                                      </font></span>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">1  // leak.cpp</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">2  int

                                        *global_ptr;</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">3  </font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">4  void

                                        save_ext(int storeIt, int *ptr)

                                        {</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">5    if

                                        (storeIt)</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">6     

                                        global_ptr = ptr;</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">7  }</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">8</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">9  void test(int

                                        b) {</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">10   int *myptr

                                        = new int;</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">11   save_ext(b,

                                        myptr);</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">12   delete

                                        global_ptr;</font></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">13 }</font></div>

                                  </div>

                                  <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                      </font></span></div>

                                  <div dir="ltr">

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">It's clear

                                          that if test is invoked with <font face="monospace, monospace">b</font>'s

                                          value set to true, there is no

                                          error in the program. However,

                                          should b be false, we'll leak

                                          memory.</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <span style="font-family:monospace,monospace;background-color:rgb(255,255,255)"><font color="#000000">$ clang -cc1

                                        -analyze

                                        -analyzer-checker=core,cplusplus

                                        leak.cpp</font></span></div>

                                  <div dir="ltr">

                                    <div><img src="cid:169e5205a23cb971f161" alt="image.png" width="472" height="168"><br>

                                    </div>

                                  </div>

                                  <div dir="ltr"><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000"><br>

                                    </font>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">The Static

                                          Analyzer is able to catch this

                                          error, but fails to mention

                                          the call to <font face="monospace, monospace">save_ext</font> entirely,

                                          despite the error only

                                          occurring because the analyzer

                                          assumed that <font face="monospace, monospace">storeIt</font> is

                                          false. I've also attached the

                                          exploded graph <font face="monospace, monospace">leak.svg </font><font face="arial, helvetica,

                                            sans-serif">that

                                            demonstrates this.</font></font></span></div>

                                    <div><font style="background-color:rgb(255,255,255)" face="arial, helvetica,

                                        sans-serif" color="#000000"><br>

                                      </font></div>

                                    <div>

                                      <div><b style="background-color:rgb(255,255,255)"><font color="#000000">Example 2.</font></b></div>

                                    </div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Consider the

                                          following code example:<br>

                                          <br>

                                          <font face="monospace,

                                            monospace">1  //

                                            divbyzero.cpp</font><br>

                                        </font></span>

                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">2  void f() {</font></div>

                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">3    int i =

                                          0;</font></div>

                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">4    (void)

                                          (10 / i);</font></div>

                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">5  }</font></div>

                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">6</font></div>

                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">7  void g() {

                                          f(); }</font></div>

                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">8  void h() {

                                          g(); }</font></div>

                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">9  void j() {

                                          h(); }</font></div>

                                      <div><font style="background-color:rgb(255,255,255)" face="monospace, monospace" color="#000000">10 void k() {

                                          j(); }</font></div>

                                      <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-m_444863209060437944m_6421736142923902491gmail-m_-3454961580602141710gmail-m_-6979117027997739172gmail-m_4426039388547674777gmail-m_5439174890474816058gmail-m_6038659214782379119gmail-Apple-interchange-newline">

                                            Its clear that a call to f

                                            will result in a division by

                                            zero error.</font></span></div>

                                      <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                          </font></span></div>

                                      <span style="background-color:rgb(255,255,255)"><font color="#000000"><span style="font-family:monospace,monospace">$

                                            clang -cc1 -analyze

                                            -analyzer-checker=core </span><span style="font-family:monospace,monospace">divbyzero.cpp</span></font></span></div>

                                    <div><img src="cid:169e5205a24cb971f162" alt="image.png" width="334" height="472"><br>

                                    </div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Again, the

                                          Static Analyzer is plenty

                                          smart enough to catch this,

                                          but the constructed bug report

                                          is littered with a lot of

                                          useless information -- it

                                          would be enough to only show

                                          the body of f, and, optionally

                                          where f itself was called. For

                                          the sake of completeness, I've

                                          attached <font face="monospace, monospace">divbyzero.svg</font> that

                                          contains the exploded graph

                                          for the above code snippet.<br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">The above

                                          examples demonstrate that <font face="monospace, monospace">BugReporter

                                          </font>sometimes reduces the

                                          bugpath too much or too

                                          little.</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><b><font style="background-color:rgb(255,255,255)" size="4" color="#000000">---===

                                          Solving the above problem with

                                          static backward program

                                          slicing ===---</font></b></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><b style="background-color:rgb(255,255,255)"><font color="#000000">What is static

                                          backward program slicing?</font></b></div>

                                    <div>

                                      <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                          </font></span></div>

                                      <div><span style="background-color:rgb(255,255,255)"><font color="#000000">A <i>program

                                              slice</i> consists of the

                                            parts of a program that

                                            (potentially) affect the

                                            values computed at some

                                            point of interest, called

                                            the slicing criterion. <i><b>Program

                                                slicing</b></i> is a

                                            decomposition technique that

                                            elides program components

                                            not relevant to the slicing

                                            criterion (which is a pair

                                            of (statement, set of

                                            variables)), creating a

                                            program slice[1][2]. <b>Static </b>slicing

                                            preserves the meaning of the

                                            variable(s) in the slicing

                                            criterion for all possible

                                            inputs[1]. <b>Backward </b>slices

                                            answer the question “what

                                            program components might

                                            affect a selected

                                            computation?”[1]</font></span></div>

                                    </div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">While

                                          statement-minimal slices are

                                          not necessarily unique[3],

                                          Weisel developed a popular

                                          algorithm that constructs one.

                                          In essence, his fix-point

                                          algorithm constructs sets of <i>relevant

                                            variables</i><i> </i>for

                                          each edge in between node <i>i</i> and

                                          node <i>j</i> in a CFG graph,

                                          from which he constructs <i>relevant

                                            statements</i>. The

                                          fix-point of the relevant

                                          statements set is the slice

                                          itself.</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">One of the

                                          characteristic of his

                                          algorithm is that the

                                          resulting program slice will

                                          be executable. However, our

                                          problem doesn't require the

                                          code to be executable, so we

                                          could use a more "aggressive"

                                          approach that creates a

                                          smaller slice. An improvement

                                          to his algorithm is presented

                                          in [4].</font></span></div>

                                    <div>

                                      <div><b style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                          </font></b></div>

                                      <div><b style="background-color:rgb(255,255,255)"><font color="#000000">How does

                                            this relate to BugReporter?</font></b></div>

                                    </div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">We can show

                                          that using Weiser's algorithm,

                                          issues raised in Example 1.

                                          and Example 2. can be improved

                                          upon.</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">For example

                                          1., the statement-minimal

                                          program slice with the

                                          criterion (13, <font face="monospace, monospace"><dynamically

                                            allocated int object></font>)

                                          will contain the statements 5

                                          and 6, and for example 2., the

                                          statement-minimal program

                                          slice with the criterion (4,<font face="monospace, monospace">

                                            i</font>) won't contain

                                          anything but statements 3 and

                                          4. For the latter, we can even

                                          improve the algorithm to also

                                          contain statement 7, where a

                                          call to f is made.</font></span></div>

                                    <span style="background-color:rgb(255,255,255)"><font color="#000000"><br class="gmail-m_1040859510329759042gmail-m_3651988806800932803gmail-m_444863209060437944m_6421736142923902491gmail-m_-3454961580602141710gmail-m_-6979117027997739172gmail-m_4426039388547674777gmail-Apple-interchange-newline">

                                        The analyzer, as stated earlier,

                                        gives <font face="monospace,

                                          monospace">BugReporter </font>an

                                        <font face="monospace,

                                          monospace">ExplodedNode</font>,

                                        from which the slicing criterion

                                        must be constructed. The

                                        statement corresponding to this

                                        node, coupled with the

                                        interesting regions the checker

                                        itself marked could be used to

                                        construct this slicing

                                        criterion.</font></span>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><b>Challenges</b><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">While the

                                          algorithm Weiser developed

                                          along with the improvements

                                          made by others are

                                          interprocedural, I would

                                          imagine that in

                                          implementation, it would be a

                                          challenging step from an

                                          intraprocedural prototype.</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Several

                                          articles also describe

                                          pointers, references, and

                                          dynamically allocated regions,

                                          as well as gotos and other

                                          tricky parts of the language,

                                          but I still expect to see some

                                          skeletons falling out of the

                                          closet when implementing this

                                          for C++, not only C.</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><b style="background-color:rgb(255,255,255)"><font color="#000000">Drawbacks</font></b></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Static

                                          slicing, as an algorithm that

                                          doesn't know anything about

                                          input values, suffers from the

                                          same issues that all static

                                          techniques do, meaning that

                                          without heuristics, it'll have

                                          to do very rough guesses,

                                          possibly leaving a far too big

                                          program slice. However, with

                                          the symbolic values the

                                          analyzer holds, this could be

                                          improved, turning this into <i>conditioned

                                            slicing, </i>as described in

                                          [1]. This however is only

                                          planned as a followup work

                                          after GSoC.</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">For this

                                          reason, this project would be

                                          developed as an alternative

                                          approach to some of the

                                          techniques used in <font face="monospace, monospace">BugReporter</font>,

                                          as an optional off-by-default

                                          analyzer feature.</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><font size="4" color="#000000"><b style="background-color:rgb(255,255,255)">---=== Questions ===---</b></font></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">What do you

                                          think of this approach?</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Do you think

                                          that implementing this

                                          algorithm is achievable, but

                                          tough enough task for GSoC?</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Would you

                                          prefer to see a general

                                          program slicing library, or an

                                          analyzer-specific

                                          implementation? Traversing the

                                          <font face="monospace,

                                            monospace">ExplodedGraph </font>would

                                          be far easier in terms of what

                                          I want to achieve, but a more

                                          general approach that

                                          traverses the CFG (like <font face="monospace, monospace">llvm::DominatorTree</font><font face="arial, helvetica,

                                            sans-serif">[5]) could be

                                            beneficial to more

                                            developers, but possibly at

                                            the cost of not being able

                                            to improve the prototype

                                            with the symbolic value

                                            information the analyzer

                                            holds.</font></font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><b><i style="background-color:rgb(255,255,255)"><font color="#000000">References,

                                            links</font></i></b></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">[1] <span style="font-size:13px;font-family:Arial,sans-serif">Gallagher,

                                            Keith, and David Binkley.

                                            "Program slicing." </span><i style="font-size:13px;font-family:Arial,sans-serif">2008 Frontiers of

                                            Software Maintenance</i><span style="font-size:13px;font-family:Arial,sans-serif">.</span></font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><span style="font-size:13px;font-family:Arial,sans-serif">[2] </span><span style="font-size:13px;font-family:Arial,sans-serif">Tip, Frank. </span><i style="font-size:13px;font-family:Arial,sans-serif">A survey of program

                                            slicing techniques</i><span style="font-size:13px;font-family:Arial,sans-serif">. Centrum voor

                                            Wiskunde en Informatica,

                                            1994.</span></font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><span style="font-size:13px;font-family:Arial,sans-serif">[3] </span><span style="font-size:13px;font-family:Arial,sans-serif">Weiser, Mark.

                                            "Program slicing." </span><i style="font-size:13px;font-family:Arial,sans-serif">Proceedings of the

                                            5th international conference

                                            on Software engineering</i><span style="font-size:13px;font-family:Arial,sans-serif">. IEEE Press, 1981.</span></font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><span style="font-size:13px;font-family:Arial,sans-serif">[4] </span><span style="font-size:13px;font-family:Arial,sans-serif">Binkley, David.

                                            "Precise executable

                                            interprocedural slices." </span><i style="font-size:13px;font-family:Arial,sans-serif">ACM Letters on

                                            Programming Languages and

                                            Systems (LOPLAS)</i><span style="font-size:13px;font-family:Arial,sans-serif"> 2.1-4

                                            (1993): 31-45.</span></font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><span style="font-size:13px;font-family:Arial,sans-serif">[5] </span><font face="Arial, sans-serif"><a href="http://llvm.org/doxygen/classllvm_1_1DominatorTree.html" target="_blank">http://llvm.org/doxygen/classllvm_1_1DominatorTree.html</a></font></font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Link to

                                          previous GSoC related letter I

                                          sent: <a href="http://lists.llvm.org/pipermail/cfe-dev/2019-February/061464.html" target="_blank">http://lists.llvm.org/pipermail/cfe-dev/2019-February/061464.html</a></font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000"><br>

                                        </font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Cheers,</font></span></div>

                                    <div><span style="background-color:rgb(255,255,255)"><font color="#000000">Kristóf Umann</font></span></div>

                                  </div>

                                </div>

                              </div>

                            </div>

                          </div>

                        </div>

                      </blockquote>

                    </div>

                  </div>

                </blockquote>

                <br>

              </blockquote>

              <br>

            </div>

          </blockquote>

        </div>

      </div>

    </blockquote>

    <br>

  </div>

</blockquote></div></div>

</blockquote></div>