<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    Np!<br>
    <br>
    A few self-corrections below.<br>
    <br>
    <div class="moz-cite-prefix">On 12/14/20 6:23 AM, Vince Bridgers
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAEgb0iD4zOoA_cOsVf=UsYJcRgjKHmz-dt5dATK3-bK1k+bwdw@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">Hi Artem, thank you for taking the time to explain
        the background how to chase this down. I'll carefully read
        through the materials and references provided and continue to
        build up knowledge on how to work on this part of the static
        analyzer. 
        <div><br>
        </div>
        <div>Best - Vince </div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Sun, Dec 13, 2020 at 11:55
          PM Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com"
            moz-do-not-send="true">noqnoqneo@gmail.com</a>> wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div> Ok, first of all, the example false positive you're
            debugging is not an example of the bug you're trying to look
            into, but an example of a completely different long-standing
            bug: <a href="https://bugs.llvm.org/show_bug.cgi?id=44114"
              target="_blank" moz-do-not-send="true">https://bugs.llvm.org/show_bug.cgi?id=44114</a>.
            Both bugzilla entries contain a large number of examples to
            choose from. None of those are beginner bugs though, the
            reason why they're not fixed is because they require a
            relatively large amount of work and a relatively good
            understanding of the subject.<br>
            <br>
            For your specific example, no, you didn't identify the buggy
            transition correctly, so you're debugging the part that has
            nothing to do with the false positive. I strongly recommend
            finding the buggy transition in your ExplodedGraph dump
            *before* jumping into the debugger. This usually goes from
            bottom to top, exactly like you're expected to read a static
            analyzer report. Let me explain this procedure on your
            example.<br>
            <br>
            I attached a screenshot of the tail of the ExplodedGraph
            dump for your code (i hope this works fine with the mailing
            list).</div>
        </blockquote>
      </div>
    </blockquote>
    <br>
    Narrator: "It didn't". For future generations: it was
    <a class="moz-txt-link-freetext" href="https://i.imgur.com/Q0ErgPe.png">https://i.imgur.com/Q0ErgPe.png</a><br>
    <br>
    <blockquote type="cite"
cite="mid:CAEgb0iD4zOoA_cOsVf=UsYJcRgjKHmz-dt5dATK3-bK1k+bwdw@mail.gmail.com">
      <div class="gmail_quote">
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div>That's just three last nodes of the graph but they're
            enough to identify the problem.<br>
            <br>
            Node 53 is the bug node, as stated in red; this is the node
            that was produced by generateErrorNode() and fed into the
            BugReport object. The bug report says "Branch condition
            evaluates to a garbage value". The AST expression p[1]
            (which acts as the branch condition which we can confirm by
            looking at the source code) has symbolic value "Undefined".
            Therefore we can conclude that the bug report was emitted
            correctly and the our problem is not in the checker callback
            that emitted the bug report; it was simply fed an incorrect
            State 1337. Since the only reason why the bug report was
            emitted was that the value was Undefined, we have to find a
            node in which "Undefined" was written into the state.<br>
            <br>
            Node 52 doesn't update the state, so we can skip it. In fact
            it's nicely grouped with node 51 so we barely even notice
            it.<br>
            <br>
            Node 51 is the nearest node on which the AST expression p[1]
            was first evaluated into Undefined. Whatever code was
            responsible for evaluating node 51 is directly responsible
            for assigning Undefined value to the expression - which
            ultimately led to the false positive in node 53. It is easy
            to see what was supposed to happen at this node: this is a
            memory load (which is also known in C and C++ formalism as
            "implicit lvalue-to-rvalue conversion" - the operation that
            converts object-as-in-"location" into object-as-in-"data").
            The operation does not change contents of memory, it only
            reads the existing memory.<br>
            <br>
            Static analyzer represents the knowledge of existing memory
            in a data structure known as the Store, aka "Region Store".
            Straightforwardly, it maps "memory regions" (symbolic
            representations of locations) into arbitrary SVals (symbolic
            representations of data). No, Balazs, it doesn't map "values
            to expressions" =/ The Store is written down in the dump and
            labeled accordingly. In our case there are three entries in
            the Store:<br>
            <br>
            (1) Variable "myvar" has 16-bit unsigned integer 4386
            written into it at byte offset 0;<br>
            (2) Variable "p" has a pointer to the first char (hence
            index 0) of variable "myvar" written into it at byte offset
            0.<br>
            (3) Variable "x" has 32-bit signed integer 1 written into it
            at byte offset 0.<br>
            <br>
            The location we're loading from is already evaluated: it's
            the value of AST expression p[1] before the load. The
            Expressions section (aka the Environment) contains two
            entries for p[1] because implicit lvalue-to-rvalue casts do
            not have any visual representation in C, however by matching
            statement identifiers (S780 and S784) with the respective
            program point dumps (S780: ArraySubscriptExpr, S784:
            ImplicitCastExpr (LValueToRValue)) you can easily see that
            S780 is the sub-expression of the load and S784 is the load
            itself. Therefore the location which we're loading from is
            the value of AST expression p[1], namely the *second* char
            of variable "myvar".<br>
            <br>
            Knowing these facts about the memory of our program, we need
            to decide whether the load was performed correctly. If the
            load was performed incorrectly, our job is done: we've found
            the root cause. If the load was performed correctly that'd
            have meant that the state is incorrect and we have to go
            further up in order to find it.<br>
            <br>
            It's fairly obvious that the load was not performed
            correctly. Looking at Store entry (1), it is easy to see
            that the second character of variable "myvar" has value 17
            rather than "Undefined".<br>
            <br>
            Therefore our problem consists entirely in incorrect
            construction of node 51. Now's the good time to jump into
            the debugger, set conditional breakpoint on evalLoad with
            condition "predecessor node has identifier 50" and debug
            from there.<br>
            <br>
            The pointer cast that you were trying to debug was not at
            fault. We can still discuss it separately but it's not what
            really causes this specific false positive to show up.<br>
            <br>
            This entire debugging procedure is very straightforward and
            requires almost no thinking. You simply match static
            analyzer's simulation of the program's behavior to the
            actual runtime behavior of the program and check whether the
            former is a correct abstraction over the latter. Static
            analyzer is just an interpreter (cf. "abstract
            interpretation") so it's very easy to reason about the
            correctness of every transition as long as you know the
            language it's trying to interpret (in this case, C). Which
            is why i made this 2018 conference talk in order to explain
            it.</div>
        </blockquote>
      </div>
    </blockquote>
    <br>
    *2019. The link is <a class="moz-txt-link-freetext" href="https://www.youtube.com/watch?v=g0Mqx1niUi0">https://www.youtube.com/watch?v=g0Mqx1niUi0</a><br>
    The 2018 one is not particularly relevant here.<br>
    <br>
    <blockquote type="cite"
cite="mid:CAEgb0iD4zOoA_cOsVf=UsYJcRgjKHmz-dt5dATK3-bK1k+bwdw@mail.gmail.com">
      <div class="gmail_quote">
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div> The only non-trivial part is knowing the structure of
            the program state (store vs environment, what are regions
            and symbols and how to read them) which is why i did this
            whole workbook thing on that subject in 2016. I wish i did
            it properly so that all the documentation was in one place
            but for now that's where we are i guess.<br>
            <br>
            <div>On 12/7/20 7:40 AM, Balázs Benics via cfe-dev wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div>Ok, you are saying that we should know that `p[1]`
                  points to an object which was initialized. And
                  according to the target's endianness, resolve the
                  DeclRefExpr to the appropriate constant value.</div>
                <div><br>
                </div>
                <div>The dumps are probably from the line `char *p =
                  &myvar;` if I'm right.<br>
                </div>
                <div>That is the reason why the right-hand side is
                  evaluated to the location of `myvar`, but wrapped into
                  an ElementRegion of type char representing the
                  reinterpret cast.</div>
                <div>IMO, it's the correct behavior to this point.<br>
                </div>
                <div><br>
                </div>
                <div>You should probably dig around the evaluation of
                  the `DeclRefExpr` of the expression `p[1]`. That
                  should return your concrete value.<br>
                </div>
                <div>AFAIK, the store maps the values to expressions and
                  the problem probably resides there.<br>
                </div>
                <div>But it's just a wide guess :D</div>
                <div><br>
                </div>
                <div>Unfortunately, that is all I can say about this
                  right now :(<br>
                </div>
                <div><br>
                </div>
                <div>Balazs.<br>
                </div>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">Vince Bridgers via
                  cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org"
                    target="_blank" moz-do-not-send="true">cfe-dev@lists.llvm.org</a>>
                  ezt írta (időpont: 2020. dec. 7., H, 11:54):<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px
                  0px 0.8ex;border-left:1px solid
                  rgb(204,204,204);padding-left:1ex">
                  <div dir="ltr">Hi all, I thought I'd look at this
                    problem <a
                      href="https://bugs.llvm.org/show_bug.cgi?id=43364"
                      target="_blank" moz-do-not-send="true">https://bugs.llvm.org/show_bug.cgi?id=43364</a>
                    (Pointer cast representation problems). maybe use it
                    as an opportunity to dig into the inner workings of
                    the analyzer and maybe even solve a concrete problem
                    :)   But it seems I need guidance about possible
                    paths to pursue. 
                    <div><br>
                    </div>
                    <div>Starting with a concrete case, a simple
                      reproducer is below. I debugged this down to
                      SimpleSValBuilder.cpp:evalCastFromLoc(). val is an
                      SVal and castTy is a Loc, and this code path
                      attempts to extract a concrete integer from the
                      SVal (did I get this right?). So I think a
                      solution to this specific case would be to dig
                      into the SVal to see if casted data is concrete,
                      and extract that data. Seems to me this would be
                      the location for that (evalCastFromLoc seems
                      appropropriate enough). If that's true, can
                      someone point me to an example that's similar?
                      I'll keep digging, but thought I'd ask in case
                      this is easy for someone to drop a few helpful
                      hints. </div>
                    <div><br>
                    </div>
                    <div>I've included some select dumps, state and a bt
                      below in case this is helpful. </div>
                    <div><br>
                    </div>
                    <div>Best</div>
                    <div><br>
                    </div>
                    <div>(gdb) p val.dump()<br>
                      &Element{myvar,0 S64b,char}$2 = void<br>
                    </div>
                    <div><br>
                    </div>
                    <div>(gdb) p castTy.dump()<br>
                      PointerType 0xf931740 'char *'<br>
                      `-BuiltinType 0xf930b50 'char'<br>
                      $3 = void<br>
                    </div>
                    <div><br>
                    </div>
                    <div><br>
                    </div>
                    <div>Invocation: clang -cc1 -analyze
                      -analyzer-checker=core case.c <br>
                    </div>
                    <div><br>
                    </div>
                    <div>struct mystruct {<br>
                    </div>
                    <div>  unsigned short _u16;<br>
                      };<br>
                      <br>
                      int main(void) {<br>
                        struct mystruct myvar = { 0x1122 };<br>
                      <br>
                        char *p = &myvar;<br>
                        int x = 0;<br>
                        if (p[0])<br>
                          x+=1;<br>
                        if (p[1]) // Branch condition evaluates to a
                      garbage value [core.uninitialized.Branch]<br>
                          x+=1;<br>
                        return x;<br>
                      }<br>
                    </div>
                    <div><br>
                    </div>
                    <div>(gdb) p state<br>
                      $4 = {Obj = 0xf9b81b8}<br>
                      (gdb) p state->dump()<br>
                      "program_state": {<br>
                        "store": { "pointer": "0xf9b6fa2", "items": [<br>
                          { "cluster": "myvar", "pointer": "0xf9b2250",
                      "items": [<br>
                            { "kind": "Direct", "offset": 0, "value":
                      "4386 U16b" }<br>
                          ]},<br>
                          { "cluster": "p", "pointer": "0xf9b2aa8",
                      "items": [<br>
                            { "kind": "Direct", "offset": 0, "value":
                      "&Element{myvar,0 S64b,char}" }<br>
                          ]},<br>
                          { "cluster": "x", "pointer": "0xf9b6ed0",
                      "items": [<br>
                            { "kind": "Direct", "offset": 0, "value": "1
                      S32b" }<br>
                          ]}<br>
                        ]},<br>
                        "environment": { "pointer": "0xf9b16e0",
                      "items": [<br>
                          { "lctx_id": 1, "location_context": "#0 Call",
                      "calling": "main", "location": null, "items": [<br>
                            { "stmt_id": 898, "pretty": "p", "value":
                      "&p" }<br>
                          ]}<br>
                        ]},<br>
                        "constraints": null,<br>
                        "dynamic_types": null,<br>
                        "dynamic_casts": null,<br>
                        "constructing_objects": null,<br>
                        "checker_messages": null<br>
                      }$5 = void<br>
                    </div>
                    <div><br>
                    </div>
                    <div><br>
                    </div>
                    <div>(gdb) bt<br>
                      #0  (anonymous
                      namespace)::SimpleSValBuilder::evalCastFromLoc
                      (this=0xf9aedf0, val=..., castTy=...) at
                      ../../clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:136<br>
                      #1  0x0000000007ec6ecb in (anonymous
                      namespace)::SimpleSValBuilder::dispatchCast
                      (this=0xf9aedf0, Val=..., CastTy=...) at
                      ../../clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:70<br>
                      #2  0x0000000007edd75e in
                      clang::ento::StoreManager::CastRetrievedVal
                      (this=0xf9b02a0, V=..., R=0xf9b2aa8, castTy=...)
                      at
                      ../../clang/lib/StaticAnalyzer/Core/Store.cpp:438<br>
                      #3  0x0000000007ea0480 in (anonymous
                      namespace)::RegionStoreManager::getBinding
                      (this=0xf9b02a0, B=..., L=..., T=...) at
                      ../../clang/lib/StaticAnalyzer/Core/RegionStore.cpp:1511<br>
                      #4  0x0000000007e9d059 in (anonymous
                      namespace)::RegionStoreManager::getBinding
                      (this=0xf9b02a0, S=0xf9b6fa2, L=..., T=...) at
                      ../../clang/lib/StaticAnalyzer/Core/RegionStore.cpp:551<br>
                      #5  0x0000000007753856 in
                      clang::ento::ProgramState::getRawSVal
                      (this=0xf9b81b8, LV=..., T=...) at
../../clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h:785<br>
                      #6  0x0000000007e64d61 in
                      clang::ento::ProgramState::getSVal
                      (this=0xf9b81b8, location=..., T=...) at
                      ../../clang/lib/StaticAnalyzer/Core/ProgramState.cpp:267<br>
                      #7  0x0000000007df0d77 in
                      clang::ento::ExprEngine::evalLoad
                      (this=0x7fffffffa828, Dst=..., NodeEx=0xf985e00,
                      BoundEx=0xf985e00, Pred=0xf9b8230, state=...,
                      location=..., tag=0x0, LoadTy=...)<br>
                          at
                      ../../clang/lib/StaticAnalyzer/Core/ExprEngine.cpp:2982<br>
                      #8  0x0000000007e10402 in
                      clang::ento::ExprEngine::VisitCast
                      (this=0x7fffffffa828, CastE=0xf985e00,
                      Ex=0xf985dc0, Pred=0xf9b8230, Dst=...) at
                      ../../clang/lib/StaticAnalyzer/Core/ExprEngineC.cpp:313<br>
                      #9  0x0000000007de88a8 in
                      clang::ento::ExprEngine::Visit
                      (this=0x7fffffffa828, S=0xf985e00, Pred=0xf9b8230,
                      DstTop=...) at
                      ../../clang/lib/StaticAnalyzer/Core/ExprEngine.cpp:1729<br>
                      #10 0x0000000007de4d9c in
                      clang::ento::ExprEngine::ProcessStmt
                      (this=0x7fffffffa828, currStmt=0xf985e00,
                      Pred=0xf9b8230) at
                      ../../clang/lib/StaticAnalyzer/Core/ExprEngine.cpp:791<br>
                      #11 0x0000000007de4a89 in
                      clang::ento::ExprEngine::processCFGElement
                      (this=0x7fffffffa828, E=..., Pred=0xf9b8230,
                      StmtIdx=1, Ctx=0x7fffffffa338) at
                      ../../clang/lib/StaticAnalyzer/Core/ExprEngine.cpp:637<br>
                      #12 0x0000000007db6fa9 in
                      clang::ento::CoreEngine::HandlePostStmt
                      (this=0x7fffffffa848, B=0xf9a8860, StmtIdx=1,
                      Pred=0xf9b8230) at
                      ../../clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:466<br>
                      #13 0x0000000007db6717 in
                      clang::ento::CoreEngine::dispatchWorkItem
                      (this=0x7fffffffa848, Pred=0xf9b8230, Loc=...,
                      WU=...) at
                      ../../clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:191<br>
                      #14 0x0000000007db6323 in
                      clang::ento::CoreEngine::ExecuteWorkList
                      (this=0x7fffffffa848, L=0xf9b16e0, Steps=224975,
                      InitState=...) at
                      ../../clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:147<br>
                      #15 0x000000000734dc74 in
                      clang::ento::ExprEngine::ExecuteWorkList
                      (this=0x7fffffffa828, L=0xf9b16e0, Steps=225000)
                      at
../../clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h:192<br>
                    </div>
                  </div>
                  _______________________________________________<br>
                  cfe-dev mailing list<br>
                  <a href="mailto:cfe-dev@lists.llvm.org"
                    target="_blank" moz-do-not-send="true">cfe-dev@lists.llvm.org</a><br>
                  <a
                    href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev"
                    rel="noreferrer" target="_blank"
                    moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
                </blockquote>
              </div>
              <br>
              <fieldset></fieldset>
              <pre>_______________________________________________
cfe-dev mailing list
<a href="mailto:cfe-dev@lists.llvm.org" target="_blank" moz-do-not-send="true">cfe-dev@lists.llvm.org</a>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" target="_blank" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a>
</pre>
            </blockquote>
            <br>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </body>
</html>