<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <br>

    <div class="moz-cite-prefix">On 9/13/18 9:37 PM, Lou Wynn wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:06c73b46-402a-66e9-313c-08c96b51867e@gmail.com">

      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

      <p><font size="+1">Thank you for the updates!<br>

          <br>

          I looked through the project report of using Z3 to remove

          false positives. It is very promising. It's already in Clang

          7.<br>

          <br>

          But my problem is finding specific bugs that the static

          analyzer cannot find. For example, integer overflow which

          involves multiple symbols in an expression. I guess that the

          current static analyzer cannot handle this because Z3 only

          takes findings of the analyzer. The analyzer does not handle

          multiple symbols, so Z3 has no chance to see the entire

          expression even though Z3 can process multiple symbols.<br>

        </font></p>

    </blockquote>

    <br>

    There's one more trick that we didn't try yet: include the state

    that shouldn't be feasible in the report.<br>

    <br>

    For instance, normally we report division by zero only when the

    denominator *must* be zero on the current path. But what we can do

    is emit report from the checker anyway when the denominator is not

    known to be zero (but may potentially be zero), and only actually

    display the report to the user if Z3 agrees that the non-zero state

    is in fact infeasible (contains self-contradictory constraints, even

    if our constraint manager doesn't realize it).<br>

    <br>

    That's one of the potential approaches to finding more bugs with the

    help of Z3 refutation machinery.<br>

    <br>

    Also for integer overflows you may encounter a completely different

    problem that is currently in a worse shape than constraint solving,

    and it's integral cast representation. Static analyzer currently

    models casts by ignoring them, so the solver doesn't ever get a hold

    on this information. You'll need to lift this restriction, but it'll

    immediately upset the existing solver and a lot of other entities in

    the analyzer, so you'll have to make them prepared for seeing casted

    symbols. This may involve implementing the trick i mentioned above

    in all checkers, because otherwise many checkers will fail to find

    most of their bugs.<br>

    <font size="+1"></font><br>

    <blockquote type="cite"

      cite="mid:06c73b46-402a-66e9-313c-08c96b51867e@gmail.com">

      <p><font size="+1"> I can rephrase this to taint propagation and

          integer expression by saying that an expression involving a

          tainted value is likely to cause integer overflow. What is the

          best way to implement this checker if I use this strategy?

          I've noticed that there is a taint propagation checker but

          haven't figured out how to use it in another checker. Is there

          any example code that uses it?<br>

        </font></p>

    </blockquote>

    <br>

    Now, the way we treat taint, we don't ever actually remove taint

    from symbols, but instead we consult both taint information and

    normal program state information before we emit the bug. For

    example, if the denominator is tainted and was not checked to be

    non-zero on the current path, we can report the bug without making

    sure that the denominator must be zero on the current path, just

    knowing that it may be zero is enough. By finding such path we

    already know that the attacker can forge the denominator to be zero

    and bypass all checks.<br>

    <br>

    In other words, taint problems are "per-path" "may" problems, while

    normal problems are "per-path" "must" problems. And for that purpose

    the existing refutation scheme is enough to solve all the problems.

    You still have problems with casts though.<br>

    <font size="+1"></font><br>

    <blockquote type="cite"

      cite="mid:06c73b46-402a-66e9-313c-08c96b51867e@gmail.com">

      <p><font size="+1"> Another thought of combining the static

          analysis and Z3 is developing another static analyzer which

          doesn't use symbolic execution, just abstract interpretation.

          It'll be more scalable but probably cause many false

          positives, then use the current Z3 integration to remove false

          positives. If I'll go this route, where should I start with?<br>

        </font></p>

    </blockquote>

    <br>

    In its current shape the static analyzer is, like, 20 man-years of

    work. Part of that is because it's source-based; if you try to

    analyze, say, LLVM IR instead of Clang AST, you might reduce the

    amount of work you need to do (but it'll be trickier to explain the

    bugs you find to the user in terms of the original source code), but

    that's still a huge investment.<br>

    <br>

    You may also try to re-use transfer functions from our static

    analyzer in your analyzer (i.e., only augment the static analyzer

    with a "state merge" operation). This might work, and that's

    something we consider trying some day, but there are a lot of known

    and unknown unknowns here, so i wouldn't outright recommend rushing

    in this direction either.<br>

    <br>

    <blockquote type="cite"

      cite="mid:06c73b46-402a-66e9-313c-08c96b51867e@gmail.com">

      <pre class="moz-signature" cols="72">Thanks,

Lou

</pre>

      <div class="moz-cite-prefix">On 09/13/2018 05:31 PM, Artem

        Dergachev wrote:<br>

      </div>

      <blockquote type="cite"

        cite="mid:8c53dc57-7cad-4bf4-b94c-936e515b5393@gmail.com">

        <meta http-equiv="Content-Type" content="text/html;

          charset=UTF-8">

        There were slight improvements, but our ad-hoc constraint solver

        quickly becomes unmaintainable (algorithmic complexity

        exponentially explodes) while we try to squeeze more features

        into it.<br>

        <br>

        There was also an attempt to use Z3, i.e. a full-featured

        theorem prover, instead of our ad-hoc solver. Z3 supports

        everything, but makes the analyzer significantly slower (imagine

        10x-20x). This is very experimental and was never supported,

        because it seems to be a dead end due to a huge performance hit.<br>

        <br>

        Finally, last GSoC there was an attempt to use both constraint

        solvers: use ad-hoc solver during analysis (to quickly eliminate

        infeasible paths and report bugs), and then cross-check with Z3

        only when the actual bug report is emitted. This fixes the

        problem when it comes to eliminating false positives, but it

        doesn't allow the analyzer to find new classes of bugs. This

        completely avoids any performance problems and looks very

        promising, and while this is still not officially supported,

        we'll probably be looking more into this to see if we'll be able

        to ship this somehow, probably with a different SMT or SAT

        solver if we run into problems with Z3. See <a

          class="moz-txt-link-freetext"

          href="http://lists.llvm.org/pipermail/cfe-dev/2018-August/058912.html"

          moz-do-not-send="true">http://lists.llvm.org/pipermail/cfe-dev/2018-August/058912.html</a>

        for more details.<br>

        <br>

        So if your main problem is false positives there's much more

        hope to see a solution available soon-ish than if your main

        problem is being able to find these specific bugs.<br>

        <br>

        <div class="moz-cite-prefix">On 9/13/18 3:00 PM, Lou Wynn via

          cfe-dev wrote:<br>

        </div>

        <blockquote type="cite"

          cite="mid:d443daa5-3cc3-ed27-8a1e-dda62c8d3d1b@gmail.com">

          <meta http-equiv="content-type" content="text/html;

            charset=UTF-8">

          <p><font size="+1">Hi,</font></p>

          <p><font size="+1">I have watched the Building a Checker in 24

              Hours slide (<a class="moz-txt-link-freetext"

                href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf"

                moz-do-not-send="true">http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf</a>).

              It mentioned that one limitation of the Constraint Solver

              is that it can't handle multiple symbols (page 83). The

              talk was given in 2012. I'm wondering if this limitation

              has been removed now in 2018.</font><br>

          </p>

          <pre class="moz-signature" cols="72">-- 

Thanks,

Lou

</pre>

          <br>

          <fieldset class="mimeAttachmentHeader"></fieldset>

          <pre class="moz-quote-pre" wrap="">_______________________________________________

cfe-dev mailing list

<a class="moz-txt-link-abbreviated" href="mailto:cfe-dev@lists.llvm.org" moz-do-not-send="true">cfe-dev@lists.llvm.org</a>

<a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" moz-do-not-send="true">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a>

</pre>

        </blockquote>

        <br>

      </blockquote>

      <br>

    </blockquote>

    <br>

  </body>

</html>