<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    Forgot to answer about re-using taint across checkers. It's quite

    trivial because taint tracking is done directly through API of

    ProgramState, which is available everywhere. So you can just say

    "State->isTainted(Sym)" any time and that's it, it'll pick up

    info that was put there by the taint propagation checker.<br>

    <br>

    There's actually a plan to remove this API from ProgramState and

    turn it into a usual program state trait, and then expose it through

    a header, so that not to clutter ProgramState API with special

    getters for all sorts of stuff you can put in there. I might look

    into this soon, but it shouldn't obstruct your progress.<br>

    <br>

    <div class="moz-cite-prefix">On 9/14/18 5:17 PM, Artem Dergachev

      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:25ee088a-ab7c-8cd9-829c-de757bb6c354@gmail.com">

      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

      <br>

      <div class="moz-cite-prefix">On 9/13/18 9:37 PM, Lou Wynn wrote:<br>

      </div>

      <blockquote type="cite"

        cite="mid:06c73b46-402a-66e9-313c-08c96b51867e@gmail.com">

        <meta http-equiv="Content-Type" content="text/html;

          charset=UTF-8">

        <p><font size="+1">Thank you for the updates!<br>

            <br>

            I looked through the project report of using Z3 to remove

            false positives. It is very promising. It's already in Clang

            7.<br>

            <br>

            But my problem is finding specific bugs that the static

            analyzer cannot find. For example, integer overflow which

            involves multiple symbols in an expression. I guess that the

            current static analyzer cannot handle this because Z3 only

            takes findings of the analyzer. The analyzer does not handle

            multiple symbols, so Z3 has no chance to see the entire

            expression even though Z3 can process multiple symbols.<br>

          </font></p>

      </blockquote>

      <br>

      There's one more trick that we didn't try yet: include the state

      that shouldn't be feasible in the report.<br>

      <br>

      For instance, normally we report division by zero only when the

      denominator *must* be zero on the current path. But what we can do

      is emit report from the checker anyway when the denominator is not

      known to be zero (but may potentially be zero), and only actually

      display the report to the user if Z3 agrees that the non-zero

      state is in fact infeasible (contains self-contradictory

      constraints, even if our constraint manager doesn't realize it).<br>

      <br>

      That's one of the potential approaches to finding more bugs with

      the help of Z3 refutation machinery.<br>

      <br>

      Also for integer overflows you may encounter a completely

      different problem that is currently in a worse shape than

      constraint solving, and it's integral cast representation. Static

      analyzer currently models casts by ignoring them, so the solver

      doesn't ever get a hold on this information. You'll need to lift

      this restriction, but it'll immediately upset the existing solver

      and a lot of other entities in the analyzer, so you'll have to

      make them prepared for seeing casted symbols. This may involve

      implementing the trick i mentioned above in all checkers, because

      otherwise many checkers will fail to find most of their bugs.<br>

      <br>

      <blockquote type="cite"

        cite="mid:06c73b46-402a-66e9-313c-08c96b51867e@gmail.com">

        <p><font size="+1"> I can rephrase this to taint propagation and

            integer expression by saying that an expression involving a

            tainted value is likely to cause integer overflow. What is

            the best way to implement this checker if I use this

            strategy? I've noticed that there is a taint propagation

            checker but haven't figured out how to use it in another

            checker. Is there any example code that uses it?<br>

          </font></p>

      </blockquote>

      <br>

      Now, the way we treat taint, we don't ever actually remove taint

      from symbols, but instead we consult both taint information and

      normal program state information before we emit the bug. For

      example, if the denominator is tainted and was not checked to be

      non-zero on the current path, we can report the bug without making

      sure that the denominator must be zero on the current path, just

      knowing that it may be zero is enough. By finding such path we

      already know that the attacker can forge the denominator to be

      zero and bypass all checks.<br>

      <br>

      In other words, taint problems are "per-path" "may" problems,

      while normal problems are "per-path" "must" problems. And for that

      purpose the existing refutation scheme is enough to solve all the

      problems. You still have problems with casts though.<br>

      <br>

      <blockquote type="cite"

        cite="mid:06c73b46-402a-66e9-313c-08c96b51867e@gmail.com">

        <p><font size="+1"> Another thought of combining the static

            analysis and Z3 is developing another static analyzer which

            doesn't use symbolic execution, just abstract

            interpretation. It'll be more scalable but probably cause

            many false positives, then use the current Z3 integration to

            remove false positives. If I'll go this route, where should

            I start with?<br>

          </font></p>

      </blockquote>

      <br>

      In its current shape the static analyzer is, like, 20 man-years of

      work. Part of that is because it's source-based; if you try to

      analyze, say, LLVM IR instead of Clang AST, you might reduce the

      amount of work you need to do (but it'll be trickier to explain

      the bugs you find to the user in terms of the original source

      code), but that's still a huge investment.<br>

      <br>

      You may also try to re-use transfer functions from our static

      analyzer in your analyzer (i.e., only augment the static analyzer

      with a "state merge" operation). This might work, and that's

      something we consider trying some day, but there are a lot of

      known and unknown unknowns here, so i wouldn't outright recommend

      rushing in this direction either.<br>

      <br>

      <blockquote type="cite"

        cite="mid:06c73b46-402a-66e9-313c-08c96b51867e@gmail.com">

        <pre class="moz-signature" cols="72">Thanks,

Lou

</pre>

        <div class="moz-cite-prefix">On 09/13/2018 05:31 PM, Artem

          Dergachev wrote:<br>

        </div>

        <blockquote type="cite"

          cite="mid:8c53dc57-7cad-4bf4-b94c-936e515b5393@gmail.com">

          <meta http-equiv="Content-Type" content="text/html;

            charset=UTF-8">

          There were slight improvements, but our ad-hoc constraint

          solver quickly becomes unmaintainable (algorithmic complexity

          exponentially explodes) while we try to squeeze more features

          into it.<br>

          <br>

          There was also an attempt to use Z3, i.e. a full-featured

          theorem prover, instead of our ad-hoc solver. Z3 supports

          everything, but makes the analyzer significantly slower

          (imagine 10x-20x). This is very experimental and was never

          supported, because it seems to be a dead end due to a huge

          performance hit.<br>

          <br>

          Finally, last GSoC there was an attempt to use both constraint

          solvers: use ad-hoc solver during analysis (to quickly

          eliminate infeasible paths and report bugs), and then

          cross-check with Z3 only when the actual bug report is

          emitted. This fixes the problem when it comes to eliminating

          false positives, but it doesn't allow the analyzer to find new

          classes of bugs. This completely avoids any performance

          problems and looks very promising, and while this is still not

          officially supported, we'll probably be looking more into this

          to see if we'll be able to ship this somehow, probably with a

          different SMT or SAT solver if we run into problems with Z3.

          See <a class="moz-txt-link-freetext"

            href="http://lists.llvm.org/pipermail/cfe-dev/2018-August/058912.html"

            moz-do-not-send="true">http://lists.llvm.org/pipermail/cfe-dev/2018-August/058912.html</a>

          for more details.<br>

          <br>

          So if your main problem is false positives there's much more

          hope to see a solution available soon-ish than if your main

          problem is being able to find these specific bugs.<br>

          <br>

          <div class="moz-cite-prefix">On 9/13/18 3:00 PM, Lou Wynn via

            cfe-dev wrote:<br>

          </div>

          <blockquote type="cite"

            cite="mid:d443daa5-3cc3-ed27-8a1e-dda62c8d3d1b@gmail.com">

            <meta http-equiv="content-type" content="text/html;

              charset=UTF-8">

            <p><font size="+1">Hi,</font></p>

            <p><font size="+1">I have watched the Building a Checker in

                24 Hours slide (<a class="moz-txt-link-freetext"

                  href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf"

                  moz-do-not-send="true">http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf</a>).

                It mentioned that one limitation of the Constraint

                Solver is that it can't handle multiple symbols (page

                83). The talk was given in 2012. I'm wondering if this

                limitation has been removed now in 2018.</font><br>

            </p>

            <pre class="moz-signature" cols="72">-- 

Thanks,

Lou

</pre>

            <br>

            <fieldset class="mimeAttachmentHeader"></fieldset>

            <pre class="moz-quote-pre" wrap="">_______________________________________________

cfe-dev mailing list

<a class="moz-txt-link-abbreviated" href="mailto:cfe-dev@lists.llvm.org" moz-do-not-send="true">cfe-dev@lists.llvm.org</a>

<a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" moz-do-not-send="true">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a>

</pre>

          </blockquote>

          <br>

        </blockquote>

        <br>

      </blockquote>

      <br>

    </blockquote>

    <br>

  </body>

</html>