<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <br>
    <br>
    <div class="moz-cite-prefix">On 9/22/19 10:43 AM, Gavin Cui wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:05C94D72-CF35-4F60-A36E-428A972183D4@gmail.com">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <div class="">Hi Artem,</div>
      <div class=""><br class="">
      </div>
      <div class="">I am writing a tool for myself. The purpose of this
        tool is to help me prove that such vulnerabilities may exist in
        some applications (under a different threat model, some
        originally trusted values become untrusted/tainted). I will
        manually look at the report anyway, so a high false-positive
        rate is acceptable.
        <br>
      </div>
    </blockquote>
    <br>
    Ok! Yeah, that should get you something, at least :)<br>
    <br>
    <blockquote type="cite"
      cite="mid:05C94D72-CF35-4F60-A36E-428A972183D4@gmail.com">
      <div class="">Thank you for pointing out those cases. I do not
        have a good solution for them yet. Maybe taint analysis with
        symbolic execution is not the best approach for my problem. But
        for the current stage, I just want to have a tool to list some
        potentially vulnerable code so that I can hopefully detect at
        least one true vulnerability.</div>
      <div class=""><br class="">
      </div>
      <div class="">You are right, I cannot get the value of an
        expression once it leaves the current context. But how can I
        intercept the moment a location context gets destroyed?</div>
    </blockquote>
    <br>
    For now that's basically checkEndFunction. Maybe we'll add more
    location contexts in the future, with more fine-grained callbacks
    for them.<br>
    <br>
    <blockquote type="cite"
      cite="mid:05C94D72-CF35-4F60-A36E-428A972183D4@gmail.com">
      <div class=""><br class="">
      </div>
      <div class="">Gavin</div>
      <div><br class="">
        <blockquote type="cite" class="">
          <div class="">On Sep 20, 2019, at 5:34 PM, Artem Dergachev
            <<a href="mailto:noqnoqneo@gmail.com" class=""
              moz-do-not-send="true">noqnoqneo@gmail.com</a>> wrote:</div>
          <br class="Apple-interchange-newline">
          <div class="">
            <meta http-equiv="Content-Type" content="text/html;
              charset=UTF-8" class="">
            <div text="#000000" bgcolor="#FFFFFF" class=""> <br
                class="">
              <br class="">
              <div class="moz-cite-prefix">On 9/20/19 1:59 PM, Gavin Cui
                wrote:<br class="">
              </div>
              <blockquote type="cite"
                cite="mid:DBAE6769-5396-4634-9148-D140CF102B53@gmail.com"
                class="">
                <meta http-equiv="Content-Type" content="text/html;
                  charset=UTF-8" class="">
                <div class="">Thanks for the help,</div>
                <div class="">@Artem, I think taint propagation is
                  necessary for my problem. I want to analyze whether an
                  untrusted input can somehow affect the control flow
                  around some sensitive function (i.e. a tainted source
                  determines whether a sensitive function gets executed
                  or not). The untrusted input can taint other variables
                  and eventually taint the branch condition expression.
                  The analysis still needs to be path-sensitive. For
                  example:</div>
                <div class=""><br class="">
                </div>
                <div class="">config_from_file = parse_config_file() //
                  taint source</div>
                <div class="">/* the tainted value may infect other
                  variables (should_enc) in some paths*/</div>
                <div class="">if (use_default) {</div>
                <div class=""><span class="Apple-tab-span" style="white-space:pre">   </span>config
                  = default_config // in this path, taint does not flow
                  to condition expr</div>
                <div class="">}</div>
                <div class="">else {</div>
                <div class=""><span class="Apple-tab-span" style="white-space:pre">   </span>config
                  = config_from_file // taint flows to config</div>
                <div class="">}</div>
                <div class="">should_enc = (config.secure_level > 10)
                  // taint flows to should_enc</div>
                <div class="">if (should_enc) { // branch is tainted in
                  one path</div>
                <div class=""><span class="Apple-tab-span" style="white-space:pre">   </span>do_encrypt(data)
                  // the execution of sensitive function is affected by
                  taint source in one path.</div>
                <div class="">}</div>
                <div class="">else {  // this block is also tainted if
                  !use_default</div>
                <div class=""><span class="Apple-tab-span" style="white-space:pre">   </span>...</div>
                <div class="">}  // after exiting the block, everything
                  should be fine.</div>
                <div class="">other_sensitive_func(); // not affected by
                  taint source in both paths</div>
              </blockquote>
              <br class="">
              What about the following test cases?<br class="">
              <br class="">
              // (1):<br class="">
              <br class="">
                if (config.secure_level > 10) // not a control flow
              dependency of the sensitive call!<br class="">
                  should_enc = true; // concrete value, not tainted!<br
                class="">
                else<br class="">
                  should_enc = false; // concrete value, not tainted!<br
                class="">
                if (should_enc) // concrete true or false, not tainted!<br
                class="">
                  do_encrypt(data);<br class="">
              <br class="">
              // (2):<br class="">
              <br class="">
                if (config.secure_level > 10)<br class="">
                  do_encrypt(data);<br class="">
                else<br class="">
                  do_encrypt(data); // encryption is done on both
              branches anyway!<br class="">
              <br class="">
              // (3):<br class="">
              <br class="">
                if (config.secure_level > 10) // tainted symbol
              collapsed to a constant!<br class="">
                  do_unrelated_stuff();<br class="">
                if (config.secure_level > 10) // concrete true or
              false, not tainted!<br class="">
                  do_encrypt(data);<br class="">
              <br class="">
              Basically, I want to know not only about the bug you're
              trying to find, but also who your users are and what
              quality requirements you have.<br class="">
              <br class="">
              If you're writing a tool for yourself (e.g., for doing a
              security audit of a specific project), you can get away
              with a high false positive rate. If you're making a tool
              for automatic code review that'll point out potential
              security breaches to other developers as they write new
              code, you'll have to make sure your tool doesn't prevent
              the developers from easily writing the secure code that
              they need to write, so a high false positive rate is
              unacceptable, and you'll need to formulate precise rules
              in as simple a manner as possible instead of relying on
              unpredictable emergent behavior. If you're really
              paranoid about security, you should go for a verification
              tool that has a high false positive rate and zero false
              negatives. If you can design your own APIs, you should
              probably make safer APIs that either take care of the
              security issues at the type system level or generally
              make life easier for static analysis.<br class="">
              <br class="">
              Also, the Static Analyzer is tuned for finding very
              pinpointed bugs that can be proven by looking at a
              specific execution path, without taking into account the
              surrounding code that didn't get executed on the current
              path. Your question seems to be focused on the difference
              in behavior between the situations in which the branch is
              taken or not, which already requires too much global
              reasoning.<br class="">
              <br class="">
              <blockquote type="cite"
                cite="mid:DBAE6769-5396-4634-9148-D140CF102B53@gmail.com"
                class="">
                <div class="">@Kristof, I think
                  ControlDependencyCalculator might do the trick. I do
                  not need to use a stack structure to track the blocks
                  myself. Here's what I might do:</div>
                <div class="">-in checkPreStmt(const CallExpr *CE,
                  CheckerContext &C) , check if the statement is a
                  sensitive function call</div>
                <div class="">-get cfg from
                  C->ExplodedNode()->getCFG, and create cdc =
                  ControlDependencyCalculator(cfg)</div>
                <div class="">-get dependent blocks from
                  cdc->getControlDependencies(C->ExplodedNode()->getCFGBlock())</div>
                <div class="">-for each returned block, check if the
                  condition expr is tainted in current state. <br
                    class="">
                </div>
              </blockquote>
              <br class="">
              The condition expression is not an active expression at
              this point, so it doesn't have a value at all in the
              current state. You'll have to go back in time, to the
              moment when the condition was evaluated, in order to
              understand what its value was. Which is why your original
              approach was better.<br class="">
              <br class="">
              You may be able to store branch conditions in the program
              state for later use in an Environment-like map, i.e.
              '(Expr *, LocationContext *) -> SVal', clean it up as
              location contexts are destroyed, and let entries get
              overwritten when execution goes around a loop.<br class="">
              <br class="">
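              A minimal sketch of such a map, written as a standalone
              C++ model rather than against the real analyzer API. The
              names ExprId, ContextId and BranchConditionMap are
              hypothetical stand-ins (for 'const Expr *', 'const
              LocationContext *' and a program state trait), and a bool
              taint flag stands in for the SVal. In a real checker,
              recordCondition would presumably be driven by
              checkBranchCondition and eraseContext by
              checkEndFunction:<br class="">
<pre>
```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical stand-ins for 'const Expr *' and 'const LocationContext *'.
using ExprId = int;
using ContextId = int;

// Environment-like map: (expression, location context) to "was it tainted?".
// A bool replaces the SVal to keep this model self-contained.
class BranchConditionMap {
  std::map<std::pair<ContextId, ExprId>, bool> Conditions;

public:
  // Record the condition at the moment it is evaluated. Evaluating the same
  // condition again in the same context (e.g. in a loop) overwrites the entry.
  void recordCondition(ExprId E, ContextId LC, bool Tainted) {
    Conditions[std::make_pair(LC, E)] = Tainted;
  }

  // Drop every entry that belongs to a destroyed location context.
  void eraseContext(ContextId LC) {
    for (auto It = Conditions.begin(); It != Conditions.end();) {
      if (It->first.first == LC)
        It = Conditions.erase(It);
      else
        ++It;
    }
  }

  // Look up a stored condition; unknown conditions count as untainted.
  bool isTainted(ExprId E, ContextId LC) const {
    auto It = Conditions.find(std::make_pair(LC, E));
    return It != Conditions.end() && It->second;
  }

  std::size_t size() const { return Conditions.size(); }
};
```
</pre>
              Note that this only covers storage and cleanup; deciding
              which stored conditions a given call actually depends on
              is a separate problem.<br class="">
              <br class="">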
              Or you can emit a bug on every sensitive function and
              attach a bug visitor to it that will suppress the report
              when it's unable to find the tainted dependency. This is
              probably the easiest way to implement this right now - not
              sure about performance though.<br class="">
              <br class="">
              <blockquote type="cite"
                cite="mid:DBAE6769-5396-4634-9148-D140CF102B53@gmail.com"
                class=""><br class="">
                <div class="">If ControlDependencyCalculator can
                  correctly calculate the dependence, I think the above
                  steps should work. I am not sure whether the
                  getLastCondition() results returned from the dependency
                  blocks overlap, but that will not affect the result.</div>
                <div class=""><br class="">
                </div>
                <div class="">Gavin</div>
                <div class=""><br class="">
                  <blockquote type="cite" class="">
                    <div class="">On Sep 20, 2019, at 4:00 PM, Kristóf
                      Umann <<a href="mailto:dkszelethus@gmail.com"
                        class="" moz-do-not-send="true">dkszelethus@gmail.com</a>>
                      wrote:</div>
                    <br class="Apple-interchange-newline">
                    <div class="">
                      <div dir="ltr" class="">
                        <div dir="ltr" class=""><br class="">
                        </div>
                        <br class="">
                        <div class="gmail_quote">
                          <div dir="ltr" class="gmail_attr">On Fri, 20
                            Sep 2019 at 21:35, Artem Dergachev <<a
                              href="mailto:noqnoqneo@gmail.com" class=""
                              moz-do-not-send="true">noqnoqneo@gmail.com</a>>
                            wrote:<br class="">
                          </div>
                          <blockquote class="gmail_quote"
                            style="margin:0px 0px 0px
                            0.8ex;border-left:1px solid
                            rgb(204,204,204);padding-left:1ex">
                            <div bgcolor="#FFFFFF" class=""> @Gavin: I'm
                              worried that you're choosing the wrong
                              strategy here. Branches with tainted
                              conditions can be used for sanitizing the
                              input, but it sounds like you want to ban
                              them rather than promote them. That said,
                              I can't figure out what the right solution
                              is for you unless I understand the
                              original problem that you're trying to
                              solve.<br class="">
                              <br class="">
                              @Kristof: Do you think you can implement a
                              checkBeginControlDependentSection /
                              checkEndControlDependentSection callback
                              pair on top of your control dependency
                              tracking mechanisms, so that they behave
                              intuitively and always pair up perfectly
                              with each other, even in the more
                              complicated cases like for-loops and Duff's
                              devices? (There's no indication so far that
                              we really need them - scope contexts are
                              much more valuable and might actually be
                              helpful here as well - but I'm kinda
                              curious.)<br class="">
                            </div>
                          </blockquote>
                          <div class=""><br class="">
                          </div>
                          <div class="">I guess so. I'm seeing a couple
                            things to keep track of (inlined function
                            calls to name one), but nothing too bad.</div>
                          <div class=""><br class="">
                          </div>
                          <div class="">It raises (haha) a question
                            about exceptions: if we ever end up
                            supporting them, what happens when an
                            exception is raised? Also, it just came to my
                            mind, should any block with a non-<font
                              class="" face="monospace">noexcept</font><font
                              class="" face="arial, sans-serif"> function
                              call have an edge to the exit block if we
                              take exceptions into account?</font></div>
                          <div class=""> </div>
                          <blockquote class="gmail_quote"
                            style="margin:0px 0px 0px
                            0.8ex;border-left:1px solid
                            rgb(204,204,204);padding-left:1ex">
                            <div bgcolor="#FFFFFF" class="">
                              <div
                                class="gmail-m_-4944449558918191960moz-cite-prefix">On
                                9/20/19 10:46 AM, Kristóf Umann via
                                cfe-dev wrote:<br class="">
                              </div>
                              <blockquote type="cite" class="">
                                <div dir="ltr" class="">
                                  <div dir="ltr" class="">+ Artem
                                    because he knows everything about
                                    the analyzer and symbolic
                                    execution, + Balázs because he is
                                    currently working on TaintChecker.
                                    <div class=""><br class="">
                                    </div>
                                    <div class="">My first instinct here
                                      would be to combine path-sensitive
                                      analysis with control flow
                                      analysis. In the header file <font
                                        class="" face="monospace">clang/include/clang/Analysis/Analyses/Dominators.h</font>
                                      you will find the class <font
                                        class="" face="monospace">ControlDependencyCalculator</font>.
                                      You could calculate the control
                                      dependencies of the block in
                                      which <font class=""
                                        face="monospace">sensitive_func() </font>is
                                      called (you can retrieve that
                                      through the current <font
                                        class="" face="monospace">ExplodedNode</font>)
                                      and find that the <font class=""
                                        face="monospace">CFGBlock</font>
                                      whose <font class=""
                                        face="monospace">getLastCondition()</font>
                                      is <font class=""
                                        face="monospace">value < xxx</font> is
                                      in fact a control dependency.
                                      Then, you could, in theory, check
                                      whether any part of this expression
                                      is tainted.</div>
                                    <div class=""><br class="">
                                    </div>
                                    <div class="">Artem, do you think
                                      this makes any sense?</div>
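                                    <div class=""><br class="">
                                    </div>
                                    <div class="">To illustrate what a
                                      control dependency means here, below
                                      is a toy sketch that computes one
                                      from post-dominators on a
                                      hand-written CFG. It does not use
                                      Clang's classes and is not how
                                      ControlDependencyCalculator is
                                      implemented; the names CFG,
                                      computePostDominators and
                                      controlDependencies are made up for
                                      this example, and it assumes the
                                      last block is the exit and every
                                      block can reach it.</div>
<pre>
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: Succs[B] lists the successors of block B. The last block is the
// exit, and every block is assumed to reach it.
using CFG = std::vector<std::vector<std::size_t>>;

// PostDom[B][D] is true when D post-dominates B, i.e. every path from B to
// the exit passes through D. Computed by a simple fixed-point iteration.
std::vector<std::vector<bool>> computePostDominators(const CFG &Succs) {
  assert(!Succs.empty() && "need at least the exit block");
  const std::size_t N = Succs.size();
  const std::size_t Exit = N - 1;
  std::vector<std::vector<bool>> PostDom(N, std::vector<bool>(N, true));
  PostDom[Exit].assign(N, false);
  PostDom[Exit][Exit] = true;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (std::size_t B = 0; B < N; ++B) {
      if (B == Exit)
        continue;
      // Intersect the post-dominator sets of all successors, then add B.
      std::vector<bool> New(N, true);
      for (std::size_t S : Succs[B])
        for (std::size_t D = 0; D < N; ++D)
          New[D] = New[D] && PostDom[S][D];
      New[B] = true;
      if (New != PostDom[B]) {
        PostDom[B] = New;
        Changed = true;
      }
    }
  }
  return PostDom;
}

// Target is control dependent on A when Target post-dominates some successor
// of A but does not strictly post-dominate A itself.
std::vector<std::size_t> controlDependencies(const CFG &Succs,
                                             std::size_t Target) {
  const auto PostDom = computePostDominators(Succs);
  std::vector<std::size_t> Result;
  for (std::size_t A = 0; A < Succs.size(); ++A) {
    if (A != Target && PostDom[A][Target])
      continue; // Target runs after A no matter which branch is taken.
    for (std::size_t S : Succs[A]) {
      if (PostDom[S][Target]) {
        Result.push_back(A);
        break;
      }
    }
  }
  return Result;
}
```
</pre>
                                    <div class="">For the example from
                                      the original mail, with block 0 as
                                      the entry, block 1 ending in the
                                      condition, block 2 calling
                                      sensitive_func() and block 3 as the
                                      exit, the only control dependency
                                      of block 2 is block 1.</div>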
                                  </div>
                                  <br class="">
                                  <div class="gmail_quote">
                                    <div dir="ltr" class="gmail_attr">On
                                      Fri, 20 Sep 2019 at 16:10, Gavin
                                      Cui via cfe-dev <<a
                                        href="mailto:cfe-dev@lists.llvm.org"
                                        target="_blank" class=""
                                        moz-do-not-send="true">cfe-dev@lists.llvm.org</a>>
                                      wrote:<br class="">
                                    </div>
                                    <blockquote class="gmail_quote"
                                      style="margin:0px 0px 0px
                                      0.8ex;border-left:1px solid
                                      rgb(204,204,204);padding-left:1ex">Hello
                                      all,<br class="">
                                      I want to check if a tainted value
                                      can affect the control flow of
                                      some sensitive functions. For
                                      example:<br class="">
                                      <br class="">
                                      value = taint_source()<br class="">
                                      if (value < xxx) {<br class="">
                                              sensitive_func()<br
                                        class="">
                                      }<br class="">
                                      <br class="">
                                      The taint propagation in the Clang
                                      Static Analyzer fits part of my
                                      needs. One approach I can think of
                                      is: <br class="">
                                      Whenever I encounter a branch
                                      condition (by registering a
                                      checkBranchCondition() callback),
                                      I will push a tag (tainted or not)
                                      onto a taintStack variable in
                                      ProgramState.<br class="">
                                      After the branch block is closed, I
                                      will pop one tag. <br class="">
                                      If sensitive_func() is
                                      encountered, I will check all the
                                      tags in taintStack to see if any
                                      of them is tainted.<br class="">
                                      <br class="">
                                      The problem is that I did not find
                                      a callback like
                                      checkBranchCondition() that is
                                      called every time a branch block is
                                      exited. So what would be a good
                                      approach for this control flow
                                      checking?<br class="">
                                      <br class="">
                                      Any suggestions would be
                                      appreciated.<br class="">
                                      <br class="">
                                      Thank you,<br class="">
                                      Gavin<br class="">
_______________________________________________<br class="">
                                      cfe-dev mailing list<br class="">
                                      <a
                                        href="mailto:cfe-dev@lists.llvm.org"
                                        target="_blank" class=""
                                        moz-do-not-send="true">cfe-dev@lists.llvm.org</a><br
                                        class="">
                                      <a
                                        href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev"
                                        rel="noreferrer" target="_blank"
                                        class="" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br
                                        class="">
                                    </blockquote>
                                  </div>
                                </div>
                                <br class="">
                              </blockquote>
                              <br class="">
                            </div>
                          </blockquote>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                </div>
                <br class="">
              </blockquote>
              <br class="">
            </div>
          </div>
        </blockquote>
      </div>
      <br class="">
    </blockquote>
    <br>
  </body>
</html>