<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <br>

    <br>

    <div class="moz-cite-prefix">On 9/22/19 10:43 AM, Gavin Cui wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:05C94D72-CF35-4F60-A36E-428A972183D4@gmail.com">


      <div class="">Hi Artem,</div>

      <div class=""><br class="">

      </div>

      <div class="">I am writing a tool for myself. The purpose of this

        tool is to help me prove that such vulnerabilities may exist in some

        applications. (With a different threat model, some originally

        trusted values become untrusted/tainted.) I will manually look at

        the report anyway, so a high false-positive rate is acceptable.

        <br>

      </div>

    </blockquote>

    <br>

    Ok! Yeah, that should get you something, at least :)<br>

    <br>

    <blockquote type="cite"

      cite="mid:05C94D72-CF35-4F60-A36E-428A972183D4@gmail.com">

      <div class="">Thank you for pointing out those cases. I do not

        have a good solution for them yet. Maybe taint analysis with

        symbolic execution is not the best approach for my problem. But

        for the current stage, I just want to have a tool to list some

        potentially vulnerable code so that I can hopefully detect at

        least one true vulnerability.</div>

      <div class=""><br class="">

      </div>

      <div class="">You are right, I cannot get the value of an expression

        once it leaves the current context. But how can I intercept the

        moment a location context gets destroyed?</div>

    </blockquote>

    <br>

    For now that's basically checkEndFunction. Maybe we'll add more

    location contexts in the future, with more fine-grained callbacks

    for them.<br>

    <br>

    <blockquote type="cite"

      cite="mid:05C94D72-CF35-4F60-A36E-428A972183D4@gmail.com">

      <div class=""><br class="">

      </div>

      <div class="">Gavin</div>

      <div><br class="">

        <blockquote type="cite" class="">

          <div class="">On Sep 20, 2019, at 5:34 PM, Artem Dergachev

            &lt;<a href="mailto:noqnoqneo@gmail.com" class=""

              moz-do-not-send="true">noqnoqneo@gmail.com</a>&gt; wrote:</div>

          <br class="Apple-interchange-newline">

          <div class="">


            <div text="#000000" bgcolor="#FFFFFF" class=""> <br

                class="">

              <br class="">

              <div class="moz-cite-prefix">On 9/20/19 1:59 PM, Gavin Cui

                wrote:<br class="">

              </div>

              <blockquote type="cite"

                cite="mid:DBAE6769-5396-4634-9148-D140CF102B53@gmail.com"

                class="">


                <div class="">Thanks for the help,</div>

                <div class="">@Artem, I think the taint propagation is

                  necessary for my problem. I want to analyze whether an

                  untrusted input can somehow affect the control flow of

                  some sensitive function (a tainted source determines

                  whether a sensitive function gets executed or not). The

                  untrusted input can taint other variables and

                  eventually taint the branch condition expression. It

                  still needs to be path-sensitive. For example:</div>

                <div class=""><br class="">

                </div>

                <div class="">config_from_file = parse_config_file() //

                  taint source</div>

                <div class="">/* the tainted value may infect other

                  variables (should_enc) in some paths*/</div>

                <div class="">if (use_default) {</div>

                <div class=""><span class="Apple-tab-span" style="white-space:pre">   </span>config

                  = default_config // in this path, taint does not flow

                  to condition expr</div>

                <div class="">}</div>

                <div class="">else {</div>

                <div class=""><span class="Apple-tab-span" style="white-space:pre">   </span>config

                  = config_from_file // taint flows to config</div>

                <div class="">}</div>

                <div class="">should_enc = (config.secure_level > 10)

                  // taint flows to should_enc</div>

                <div class="">if (should_enc) { // branch is tainted in

                  one path</div>

                <div class=""><span class="Apple-tab-span" style="white-space:pre">   </span>do_encrypt(data)

                  // the execution of sensitive function is affected by

                  taint source in one path.</div>

                <div class="">}</div>

                <div class="">else {  // this block is also tainted if

                  use_default is false</div>

                <div class=""><span class="Apple-tab-span" style="white-space:pre">   </span>...</div>

                <div class="">}  // after exiting the block, everything

                  should be fine.</div>

                <div class="">other_sensitive_func(); // not affected by

                  taint source in both paths</div>

              </blockquote>

              <br class="">

              What about the following test cases?<br class="">

              <br class="">

              // (1):<br class="">

              <br class="">

                if (config.secure_level > 10) // not a control flow

              dependency of the sensitive call!<br class="">

                  should_enc = true; // concrete value, not tainted!<br

                class="">

                else<br class="">

                  should_enc = false; // concrete value, not tainted!<br

                class="">

                if (should_enc) // concrete true or false, not tainted!<br

                class="">

                  do_encrypt(data);<br class="">

              <br class="">

              // (2):<br class="">

              <br class="">

                if (config.secure_level > 10)<br class="">

                  do_encrypt(data);<br class="">

                else<br class="">

                  do_encrypt(data); // encryption is done on both

              branches anyway!<br class="">

              <br class="">

              // (3):<br class="">

              <br class="">

                if (config.secure_level > 10) // tainted symbol

              collapsed to a constant!<br class="">

                  do_unrelated_stuff();<br class="">

                if (config.secure_level > 10) // concrete true or

              false, not tainted!<br class="">

                  do_encrypt(data);<br class="">

              <br class="">
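To see why case (1) slips through, the per-path view can be modeled in a few lines. This is a self-contained sketch, not Analyzer code: Value, Observation and runCase1 are hypothetical names, and it assumes that taint lives only on symbolic values while assigned literals are concrete and untainted.

```cpp
// Hypothetical stand-in for an analyzer value (not a Clang type).
struct Value {
  bool IsConcrete; // true when the engine knows the exact value
  bool Concrete;   // the known value, meaningful only if IsConcrete
  bool Tainted;    // assumption: only symbolic values carry taint
};

// What a checker would see when it reaches the second `if` on one path.
struct Observation {
  bool FirstBranchTainted;  // condition `config.secure_level > 10`
  bool SecondBranchTainted; // condition `should_enc`
  bool CallReached;         // whether do_encrypt(data) runs on this path
};

// Simulates case (1) on the path where the first branch goes TakeTrue.
inline Observation runCase1(bool TakeTrue) {
  // `config.secure_level > 10`: a symbolic comparison on tainted data.
  Value Cond = {false, false, true};
  // `should_enc = true;` or `should_enc = false;`: a literal on either path.
  Value ShouldEnc = {true, TakeTrue, false};
  Observation O = {Cond.Tainted, ShouldEnc.Tainted,
                   ShouldEnc.IsConcrete && ShouldEnc.Concrete};
  return O;
}
```

On both paths SecondBranchTainted comes out false, so a check that only asks whether the condition guarding do_encrypt is tainted stays silent, even though the choice between the two paths was made by tainted data.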

              Basically i want to know not only about the bug you're

              trying to find, but also who your users are and

              what quality requirements you have.<br class="">

              <br class="">

              If you're writing a tool for yourself (eg., for doing a

              security audit of a specific project), you can get away

              with a high false positive rate. If you're making a tool

              for automatic code review that'll point out potential

              security breaches to other developers as they write new

              code, you'll have to make sure your tool doesn't prevent

              the developers from easily writing the secure code that

              they need to write, so a high false positive rate is

              unacceptable, and you'll need to formulate precise rules

              in as simple a manner as possible instead of relying on

              unpredictable emergent behavior. If you're really

              paranoid about security, you should go for a verification

              tool that has a high false positive rate and zero false

              negatives. If you can make your own APIs, you should

              probably make safer APIs that are either taking care of

              the security issues on the type system level or generally

              make life easier for static analysis.<br class="">

              <br class="">

              Also, the Static Analyzer is tweaked for finding very

              pinpointed bugs that can be proven by looking at a

              specific execution path without taking into account the

              surrounding code that didn't get executed on the current

              path. Your question seems to be focused on the difference

              in behavior between the situations in which the branch is

              taken or not, which already requires too much global

              reasoning.<br class="">

              <br class="">

              <blockquote type="cite"

                cite="mid:DBAE6769-5396-4634-9148-D140CF102B53@gmail.com"

                class="">

                <div class="">@Kristof, I think

                  ControlDependencyCalculator might do the trick. I do

                  not need to use a stack structure to track the blocks

                  myself. Here's what I might do:</div>

                <div class="">-in checkPreStmt(const CallExpr *CE,

                  CheckerContext &C) , check if the statement is a

                  sensitive function call</div>

                <div class="">-get cfg from

                  C->ExplodedNode()->getCFG, and create cdc =

                  ControlDependencyCalculator(cfg)</div>

                <div class="">-get dependent blocks from

                  cdc->getControlDependencies(C->ExplodedNode()->getCFGBlock())</div>

                <div class="">-for each returned block, check if the

                  condition expr is tainted in current state. <br

                    class="">

                </div>

              </blockquote>

              <br class="">

              The condition expression is not an active expression at

              this point, so it doesn't have a value at all in the

              current state. You'll have to go back in time, to the

              moment when the condition was evaluated, in order

              to understand what its value was. Which is why your

              original approach was better.<br class="">

              <br class="">

              You may be able to store branch conditions in the program

              state for later use in an Environment-like map, i.e.

              '(Expr *, LocationContext *) -> SVal', clean it up as

              location contexts are destroyed, and get them overwritten

              when looping around in a loop.<br class="">

              <br class="">
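A rough model of such an Environment-like map, as a self-contained sketch rather than checker code. ExprId, ContextId and BranchConditionMap are hypothetical stand-ins for 'const Expr *', 'const LocationContext *' and the stored SVal (reduced here to its taint bit). A real checker would keep this in the immutable program state, for example via REGISTER_MAP_WITH_PROGRAMSTATE, instead of mutating a std::map in place.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical stand-ins for `const Expr *` and `const LocationContext *`.
using ExprId = int;
using ContextId = int;

class BranchConditionMap {
  // (condition, context) -> was the condition tainted when last evaluated?
  std::map<std::pair<ExprId, ContextId>, bool> Tainted;

public:
  // Called when a branch condition is evaluated. Evaluating the same
  // condition again in the same context (the next loop iteration)
  // overwrites the previous entry.
  void record(ExprId Cond, ContextId Ctx, bool IsTainted) {
    Tainted[std::make_pair(Cond, Ctx)] = IsTainted;
  }

  // Called when a context is destroyed (for now: at the end of a function).
  void eraseContext(ContextId Ctx) {
    for (auto It = Tainted.begin(); It != Tainted.end();) {
      if (It->first.second == Ctx)
        It = Tainted.erase(It);
      else
        ++It;
    }
  }

  // True only if the condition is recorded as tainted in this context.
  bool isTainted(ExprId Cond, ContextId Ctx) const {
    auto It = Tainted.find(std::make_pair(Cond, Ctx));
    return It != Tainted.end() && It->second;
  }

  std::size_t size() const { return Tainted.size(); }
};
```

At a sensitive call the checker would then look up the conditions that control the call in the current context and ask isTainted for each of them.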

              Or you can emit a bug on every sensitive function and

              attach a bug visitor to it that will suppress the report

              when it's unable to find the tainted dependency. This is

              probably the easiest way to implement this right now - not

              sure about performance though.<br class="">

              <br class="">
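The suppression logic of such a visitor could look roughly like this. It is a self-contained model only: BranchEvent and shouldSuppress are hypothetical names, the path is assumed to be already flattened into the branches taken on the way to the call, and the set of control dependency blocks is assumed to be computed elsewhere. A real implementation would derive from BugReporterVisitor and inspect ExplodedNodes instead.

```cpp
#include <set>
#include <vector>

// Hypothetical record of one branch taken on the path to the report.
struct BranchEvent {
  int BlockId;           // CFG block whose terminator is this branch
  bool ConditionTainted; // taint status when the branch was evaluated
};

// Walks the path backwards from the sensitive call, as a visitor would,
// and returns true when the report should be suppressed.
inline bool shouldSuppress(const std::vector<BranchEvent> &Path,
                           const std::set<int> &ControlDeps) {
  std::set<int> Seen;
  for (auto It = Path.rbegin(); It != Path.rend(); ++It) {
    if (!ControlDeps.count(It->BlockId))
      continue; // this branch does not control the sensitive call
    if (!Seen.insert(It->BlockId).second)
      continue; // older evaluation of the same branch (earlier iteration)
    if (It->ConditionTainted)
      return false; // found a tainted control dependency: keep the report
  }
  return true; // nothing tainted controls the call: suppress
}
```

Only the most recent evaluation of each controlling branch is considered, which mirrors the "overwritten when looping around" behavior of the map-based variant.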

              <blockquote type="cite"

                cite="mid:DBAE6769-5396-4634-9148-D140CF102B53@gmail.com"

                class=""><br class="">

                <div class="">If ControlDependencyCalculator can

                  correctly calculate the dependence, I think the above

                  steps should work. I am not sure whether the

                  getLastCondition()s returned from the dependency blocks

                  overlap, but that will not affect the result.</div>

                <div class=""><br class="">

                </div>

                <div class="">Gavin</div>

                <div class=""><br class="">

                  <blockquote type="cite" class="">

                    <div class="">On Sep 20, 2019, at 4:00 PM, Kristóf

                      Umann &lt;<a href="mailto:dkszelethus@gmail.com"

                        class="" moz-do-not-send="true">dkszelethus@gmail.com</a>&gt;

                      wrote:</div>

                    <br class="Apple-interchange-newline">

                    <div class="">

                      <div dir="ltr" class="">

                        <div dir="ltr" class=""><br class="">

                        </div>

                        <br class="">

                        <div class="gmail_quote">

                          <div dir="ltr" class="gmail_attr">On Fri, 20

                            Sep 2019 at 21:35, Artem Dergachev &lt;<a

                              href="mailto:noqnoqneo@gmail.com" class=""

                              moz-do-not-send="true">noqnoqneo@gmail.com</a>&gt;

                            wrote:<br class="">

                          </div>

                          <blockquote class="gmail_quote"

                            style="margin:0px 0px 0px

                            0.8ex;border-left:1px solid

                            rgb(204,204,204);padding-left:1ex">

                            <div bgcolor="#FFFFFF" class=""> @Gavin: I'm

                              worried that you're choosing the wrong

                              strategy here. Branches with tainted

                              conditions can be used for sanitizing the

                              input, but it sounds like you want to ban

                              them rather than promote them. That said,

                              i can't figure out what's the right

                              solution for you unless i understand the

                              original problem that you're trying to

                              solve.<br class="">

                              <br class="">

                              @Kristof: Do you think you can implement a

                              checkBeginControlDependentSection /

                              checkEndControlDependentSection callback

                              pair on top of your control dependency

                              tracking mechanisms, so that they behave

                              intuitively and always pair up perfectly

                              with each other, even in the more complicated

                              cases like for-loops and Duff's devices?

                              (there's no indication so far that we

                              really need them - scope contexts are much

                              more valuable and might actually be

                              helpful here as well - but i'm kinda

                              curious).<br class="">

                            </div>

                          </blockquote>

                          <div class=""><br class="">

                          </div>

                          <div class="">I guess so. I'm seeing a couple

                            things to keep track of (inlined function

                            calls to name one), but nothing too bad.</div>

                          <div class=""><br class="">

                          </div>

                          <div class="">It raises (haha) a question

                            about exceptions: if we ever end up

                            supporting them, what happens if an

                            exception is raised? Also, it just came to my

                            mind, should any block with a non-<font

                              class="" face="monospace">noexcept</font><font

                              class="" face="arial, sans-serif"> function

                              call have an edge to the exit block if we

                              take exceptions into account?</font></div>

                          <div class=""> </div>

                          <blockquote class="gmail_quote"

                            style="margin:0px 0px 0px

                            0.8ex;border-left:1px solid

                            rgb(204,204,204);padding-left:1ex">

                            <div bgcolor="#FFFFFF" class="">

                              <div

                                class="gmail-m_-4944449558918191960moz-cite-prefix">On

                                9/20/19 10:46 AM, Kristóf Umann via

                                cfe-dev wrote:<br class="">

                              </div>

                              <blockquote type="cite" class="">

                                <div dir="ltr" class="">

                                  <div dir="ltr" class="">+ Artem

                                    because he knows everything about

                                    the analyzer and symbolic

                                    execution, + Balázs because he is

                                    currently working on TaintChecker.

                                    <div class=""><br class="">

                                    </div>

                                    <div class="">My first instinct here

                                      would be to combine path-sensitive

                                      analysis with control flow

                                      analysis. In the header file <font

                                        class="" face="monospace">clang/include/clang/Analysis/Analyses/Dominators.h</font>

                                      you will find the class <font

                                        class="" face="monospace">ControlDependencyCalculator</font>.

                                      You could calculate the control

                                      dependencies of the block in

                                      which <font class=""

                                        face="monospace">sensitive_func() </font>is

                                      called (you can retrieve that

                                      through the current <font

                                        class="" face="monospace">ExplodedNode</font>)

                                      and find that the <font class=""

                                        face="monospace">CFGBlock</font>

                                      whose <font class=""

                                        face="monospace">getLastCondition()</font>

                                      is <font class=""

                                        face="monospace">value < xxx</font> is

                                      in fact a control dependency.

                                      Then, you could, in theory, check

                                      whether any part of this expression

                                      is tainted.</div>

                                    <div class=""><br class="">

                                    </div>

                                    <div class="">Artem, do you think

                                      this makes any sense?</div>

                                  </div>

                                  <br class="">

                                  <div class="gmail_quote">

                                    <div dir="ltr" class="gmail_attr">On

                                      Fri, 20 Sep 2019 at 16:10, Gavin

                                      Cui via cfe-dev &lt;<a

                                        href="mailto:cfe-dev@lists.llvm.org"

                                        target="_blank" class=""

                                        moz-do-not-send="true">cfe-dev@lists.llvm.org</a>&gt;

                                      wrote:<br class="">

                                    </div>

                                    <blockquote class="gmail_quote"

                                      style="margin:0px 0px 0px

                                      0.8ex;border-left:1px solid

                                      rgb(204,204,204);padding-left:1ex">Hello

                                      all,<br class="">

                                      I want to check if a tainted value

                                      can affect the control flow of

                                      some sensitive functions. For

                                      example:<br class="">

                                      <br class="">

                                      value = taint_source()<br class="">

                                      if (value < xxx) {<br class="">

                                              sensitive_func()<br

                                        class="">

                                      }<br class="">

                                      <br class="">

                                      The taint propagation in the Clang

                                      Static Analyzer fits part of my

                                      needs. One approach I can think of

                                      is: <br class="">

                                      Whenever I encounter a branch

                                      condition (by registering a

                                      checkBranchCondition() callback),

                                      I will push a tag (tainted or not)

                                      to a taintStack variable in

                                      ProgramState.<br class="">

                                      After the branch block is closed, I

                                      will pop one tag. <br class="">

                                      If sensitive_func() is

                                      encountered, I will check all the

                                      tags in taintStack to see if any

                                      of them is tainted.<br class="">

                                      <br class="">

                                      The problem is that I did not find a

                                      callback like

                                      checkBranchCondition() that is

                                      called every time a branch block

                                      is exited. What would be

                                      a good approach for this control

                                      flow checking?<br class="">

                                      <br class="">

                                      Any suggestions would be

                                      appreciated.<br class="">

                                      <br class="">

                                      Thank you,<br class="">

                                      Gavin<br class="">

_______________________________________________<br class="">

                                      cfe-dev mailing list<br class="">

                                      <a

                                        href="mailto:cfe-dev@lists.llvm.org"

                                        target="_blank" class=""

                                        moz-do-not-send="true">cfe-dev@lists.llvm.org</a><br

                                        class="">

                                      <a

                                        href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev"

                                        rel="noreferrer" target="_blank"

                                        class="" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br

                                        class="">

                                    </blockquote>

                                  </div>

                                </div>

                                <br class="">


                              </blockquote>

                              <br class="">

                            </div>

                          </blockquote>

                        </div>

                      </div>

                    </div>

                  </blockquote>

                </div>

                <br class="">

              </blockquote>

              <br class="">

            </div>

          </div>

        </blockquote>

      </div>

      <br class="">

    </blockquote>

    <br>

  </body>

</html>