[cfe-dev] Clang Static Analyzer conditional terminating call back

Fri Sep 20 14:34:16 PDT 2019

On 9/20/19 1:59 PM, Gavin Cui wrote:
> Thanks for the help,
> @Artem, I think the taint propagation is necessary for my problem. I 
> want to analyze if an untrust input can somehow affect the control 
> flow of some sensitive function (tainted source determine whether a 
> sensitive function get executed or not). The untrusted input can taint 
> other variables and eventually taint the branch condition expression. 
> It still needs to be path sensitive. For example:
>
> config_from_file = parse_config_file() // taint source
> /* the tainted value may infect other variables (should_enc) in some 
> paths*/
> if (use_default) {
> config = default_config // in this path, taint does not flow to 
> condition expr
> }
> else {
> config = config_from_file // taint flow to config
> }
> should_enc = (config.secure_level > 10) // taint flow to should_enc
> if (should_enc) { // branch is tainted in one path
> do_encrypt(data) // the execution of sensitive function is affected by 
> taint source in one path.
> }
> else {  // this block is also tainted if use_default
> ...
> }  // after exiting the block, everything should be fine.
> other_sensitive_func(); // not affected by taint source in both paths

What about the following test cases?

// (1):

   if (config.secure_level > 10) // not a control flow dependency of the 
sensitive call!
     should_enc = true; // concrete value, not tainted!
   else
     should_enc = false; // concrete value, not tainted!
   if (should_enc) // concrete true or false, not tainted!
     do_encrypt(data);

// (2):

   if (config.secure_level > 10)
     do_encrypt(data);
   else
     do_encrypt(data); // encryption is done on both branches anyway!

// (3):

   if (config.secure_level > 10) // tainted symbol collapsed to a constant!
     do_unrelated_stuff();
   if (config.secure_level > 10) // concrete true or false, not tainted!
     do_encrypt(data);

Basically i want to know not only about the bug you're trying to find, 
but more about what your users are and what quality requirements do you 
have.

If you're writing a tool for yourself (eg., for doing a security audit 
of a specific project), you can get away with a high false positive 
rate. If you're making a tool for automatic code review that'll point 
out potential security breaches to other developers as they write new 
code, you'll have to make sure your tool doesn't prevent the developers 
from easily writing the secure code that they need to write, so a high 
false positive rate is unacceptable, and you'll need to formulate 
precise rules in an as simple manner as possible instead of relying on 
an unpredictable emergent behavior. If you're really paranoid about 
security, you should go for a verification tool that has high false 
positive rate and zero false negatives. If you can make your own APIs, 
you should probably make safer APIs that are either taking care of the 
security issues on the type system level or generally make life easier 
for static analysis.

Also Static Analyzer is tweaked for finding very pinpointed bugs that 
can be proven by looking at a specific execution path without taking 
into account the surrounding code that didn't get executed on the 
current path. Your question seems to be focused on the difference in 
behavior between the situations in which the branch is taken or not, 
which is already too much of a global reasoning.

> @Kristof, I think ControlDependencyCalculator might do the trick. I do 
> not need to use a stack structure to track the blocks myself. Here's 
> what I might do:
> -in checkPreStmt(const CallExpr *CE, CheckerContext &C) , check if the 
> statement is a sensitive function call
> -get cfg from C->ExplodedNode()->getCFG, and create cdc = 
> ControlDependencyCalculator(cfg)
> -get dependent blocks from 
> cdc->getControlDependencies(C->ExplodedNode()->getCFGBlock())
> -for each returned block, check if the condition expr is tainted in 
> current state.

The condition expression is not an active expression at this point, so 
it doesn't have a value at all in the current state. You'll have to go 
back in time, to the moment of time where the condition was evaluated, 
in order to understand what its value was. Which is why your original 
approach was better.

You may be able to store branch conditions in the program state for 
later use in an Environment-like map, i.e. '(Expr *, LocationContext *) 
-> SVal', clean it up as location contexts are destroyed, and get them 
overwritten when looping around in a loop.

Or you can emit a bug on every sensitive function and attach a bug 
visitor to it that will suppress the report when it's unable to find the 
tainted dependency. This is probably the easiest way to implement this 
right now - not sure about performance though.

>
> If ControlDependencyCalculator can correctly calculate the dependence, 
> I think the above steps should work. I am not sure if the 
> getLastCondition()s return from dependency blocks overlaps, but it 
> will not affect the result.
>
> Gavin
>
>> On Sep 20, 2019, at 4:00 PM, Kristóf Umann <dkszelethus at gmail.com 
>> <mailto:dkszelethus at gmail.com>> wrote:
>>
>>
>>
>> On Fri, 20 Sep 2019 at 21:35, Artem Dergachev <noqnoqneo at gmail.com 
>> <mailto:noqnoqneo at gmail.com>> wrote:
>>
>>     @Gavin: I'm worried that you're choosing a wrong strategy here.
>>     Branches with tainted conditions can be used for sanitizing the
>>     input, but it sounds like you want to ban them rather than
>>     promote them. That said, i can't figure out what's the right
>>     solution for you unless i understand the original problem that
>>     you're trying to solve.
>>
>>     @Kristof: Do you think you can implement a
>>     checkBeginControlDependentSection /
>>     checkEndControlDependentSection callback pair on top of your
>>     control dependency tracking mechanisms, so that they behaved
>>     intuitively and always perfectly paired each other, even in the
>>     more complicated cases like for-loops and Duff's devices?
>>     (there's no indication so far that we really need them - scope
>>     contexts are much more valuable and might actually be helpful
>>     here as well - but i'm kinda curious).
>>
>>
>> I guess so. I'm seeing a couple things to keep track of (inlined 
>> function calls to name one), but nothing too bad.
>>
>> It raises (haha) a question about exceptions, if we ever end up 
>> supporting them, what happens if an exception is raised? Also, just 
>> came to my mind, should any block with a non-noexcept function call 
>> have an edge to the exit block if we take exceptions into account?
>>
>>     On 9/20/19 10:46 AM, Kristóf Umann via cfe-dev wrote:
>>>     + Artem because he knows everything about the analyzer and
>>>     symbolic execution, + Balázs because he is currently working on
>>>     TaintChecker.
>>>
>>>     My first instinct here would be to combine pathsensitive
>>>     analysis with control flow analysis. In the header file
>>>     clang/include/clang/Analysis/Analyses/Dominators.h you will find
>>>     the class ControlDependencyCalculator. You could calculate the
>>>     control dependencies of the block in which sensitive_func() is
>>>     called (you can retrieve that through the current ExplodedNode)
>>>     and find that the CFGBlock whose getLastCondition() is value <
>>>     xxx is in fact a control dependency. Then, you could, in theory,
>>>     check whether parts of this expression is tainted.
>>>
>>>     Artem, do you think this makes any sense?
>>>
>>>     On Fri, 20 Sep 2019 at 16:10, Gavin Cui via cfe-dev
>>>     <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>>>
>>>         Hello all,
>>>         I want to check if a tainted value can affect the control
>>>         flow of some sensitive functions. For example:
>>>
>>>         value = taint_source()
>>>         if (value < xxx) {
>>>                 sensitive_func()
>>>         }
>>>
>>>         The taint propagation in clang static analyzer fit part of
>>>         my need. One approach I can think of is:
>>>         Whenever I encounter a branch condition (register
>>>         checkBranchCondition() call back), I will push a tag(tainted
>>>         or not) to a taintStack variable in ProgramState.
>>>         After the branch block closed, I will pop one tag.
>>>         If sensitive_function() get encountered, I will check all
>>>         the tags in taintStack to see if any of them is tainted.
>>>
>>>         The problem is I did not find a callback like
>>>         checkBranchCondition() which will be called every time
>>>         exiting a branch block. Then what should be a good approach
>>>         for this control flow checking?
>>>
>>>         Any suggestions would be appreciated.
>>>
>>>         Thank you,
>>>         Gavin
>>>         _______________________________________________
>>>         cfe-dev mailing list
>>>         cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>>         https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>
>>>
>>>     _______________________________________________
>>>     cfe-dev mailing list
>>>     cfe-dev at lists.llvm.org  <mailto:cfe-dev at lists.llvm.org>
>>>     https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190920/de74fba6/attachment.html>