[cfe-dev] Clang Static Analyzer conditional terminating call back

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Mon Sep 23 13:41:05 PDT 2019



On 9/22/19 10:43 AM, Gavin Cui wrote:
> Hi Artem,
>
> I am writing a tool for myself. The propose of this tool is to help me 
> prove such vulnerabilities may exist in some applications. (with a 
> different threat model, some originally trusted value become 
> untrusted/tainted) I will manually look at the report anyways, so a 
> high false-positive rate is acceptable.

Ok! Yeah, that should get you something, at least :)

> Thank you for pointing out those cases. I do not have a good solution 
> for them yet. Maybe taint analysis with symbolic execution is not the 
> best approach for my problem. But for the current stage, I just want 
> to have a tool to list some potentially vulnerable code so that I can 
> hopefully detect at least one true vulnerability.
>
> You are right, I can not get the value of expression when it leaves 
> the current context. But how can I intercept the moment a location 
> contexts get destroyed?

For now that's basically checkEndFunction. Maybe we'll add more location 
contexts in the future, with more fine-grained callbacks for them.

>
> Gavin
>
>> On Sep 20, 2019, at 5:34 PM, Artem Dergachev <noqnoqneo at gmail.com 
>> <mailto:noqnoqneo at gmail.com>> wrote:
>>
>>
>>
>> On 9/20/19 1:59 PM, Gavin Cui wrote:
>>> Thanks for the help,
>>> @Artem, I think the taint propagation is necessary for my problem. I 
>>> want to analyze if an untrust input can somehow affect the control 
>>> flow of some sensitive function (tainted source determine whether a 
>>> sensitive function get executed or not). The untrusted input can 
>>> taint other variables and eventually taint the branch condition 
>>> expression. It still needs to be path sensitive. For example:
>>>
>>> config_from_file = parse_config_file() // taint source
>>> /* the tainted value may infect other variables (should_enc) in some 
>>> paths*/
>>> if (use_default) {
>>> config = default_config // in this path, taint does not flow to 
>>> condition expr
>>> }
>>> else {
>>> config = config_from_file // taint flow to config
>>> }
>>> should_enc = (config.secure_level > 10) // taint flow to should_enc
>>> if (should_enc) { // branch is tainted in one path
>>> do_encrypt(data) // the execution of sensitive function is affected 
>>> by taint source in one path.
>>> }
>>> else {  // this block is also tainted if use_default
>>> ...
>>> }  // after exiting the block, everything should be fine.
>>> other_sensitive_func(); // not affected by taint source in both paths
>>
>> What about the following test cases?
>>
>> // (1):
>>
>>   if (config.secure_level > 10) // not a control flow dependency of 
>> the sensitive call!
>>     should_enc = true; // concrete value, not tainted!
>>   else
>>     should_enc = false; // concrete value, not tainted!
>>   if (should_enc) // concrete true or false, not tainted!
>>     do_encrypt(data);
>>
>> // (2):
>>
>>   if (config.secure_level > 10)
>>     do_encrypt(data);
>>   else
>>     do_encrypt(data); // encryption is done on both branches anyway!
>>
>> // (3):
>>
>>   if (config.secure_level > 10) // tainted symbol collapsed to a 
>> constant!
>>     do_unrelated_stuff();
>>   if (config.secure_level > 10) // concrete true or false, not tainted!
>>     do_encrypt(data);
>>
>> Basically i want to know not only about the bug you're trying to 
>> find, but more about what your users are and what quality 
>> requirements do you have.
>>
>> If you're writing a tool for yourself (eg., for doing a security 
>> audit of a specific project), you can get away with a high false 
>> positive rate. If you're making a tool for automatic code review 
>> that'll point out potential security breaches to other developers as 
>> they write new code, you'll have to make sure your tool doesn't 
>> prevent the developers from easily writing the secure code that they 
>> need to write, so a high false positive rate is unacceptable, and 
>> you'll need to formulate precise rules in an as simple manner as 
>> possible instead of relying on an unpredictable emergent behavior. If 
>> you're really paranoid about security, you should go for a 
>> verification tool that has high false positive rate and zero false 
>> negatives. If you can make your own APIs, you should probably make 
>> safer APIs that are either taking care of the security issues on the 
>> type system level or generally make life easier for static analysis.
>>
>> Also Static Analyzer is tweaked for finding very pinpointed bugs that 
>> can be proven by looking at a specific execution path without taking 
>> into account the surrounding code that didn't get executed on the 
>> current path. Your question seems to be focused on the difference in 
>> behavior between the situations in which the branch is taken or not, 
>> which is already too much of a global reasoning.
>>
>>> @Kristof, I think ControlDependencyCalculator might do the trick. I 
>>> do not need to use a stack structure to track the blocks myself. 
>>> Here's what I might do:
>>> -in checkPreStmt(const CallExpr *CE, CheckerContext &C) , check if 
>>> the statement is a sensitive function call
>>> -get cfg from C->ExplodedNode()->getCFG, and create cdc = 
>>> ControlDependencyCalculator(cfg)
>>> -get dependent blocks from 
>>> cdc->getControlDependencies(C->ExplodedNode()->getCFGBlock())
>>> -for each returned block, check if the condition expr is tainted in 
>>> current state.
>>
>> The condition expression is not an active expression at this point, 
>> so it doesn't have a value at all in the current state. You'll have 
>> to go back in time, to the moment of time where the condition was 
>> evaluated, in order to understand what its value was. Which is why 
>> your original approach was better.
>>
>> You may be able to store branch conditions in the program state for 
>> later use in an Environment-like map, i.e. '(Expr *, LocationContext 
>> *) -> SVal', clean it up as location contexts are destroyed, and get 
>> them overwritten when looping around in a loop.
>>
>> Or you can emit a bug on every sensitive function and attach a bug 
>> visitor to it that will suppress the report when it's unable to find 
>> the tainted dependency. This is probably the easiest way to implement 
>> this right now - not sure about performance though.
>>
>>>
>>> If ControlDependencyCalculator can correctly calculate the 
>>> dependence, I think the above steps should work. I am not sure if 
>>> the getLastCondition()s return from dependency blocks overlaps, but 
>>> it will not affect the result.
>>>
>>> Gavin
>>>
>>>> On Sep 20, 2019, at 4:00 PM, Kristóf Umann <dkszelethus at gmail.com 
>>>> <mailto:dkszelethus at gmail.com>> wrote:
>>>>
>>>>
>>>>
>>>> On Fri, 20 Sep 2019 at 21:35, Artem Dergachev <noqnoqneo at gmail.com 
>>>> <mailto:noqnoqneo at gmail.com>> wrote:
>>>>
>>>>     @Gavin: I'm worried that you're choosing a wrong strategy here.
>>>>     Branches with tainted conditions can be used for sanitizing the
>>>>     input, but it sounds like you want to ban them rather than
>>>>     promote them. That said, i can't figure out what's the right
>>>>     solution for you unless i understand the original problem that
>>>>     you're trying to solve.
>>>>
>>>>     @Kristof: Do you think you can implement a
>>>>     checkBeginControlDependentSection /
>>>>     checkEndControlDependentSection callback pair on top of your
>>>>     control dependency tracking mechanisms, so that they behaved
>>>>     intuitively and always perfectly paired each other, even in the
>>>>     more complicated cases like for-loops and Duff's devices?
>>>>     (there's no indication so far that we really need them - scope
>>>>     contexts are much more valuable and might actually be helpful
>>>>     here as well - but i'm kinda curious).
>>>>
>>>>
>>>> I guess so. I'm seeing a couple things to keep track of (inlined 
>>>> function calls to name one), but nothing too bad.
>>>>
>>>> It raises (haha) a question about exceptions, if we ever end up 
>>>> supporting them, what happens if an exception is raised? Also, just 
>>>> came to my mind, should any block with a non-noexcept function call 
>>>> have an edge to the exit block if we take exceptions into account?
>>>>
>>>>     On 9/20/19 10:46 AM, Kristóf Umann via cfe-dev wrote:
>>>>>     + Artem because he knows everything about the analyzer and
>>>>>     symbolic execution, + Balázs because he is currently working
>>>>>     on TaintChecker.
>>>>>
>>>>>     My first instinct here would be to combine pathsensitive
>>>>>     analysis with control flow analysis. In the header file
>>>>>     clang/include/clang/Analysis/Analyses/Dominators.h you will
>>>>>     find the class ControlDependencyCalculator. You could
>>>>>     calculate the control dependencies of the block in which
>>>>>     sensitive_func() is called (you can retrieve that through the
>>>>>     current ExplodedNode) and find that the CFGBlock whose
>>>>>     getLastCondition() is value < xxx is in fact a control
>>>>>     dependency. Then, you could, in theory, check whether parts of
>>>>>     this expression is tainted.
>>>>>
>>>>>     Artem, do you think this makes any sense?
>>>>>
>>>>>     On Fri, 20 Sep 2019 at 16:10, Gavin Cui via cfe-dev
>>>>>     <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>>>>>
>>>>>         Hello all,
>>>>>         I want to check if a tainted value can affect the control
>>>>>         flow of some sensitive functions. For example:
>>>>>
>>>>>         value = taint_source()
>>>>>         if (value < xxx) {
>>>>>                 sensitive_func()
>>>>>         }
>>>>>
>>>>>         The taint propagation in clang static analyzer fit part of
>>>>>         my need. One approach I can think of is:
>>>>>         Whenever I encounter a branch condition (register
>>>>>         checkBranchCondition() call back), I will push a
>>>>>         tag(tainted or not) to a taintStack variable in ProgramState.
>>>>>         After the branch block closed, I will pop one tag.
>>>>>         If sensitive_function() get encountered, I will check all
>>>>>         the tags in taintStack to see if any of them is tainted.
>>>>>
>>>>>         The problem is I did not find a callback like
>>>>>         checkBranchCondition() which will be called every time
>>>>>         exiting a branch block.  Then what should be a good
>>>>>         approach for this control flow checking?
>>>>>
>>>>>         Any suggestions would be appreciated.
>>>>>
>>>>>         Thank you,
>>>>>         Gavin
>>>>>         _______________________________________________
>>>>>         cfe-dev mailing list
>>>>>         cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>>>>         https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>
>>>>>
>>>>>     _______________________________________________
>>>>>     cfe-dev mailing list
>>>>>     cfe-dev at lists.llvm.org  <mailto:cfe-dev at lists.llvm.org>
>>>>>     https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190923/47a5e900/attachment.html>


More information about the cfe-dev mailing list