[cfe-dev] Clang Static Analyzer conditional terminating call back
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Mon Sep 23 13:41:05 PDT 2019
On 9/22/19 10:43 AM, Gavin Cui wrote:
> Hi Artem,
>
> I am writing a tool for myself. The purpose of this tool is to help me
> prove that such vulnerabilities may exist in some applications (under a
> different threat model, some originally trusted values become
> untrusted/tainted). I will manually look at the reports anyway, so a
> high false-positive rate is acceptable.
Ok! Yeah, that should get you something, at least :)
> Thank you for pointing out those cases. I do not have a good solution
> for them yet. Maybe taint analysis with symbolic execution is not the
> best approach for my problem. But for the current stage, I just want
> to have a tool to list some potentially vulnerable code so that I can
> hopefully detect at least one true vulnerability.
>
> You are right, I cannot get the value of an expression once it leaves
> the current context. But how can I intercept the moment a location
> context gets destroyed?
For now that's basically checkEndFunction. Maybe we'll add more location
contexts in the future, with more fine-grained callbacks for them.
>
> Gavin
>
>> On Sep 20, 2019, at 5:34 PM, Artem Dergachev <noqnoqneo at gmail.com
>> <mailto:noqnoqneo at gmail.com>> wrote:
>>
>>
>>
>> On 9/20/19 1:59 PM, Gavin Cui wrote:
>>> Thanks for the help,
>>> @Artem, I think the taint propagation is necessary for my problem. I
>>> want to analyze whether an untrusted input can somehow affect the
>>> control flow of some sensitive function (a tainted source determines
>>> whether a sensitive function gets executed or not). The untrusted
>>> input can taint other variables and eventually taint the branch
>>> condition expression. It still needs to be path-sensitive. For example:
>>>
>>> config_from_file = parse_config_file() // taint source
>>> /* the tainted value may infect other variables (should_enc) in some
>>>    paths */
>>> if (use_default) {
>>>     config = default_config // in this path, taint does not flow to
>>>                             // the condition expr
>>> }
>>> else {
>>>     config = config_from_file // taint flows to config
>>> }
>>> should_enc = (config.secure_level > 10) // taint flows to should_enc
>>> if (should_enc) { // branch is tainted in one path
>>>     do_encrypt(data) // the execution of the sensitive function is
>>>                      // affected by the taint source in one path
>>> }
>>> else { // this block is also tainted if !use_default
>>>     ...
>>> } // after exiting the block, everything should be fine
>>> other_sensitive_func(); // not affected by the taint source in either path
>>
>> What about the following test cases?
>>
>> // (1):
>>
>> if (config.secure_level > 10) // not a control flow dependency of
>>                               // the sensitive call!
>>     should_enc = true;  // concrete value, not tainted!
>> else
>>     should_enc = false; // concrete value, not tainted!
>> if (should_enc) // concrete true or false, not tainted!
>>     do_encrypt(data);
>>
>> // (2):
>>
>> if (config.secure_level > 10)
>>     do_encrypt(data);
>> else
>>     do_encrypt(data); // encryption is done on both branches anyway!
>>
>> // (3):
>>
>> if (config.secure_level > 10) // tainted symbol collapsed to a
>>                               // constant!
>>     do_unrelated_stuff();
>> if (config.secure_level > 10) // concrete true or false, not tainted!
>>     do_encrypt(data);
>>
>> Basically i want to know not only about the bug you're trying to
>> find, but also who your users are and what quality requirements
>> you have.
>>
>> If you're writing a tool for yourself (eg., for doing a security
>> audit of a specific project), you can get away with a high false
>> positive rate. If you're making a tool for automatic code review
>> that'll point out potential security breaches to other developers as
>> they write new code, you'll have to make sure your tool doesn't
>> prevent the developers from easily writing the secure code that they
>> need to write, so a high false positive rate is unacceptable, and
>> you'll need to formulate precise rules in as simple a manner as
>> possible instead of relying on an unpredictable emergent behavior. If
>> you're really paranoid about security, you should go for a
>> verification tool that has a high false positive rate and zero false
>> negatives. If you can make your own APIs, you should probably make
>> safer APIs that are either taking care of the security issues on the
>> type system level or generally make life easier for static analysis.
>>
>> Also, the Static Analyzer is tuned for finding very pinpointed bugs
>> that can be proven by looking at a specific execution path without
>> taking into account the surrounding code that didn't get executed on
>> the current path. Your question seems to be focused on the difference
>> in behavior between the situations in which the branch is taken or
>> not, which already requires too much global reasoning.
>>
>>> @Kristof, I think ControlDependencyCalculator might do the trick. I
>>> do not need to use a stack structure to track the blocks myself.
>>> Here's what I might do:
>>> - In checkPreStmt(const CallExpr *CE, CheckerContext &C), check if
>>>   the statement is a sensitive function call.
>>> - Get cfg from C->ExplodedNode()->getCFG, and create cdc =
>>>   ControlDependencyCalculator(cfg).
>>> - Get dependent blocks from
>>>   cdc->getControlDependencies(C->ExplodedNode()->getCFGBlock()).
>>> - For each returned block, check if the condition expr is tainted in
>>>   the current state.
>>
>> The condition expression is not an active expression at this point,
>> so it doesn't have a value at all in the current state. You'll have
>> to go back in time, to the moment when the condition was evaluated,
>> in order to understand what its value was. Which is why your
>> original approach was better.
>>
>> You may be able to store branch conditions in the program state for
>> later use in an Environment-like map, i.e. '(Expr *, LocationContext
>> *) -> SVal', clean it up as location contexts are destroyed, and let
>> entries be overwritten when execution goes around a loop.
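
A toy model of such an Environment-like map, in plain C++ rather than against the analyzer's headers. Everything here is an assumption made for illustration: ExprId and ContextId stand in for 'Expr *' and 'LocationContext *', RecordedValue stands in for an SVal, and BranchConditionMap with its methods is a hypothetical name. A real checker would presumably keep this in the immutable program state (e.g. a map registered with REGISTER_MAP_WITH_PROGRAMSTATE) instead of a mutable object.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical stand-ins for 'Expr *' and 'LocationContext *'; only identity matters.
using ExprId = int;
using ContextId = int;

// Hypothetical stand-in for an SVal: we only remember whether it was tainted.
struct RecordedValue {
  bool tainted;
};

class BranchConditionMap {
public:
  // Record a condition value; a later evaluation of the same condition in
  // the same context (e.g. the next loop iteration) overwrites the entry.
  void record(ExprId cond, ContextId lc, RecordedValue v) {
    values_[{cond, lc}] = v;
  }

  // Returns nullptr if the condition was never evaluated in this context.
  const RecordedValue *lookup(ExprId cond, ContextId lc) const {
    auto it = values_.find({cond, lc});
    return it == values_.end() ? nullptr : &it->second;
  }

  // Drop every entry that belongs to a context that is being destroyed.
  // Returns the number of entries removed.
  std::size_t eraseContext(ContextId lc) {
    std::size_t erased = 0;
    for (auto it = values_.begin(); it != values_.end();) {
      if (it->first.second == lc) {
        it = values_.erase(it);
        ++erased;
      } else {
        ++it;
      }
    }
    return erased;
  }

private:
  std::map<std::pair<ExprId, ContextId>, RecordedValue> values_;
};
```

In this model, record() would be driven from the branch-condition callback and eraseContext() from whatever callback signals that a location context is going away.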
>>
>> Or you can emit a bug on every sensitive function and attach a bug
>> visitor to it that will suppress the report when it's unable to find
>> the tainted dependency. This is probably the easiest way to implement
>> this right now - not sure about performance though.
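
The "report everything, then suppress" idea can be modelled as a backwards walk over one execution path. This is a sketch in plain C++, not analyzer code: PathNode is a hypothetical stand-in for a node of the exploded graph, and its two flags are assumed to have been computed elsewhere. In the analyzer the walk would presumably live in a BugReporterVisitor that marks the report invalid when nothing is found.

```cpp
#include <cassert>

// Hypothetical stand-in for a node on one execution path.
struct PathNode {
  const PathNode *pred;      // previous node on the path, nullptr at the start
  bool isControllingBranch;  // the sensitive call is control dependent on it
  bool conditionWasTainted;  // taint of the condition when it was evaluated
};

// Walks the path backwards from the node of the sensitive call.
// Returns true if the report should be kept, false if it should be
// suppressed because no tainted controlling branch was found.
bool keepReport(const PathNode *sensitiveCall) {
  assert(sensitiveCall && "a report always has an error node");
  for (const PathNode *n = sensitiveCall->pred; n; n = n->pred)
    if (n->isControllingBranch && n->conditionWasTainted)
      return true;
  return false;
}
```

The walk is linear in the path length for every report, which is where the performance concern comes from.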
>>
>>>
>>> If ControlDependencyCalculator can correctly calculate the
>>> dependence, I think the above steps should work. I am not sure whether
>>> the getLastCondition() results returned from the dependency blocks
>>> overlap, but that will not affect the result.
>>>
>>> Gavin
>>>
>>>> On Sep 20, 2019, at 4:00 PM, Kristóf Umann <dkszelethus at gmail.com
>>>> <mailto:dkszelethus at gmail.com>> wrote:
>>>>
>>>>
>>>>
>>>> On Fri, 20 Sep 2019 at 21:35, Artem Dergachev <noqnoqneo at gmail.com
>>>> <mailto:noqnoqneo at gmail.com>> wrote:
>>>>
>>>> @Gavin: I'm worried that you're choosing the wrong strategy here.
>>>> Branches with tainted conditions can be used for sanitizing the
>>>> input, but it sounds like you want to ban them rather than
>>>> promote them. That said, i can't figure out what's the right
>>>> solution for you unless i understand the original problem that
>>>> you're trying to solve.
>>>>
>>>> @Kristof: Do you think you can implement a
>>>> checkBeginControlDependentSection /
>>>> checkEndControlDependentSection callback pair on top of your
>>>> control dependency tracking mechanisms, so that they behave
>>>> intuitively and are always perfectly paired, even in the
>>>> more complicated cases like for-loops and Duff's devices?
>>>> (there's no indication so far that we really need them - scope
>>>> contexts are much more valuable and might actually be helpful
>>>> here as well - but i'm kinda curious).
>>>>
>>>>
>>>> I guess so. I'm seeing a couple things to keep track of (inlined
>>>> function calls to name one), but nothing too bad.
>>>>
>>>> It raises (haha) a question about exceptions: if we ever end up
>>>> supporting them, what happens when an exception is raised? Also, it
>>>> just came to my mind: should any block with a non-noexcept function
>>>> call have an edge to the exit block if we take exceptions into account?
>>>>
>>>> On 9/20/19 10:46 AM, Kristóf Umann via cfe-dev wrote:
>>>>> + Artem because he knows everything about the analyzer and
>>>>> symbolic execution, + Balázs because he is currently working
>>>>> on TaintChecker.
>>>>>
>>>>> My first instinct here would be to combine path-sensitive
>>>>> analysis with control flow analysis. In the header file
>>>>> clang/include/clang/Analysis/Analyses/Dominators.h you will
>>>>> find the class ControlDependencyCalculator. You could
>>>>> calculate the control dependencies of the block in which
>>>>> sensitive_func() is called (you can retrieve that through the
>>>>> current ExplodedNode) and find that the CFGBlock whose
>>>>> getLastCondition() is value < xxx is in fact a control
>>>>> dependency. Then, you could, in theory, check whether parts of
>>>>> this expression are tainted.
>>>>>
>>>>> Artem, do you think this makes any sense?
>>>>>
>>>>> On Fri, 20 Sep 2019 at 16:10, Gavin Cui via cfe-dev
>>>>> <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>>>>>
>>>>> Hello all,
>>>>> I want to check if a tainted value can affect the control
>>>>> flow of some sensitive functions. For example:
>>>>>
>>>>> value = taint_source()
>>>>> if (value < xxx) {
>>>>>     sensitive_func()
>>>>> }
>>>>>
>>>>> The taint propagation in the Clang Static Analyzer fits part of
>>>>> my needs. One approach I can think of is:
>>>>> Whenever I encounter a branch condition (by registering a
>>>>> checkBranchCondition() callback), I will push a
>>>>> tag (tainted or not) onto a taintStack variable in ProgramState.
>>>>> After the branch block is closed, I will pop one tag.
>>>>> If sensitive_func() is encountered, I will check all
>>>>> the tags in taintStack to see if any of them is tainted.
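
The push/pop bookkeeping described above can be modelled in a few lines of plain C++. This is only a toy sketch: TaintStack, enterBranch(), leaveBranch() and anyEnclosingBranchTainted() are hypothetical names, not analyzer APIs, and a real checker would have to keep the stack in the immutable ProgramState rather than in a mutable object.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model of the proposed taintStack; none of these names are Clang APIs.
class TaintStack {
public:
  // Called when a branch condition is evaluated (cf. checkBranchCondition()).
  void enterBranch(bool conditionIsTainted) {
    tags_.push_back(conditionIsTainted);
  }

  // Called when the branch block closes. This is the event for which no
  // callback was found, so it is purely hypothetical here.
  void leaveBranch() {
    assert(!tags_.empty() && "leaveBranch() without matching enterBranch()");
    tags_.pop_back();
  }

  // Called at a sensitive call: is any enclosing branch condition tainted?
  bool anyEnclosingBranchTainted() const {
    return std::any_of(tags_.begin(), tags_.end(),
                       [](bool tainted) { return tainted; });
  }

private:
  std::vector<bool> tags_;
};
```

For the example above, evaluating 'value < xxx' would push a tainted tag, so the query at sensitive_func() returns true; after the matching leaveBranch() it returns false again.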
>>>>>
>>>>> The problem is that I did not find a callback like
>>>>> checkBranchCondition() that is called every time a branch
>>>>> block is exited. What would be a good
>>>>> approach for this control flow checking?
>>>>>
>>>>> Any suggestions would be appreciated.
>>>>>
>>>>> Thank you,
>>>>> Gavin
>>>>> _______________________________________________
>>>>> cfe-dev mailing list
>>>>> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>
>>>>>
>>>>
>>>
>>
>