[cfe-dev] Adding taint sources to GenericTaintChecker

Ashwin Ganesh via cfe-dev cfe-dev at lists.llvm.org
Mon Apr 11 06:34:00 PDT 2016


Thanks for the explanation. The use case I had in mind was that readval is
a function loaded from a dynamic library, and it returns some value that is
critical (it may or may not depend on external input). Since the
programmer knows it is critical in some sense, they might want to track the
flow of the return value through taint propagation and dump all
instructions that access variables or memory locations depending on the
initial critical sources. Is there any way I can guarantee that those
initial return values are tainted?
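
To make the use case concrete, here is a toy sketch in plain C of the
propagation I have in mind. It is not analyzer code and does not use any
Clang API; TrackedInt, plain, readval_critical and tracked_add are names
made up purely for illustration, and the "dump" is just a printf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical toy value: an integer plus a taint flag. */
typedef struct {
    int  value;
    bool tainted;
} TrackedInt;

/* Wrap an ordinary constant: never tainted. */
static TrackedInt plain(int v) {
    TrackedInt t = { v, false };
    return t;
}

/* Stand-in for readval(): the value itself is unremarkable, but the
   programmer declares it critical, so it is tainted unconditionally. */
static TrackedInt readval_critical(void) {
    TrackedInt t = { 10, true };
    return t;
}

/* Addition propagates taint: the result is tainted if either operand is.
   Every operation that produces a tainted result is reported. */
static TrackedInt tracked_add(TrackedInt a, TrackedInt b, const char *where) {
    assert(where != NULL);
    TrackedInt r = { a.value + b.value, a.tainted || b.tainted };
    if (r.tainted)
        printf("tainted access: %s\n", where);
    return r;
}
```

With this model, tracked_add(readval_critical(), plain(1), "b = a + 1")
gives a tainted 11 and reports that one operation, while adding two plain
constants reports nothing. What I am asking is whether the checker can be
made to behave like readval_critical() for a chosen function.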

Regards,
Ashwin

> On Mon, Apr 11, 2016 at 6:02 PM, Artem Dergachev via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> > int readval()
>> > {
>> >   return 10;
>> > }
>> >
>> > int a,b;
>> > a = readval(); // warning: tainted
>> > b = a + 1;     // warning: tainted
>>
>> In your example, readval() returns 10. Our analysis is inter-procedural,
>> so it knows such things.
>>
>> 10 is a concrete value. A concrete value cannot be tainted: an attacker
>> cannot forge 10 to become 20, or anything like that. It's just "the" 10,
>> and all 10's are the same. Something is tainted if it is user input or is
>> otherwise known to be able to take completely arbitrary values; 10 is not
>> input from the user, and it is entirely under our control. So the analyzer
>> knows for sure that readval() returns a value that cannot be tainted, and
>> the message from the checker gets ignored. This shows up as the analyzer
>> being unable to obtain a symbol from the value provided by the checker,
>> because the value is concrete.
>>
>> In fact, only *symbols* may be "truly" tainted. To be exact, addTaint()
>> works with SymExpr's (SymbolRef's) and, additionally, SymbolicRegion's
>> (which are essentially regions pointed to by SymExpr pointers). isTainted()
>> works on SymExpr's, SymbolicRegion's and their sub-regions, and
>> additionally on SVal's of class nonloc::SymbolVal, loc::MemRegionVal,
>> nonloc::LocAsInteger whenever they contain a SymExpr or a SymbolicRegion or
>> its sub-region.
>>
>> If I replace your definition of readval() with an opaque forward
>> declaration, e.g.:
>>
>>   int readval();
>>   void foo() {
>>     int a = readval(); // warning: tainted
>>   }
>>
>> then everything works as expected.
>>
>> On the other hand, if the definition of readval() is truly available in
>> your translation unit, then you don't need to add *it* to
>> GenericTaintChecker - instead, add whatever readval() calls to obtain the
>> user input, and the analyzer would model readval() itself and pass the
>> symbol down to the caller.