[cfe-dev] Clang GenericTaintChecker limitations

Divya Muthukumaran via cfe-dev cfe-dev at lists.llvm.org
Thu Aug 11 10:07:27 PDT 2016


Hi Artem,

I'm not sure what the protocol is for posting code here. I tried to
abstract the behaviour of a well-known in-memory key-value store into the
following program, so this may be too much code to post here. Let me know
if you want an even more abstract version. And again, thanks for your help!

/----------- test.c -------------/

#include<stdio.h>
#include<string.h>
#include<stdlib.h>

typedef struct _item {
    int key;
    char value;
    struct _item *next;
} item;

void taint_add(item ** it);

item * global_item_list  = NULL;

item *alloc_item(int key, char value) {
    item *new_item = malloc(sizeof(item));
    new_item->key = key;
    new_item->value = value;
    new_item->next = NULL;
    return new_item;
}


item *get_item(int key) {
    item *list = global_item_list;      /* (D) */
    while (list) {
        if (key == list->key)
            return list;
        list = list->next;
    }
    return NULL;
}

void put_item(item *new_item) {
    if (!global_item_list) {
        global_item_list = new_item;    /* (C) */
    }
    else {
        new_item->next = global_item_list;
        global_item_list = new_item;
    }
    return;
}

int dispatch_op(char *command_str) {
    const char s[2] = " ";
    char *token = strtok(command_str, s);

    if (token == NULL) {
        printf("[Error] No command entered. "
               "Please enter a valid command\n");
        return 1;
    }

    if (!strcmp(token, "SET")) {
        /* Grab the key */
        int key;
        token = strtok(NULL, s);
        if (token == NULL) {
            printf("[Error] Invalid number of arguments. "
                   "Please reenter command with correct params\n");
            return 1;
        } else {
            key = atoi(token);
            /* atoi() yields 0 for non-numeric input; negative keys are rejected too */
            if (key <= 0) {
                printf("[Error] Key must be greater than 0\n");
                return 1;
            }
        }
        /* Grab the value */
        char value;
        token = strtok(NULL, s);
        if (token == NULL) {
            printf("[Error] Invalid number of arguments. "
                   "Please reenter command with correct params\n");
            return 1;
        } else {
            value = *token;
        }
        item *it = alloc_item(key, value);
        taint_add(&it);                 /* (A) */
        put_item(it);                   /* (B) */
        printf("[Success] Item added.\n");
        return 1;
    } else if (!strcmp(token, "GET")) {
        /* Grab the key */
        int key;
        token = strtok(NULL, s);
        if (token == NULL) {
            printf("[Error] Invalid number of arguments. "
                   "Please reenter command with correct params\n");
            return 1;
        } else {
            key = atoi(token);
        }
        item *it = get_item(key);       /* (E) */
        if (it != NULL)
            printf("[Success] Item Found: Key %d, Value %c\n", it->key, it->value);
        else
            printf("[Failure] Item Not found.\n");
        return 1;
    } else if (!strcmp(token, "QUIT")) {
        printf("GoodBye.\n");
        return 0;
    }
    else {
        printf("[Error] Unknown command: Enter a valid command\n");
        return 1;
    }
}


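/* Commands are GET, SET, QUIT */
void process_command(void) {
    char *buffer = NULL;
    size_t len = 0;     /* getline() takes a size_t * for the buffer capacity */
    ssize_t read;       /* getline() returns ssize_t, -1 on error or EOF */

    do {
        do {
            printf("Please Enter a command: "
                   "GET key::int(>0) | SET key::int(>0) val::char | QUIT\n");
            read = getline(&buffer, &len, stdin);
            /* Strip the trailing newline only if a line was actually read */
            if (read != -1)
                buffer[strcspn(buffer, "\n")] = 0;
        } while (-1 == read);
    } while (dispatch_op(buffer));

    free(buffer);
    return;
}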


int main(int argc, char **argv) {
    // item *A = alloc_item(1, 'A');
    // item *B = alloc_item(2, 'B');
    // taint_add(&B);                                       // (P)
    // put_item(A);
    // put_item(B);
    // item *it = get_item(2);                              // (Q)
    // printf("Value of %d is %c\n", it->key, it->value);   // (R)

    process_command();
    return 0;
}



I run the analysis using the command:

    /llvm-install/bin/scan-build -k \
        -enable-checker alpha.security.taint.TaintPropagation \
        -enable-checker debug.TaintTest \
        clang `-cc1 -analyzer-max-loop=10` test.c

I have added the function taint_add() as one of the taint sources inside
GenericTaintChecker. I was expecting to see the taint flow along
A -> B -> C -> D -> E, as highlighted above. My reasoning is that
dispatch_op() multiplexes between SET (put_item) and GET (get_item)
requests, so taint introduced through a SET request can flow into the
global variable and from there into the result of a later GET request.
But this doesn't happen.

However, if I uncomment the highlighted commented-out lines in main, I can
see the taint flow from P -> Q -> R.
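
A more abstract version of the same flow could look like the sketch below.
It is only an illustration, not the program I analyzed: the names node,
put_node, get_node and handle are made up for this reduction, and
taint_add() is given an empty body here purely so that the file links and
runs (in test.c it is only declared and is registered as a taint source
inside the checker).

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int key;
    char value;
    struct node *next;
} node;

/* Plays the role of global_item_list in test.c. */
static node *global_list = NULL;

/* Hypothetical stand-in for the taint source. The empty body is an
 * assumption made only so that this sketch compiles and runs. */
void taint_add(node **n) { (void)n; }

/* (B)/(C): publish the node through the global list head. */
void put_node(node *n) {
    assert(n != NULL);
    n->next = global_list;
    global_list = n;
}

/* (D): read a node back through the global list head. */
node *get_node(int key) {
    for (node *cur = global_list; cur != NULL; cur = cur->next)
        if (cur->key == key)
            return cur;
    return NULL;
}

/* One call per "command", as in dispatch_op(): the store and the load
 * happen in different invocations, so the state of global_list has to be
 * carried from one call to the next to connect (A) with (E). */
char handle(int is_set, node *n, int key) {
    if (is_set) {
        taint_add(&n);              /* (A) */
        put_node(n);                /* (B) */
        return 0;
    }
    node *found = get_node(key);    /* (E) */
    return found ? found->value : 0;
}
```

A SET followed by a GET is then two separate calls, for example
handle(1, &b, 0) followed by handle(0, NULL, b.key), which mirrors how
process_command() calls dispatch_op() once per input line.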


On Thu, Aug 11, 2016 at 5:04 PM, Artem Dergachev <noqnoqneo at gmail.com>
wrote:

> On 8/11/16 3:42 PM, Gábor Horváth wrote:
>
>> Note that the analyzer does not reason about global variables right now.
>>
>
> @Gábor: Hmm, what do you mean? :o They're present in the Store and work
> like all other variables, they're just invalidated too often (on every
> unmodeled function call). If the variables are also const-qualified, then
> they shouldn't be invalidated, and should always resolve to their initial
> value (though i think there were some bugs there).
>
> @Divya: if you think that your own API functions themselves do unnecessary
> invalidation (rather than user-defined functions or library functions),
> then you have an option to `evalCall` them - that's a special checker
> callback in which you can take care of all modeling, but
>
>> And also note that there are no guarantees about the coverage. There might
>> be code that is not covered by the analysis at all.
>>
>
> @Gábor: Yeah, it might be that as well. The loop might have been too
> complex, and the analyzer didn't find the proper path through the loop
> (loops are currently inlined as well.
>
> @Divya: you may want to increase the `-cc1 -analyzer-max-loop=4` option to
> a higher value). In the worst case, i'd have had a look at the
> ExplodedGraph
> (http://clang-analyzer.llvm.org/checker_dev_manual.html#visualizing)
> to see what exactly is going on.
>
> It might also easily be something else, so if you can post some sample
> code, we'd probably make a better guess.
>
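
To make the invalidation point in the quoted reply concrete, here is a
minimal sketch of how I read it. The names shared, limit and opaque_call
are made up for illustration and are not part of test.c. opaque_call()
stands for any function whose body the analyzer cannot see; it is defined
in the same file here only so that the example links and runs, and with
the body visible like this the analyzer would presumably be able to look
inside it instead of treating the call as unmodeled.

```c
#include <assert.h>

int shared = 0;         /* ordinary global: a callee may change it */
const int limit = 10;   /* const-qualified global: keeps its initial value */

/* Hypothetical opaque function. Assume its body lives in another
 * translation unit, so a caller cannot know what it does to `shared`. */
void opaque_call(void) {
    shared = 42;
}

int read_after_call(void) {
    shared = 1;
    assert(shared == 1);    /* the value is known at this point */
    opaque_call();          /* after this call it can no longer be assumed */
    return shared;
}

int read_const_after_call(void) {
    opaque_call();
    return limit;           /* still the initial value, whatever the callee did */
}
```

At run time read_after_call() returns the value written by the callee, not
the 1 stored just before the call, which is why forgetting what is known
about a non-const global across an unknown call is the conservative choice.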