[cfe-dev] Buffer bounds checking for C in static analysis

Chris Hacking chacking at cs.washington.edu
Tue Jan 19 04:06:26 PST 2010

Hi Zhongxing,

Thanks for the quick response. If I’m understanding you correctly, here are a couple of cases that the bounds checker would miss:
	void setfive (int* x) { x[5]=5; }
	int a[4]; setfive(a);
(due to lack of inter-procedural analysis), and
	int b[5]; int x;
	for (x=0; x<=5; x++) { b[x]=x; }
(due to lack of loop analysis).
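
To make the loop case concrete, here is a rough sketch. It assumes, as the quoted reply below describes, that only the first few iterations of a loop are explored. The function name oob_seen_within and its parameters are hypothetical and exist only for this illustration; this is not how the analyzer is implemented.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: walk the loop `for (x = 0; x <= last; x++) b[x] = x;`
 * for at most `max_iterations` iterations and report whether any visited
 * index falls outside an array of `extent` elements. */
static bool oob_seen_within(size_t extent, size_t last, size_t max_iterations)
{
    for (size_t x = 0; x <= last && x < max_iterations; x++) {
        if (x >= extent)
            return true;   /* this iteration would write past the end */
    }
    return false;          /* exploration stopped before reaching the bug */
}
```

For the b[5] example, oob_seen_within(5, 5, 3) is false because exploration stops after three iterations, while oob_seen_within(5, 5, 6) is true because the sixth iteration writes b[5].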

If I’ve misunderstood, please let me know. Considering the problem briefly, a few other areas that are potentially problematic are alias analysis:
	int c[5]; int* y = c; y[5]=50;
pointer arithmetic:
	char s[10]; char* t = s; t++; t[9] = '\0';
dynamically sized arrays:
	int len = atoi("5"); char* str = malloc(len); str[len] = '\0';
although all could be interesting extensions to current checks.

While the exact implementation for handling the string manipulation functions would require some thought (and examination of the current analysis tool), they do seem important to test. The most dangerous buffer overflows typically seem to involve them in some way. To extend that idea somewhat, the string.h functions aren’t the only ones that can trigger buffer overflows; in particular many system calls write into buffers:
	char e[255];
	int s = socket(<blah>);
	connect(s, <blah>);
	recv(s, e, 256, 0);

Anyhow, it sounds like there’s just enough groundwork here that we’d be able to make very meaningful contributions within the time frame available. I’m curious whether your comment on inter-procedural analysis means it’s not currently implemented in the Clang analyzer at all, or that it’s limited to specific checks right now. It seems like it would be near-critical to some checks.

In any case, thanks again. We need to create a specific project proposal (in the next few days), which is why I’m asking for more specifics on what is or is not checked currently. If there’s any documentation you could point me at that would help too. I looked at the source file but it doesn’t describe any limitations of the environment in which it operates.

Chris Hacking

From: Zhongxing Xu [mailto:xuzhongxing at gmail.com] 
Sent: Monday, 18 January 2010 11:35 PM
To: Chris Hacking
Cc: cfe-dev at cs.uiuc.edu
Subject: Re: [cfe-dev] Buffer bounds checking for C in static analysis

Hi Chris,

The current clang static analyzer only has a basic array bounds checker (ArrayBoundChecker.cpp). Its function is very simple: whenever a location is visited, if it's an array element, compare its index with the size of the array. Some errors it can detect:

int a[4];
a[4] = 3;

int *p = malloc(12);
p[3] = 3;

That is, the current analyzer has the basic infrastructure for detecting out-of-bound array access.
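
The comparison described above can be sketched in plain C as a check of the accessed index against the number of elements that fit in the underlying region. This is only an illustration of the idea, not the code in ArrayBoundChecker.cpp; the name index_in_bounds and its parameters are made up for this sketch, and the malloc(12) case assumes a 4-byte int.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if element `index` lies fully inside a region
 * of `region_bytes` bytes whose elements are `elem_size` bytes each. */
static bool index_in_bounds(size_t index, size_t region_bytes, size_t elem_size)
{
    if (elem_size == 0)
        return false;
    /* Number of whole elements that fit in the region. */
    size_t count = region_bytes / elem_size;
    return index < count;
}
```

With these definitions, index_in_bounds(4, 4 * sizeof(int), sizeof(int)) is false (the a[4] case), and index_in_bounds(3, 12, 4) is false (the p[3] case with a 4-byte int), while index_in_bounds(2, 12, 4) is true.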

The complexity of bounds checking arises from two facts (there may be others):

a. Many out-of-bound accesses occur in loops. Depending on the length of the buffer, the loop may have to run many times before triggering the out-of-bound access, but currently we only unroll loops 2 or 3 times.

b. Several string manipulation functions, such as strcpy() and strcat(), are not handled at present. One simple approach is to provide a fake implementation of each, inline it at the call site, and treat it as a normal loop. Another is to create linear constraints over the arguments and solve them, but that adds complexity: we would need to model both the actual length of the string and the size of the array.
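
The two approaches in (b) might look roughly like the following for strcpy(). Both functions are hypothetical sketches written for this illustration, not existing analyzer code: model_strcpy is a stand-in body that could be inlined and treated as an ordinary loop, and strcpy_fits states the constraint that the string length plus its terminator must not exceed the destination size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Approach 1 (hypothetical model): strcpy written as an ordinary loop, so
 * an analyzer that inlines it sees one array write per iteration. */
static char *model_strcpy(char *dst, const char *src)
{
    char *d = dst;
    while ((*d++ = *src++) != '\0') {
        /* each iteration writes one byte of dst */
    }
    return dst;
}

/* Approach 2 (hypothetical constraint): the copy is safe only if the
 * string length plus its terminating NUL fits in the destination. */
static bool strcpy_fits(size_t dst_size, const char *src)
{
    return strlen(src) + 1 <= dst_size;
}
```

For a destination declared as char buf[6], strcpy_fits(sizeof buf, "hello") holds, while strcpy_fits(sizeof buf, "hello!") does not, since the seventh byte (the terminator) would land outside the array.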

Other possible projects include integer overflow checking, which has not been implemented in clang at all.
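
As a small illustration of what an integer overflow check has to establish, the sketch below tests whether a signed addition would overflow without performing it. The helper name add_would_overflow is hypothetical; it only shows the kind of condition involved, not a proposed checker design.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true if a + b would overflow a signed int.
 * The test itself never performs the overflowing addition. */
static bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return false;
}
```

For example, add_would_overflow(INT_MAX, 1) and add_would_overflow(INT_MIN, -1) are both true, while add_would_overflow(1, 2) is false.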

Also, inter-procedural analysis is required to enhance all the other checks.

2010/1/19 Chris Hacking <chacking at cs.washington.edu>
Hi all,

I'm a student (part of a group of 3) looking for a project involving
software engineering tools, preferably static analysis. I do a lot of work
in C and have long felt that a static analysis tool for bounds checking on
memory buffers (arrays/strings) would be very helpful. I saw that this was a
requested feature for the Clang analysis tool, but there's very little info
and it's apparently not fully developed yet. Therefore I have two questions:

What is the state of bounds checking for C in the Clang analyzer, in terms
of how far it has gotten and how much work is progressing on it?

Is there another static analysis area that the Clang static analyzer needs
implemented that would be a reasonable project for a few CS grad students?

Chris Hacking

There's no place I can be,
Since I found Serenity.
But you can't take the sky from me.
