[cfe-dev] [analyzer] Constrain the size of unknown memory regions

Balázs Benics via cfe-dev cfe-dev at lists.llvm.org
Thu Mar 12 05:20:34 PDT 2020


Hi, checker devs

TLDR:
How to constrain the size of unknown memory regions, eg. pointed by 'raw'
char pointers?

longer version:
Working on taint analysis I'm facing with the following problem:

void strncpy_bounded_tainted_buffer(char *src, char *dst) {
>   // assert(strlen(src) >= 10 && "src must have at leas 10 elements");
>   int n;
>   scanf("%d", &n);
>   if (0 < n && n < 10) {
>     strncpy(dst, src, n); // Should we warn or not?
>   }
> }
>


In this example we analyze a function taking two raw pointers, so we don't
know how big those arrays are.
We will check the `strncpy` call, whether it will access *(read and write)*
only valid memory.
We will check the pointers (src and dst) separately by checking if *`*&src[0]`
and `&src[n-1]` would be in bound of the memory region pointed by the
pointer. Since the analyzer don't know (both states are non-null), we
should check if the `length` parameter is tainted, and if so, we should
still issue a warning telling that "String copy function might overflow the
given buffer since untrusted data is used to specify the buffer size."
Obviously, if the `length` parameter is not tainted, we will assume
(conservatively) that the access would be valid.


How should tell the analyzer that the array which is pointed by the pointer
holds at least/most N elements?
For example in the code above, express something similar via an assertion,
like saying that `src` points to a c-string, which has at least 10 + 1
element underlying storage.
Although this assertion using `strlen` seems like a solution, unfortunately
not applicable for example to the `dst` buffer, which is most likely
uninitialized - so not a c-string, in other words calling `strlen` would be
undefined behavior.

The only (hacky) option which came in my mind was to abuse the standard
regarding pointer arithmetic.

> assert(&src[n] - &src[-1]);
>
The standard is clear about that pointer arithmetic is only applicable for
pointers pointing to elements of the same array OR to a hypothetical ONE
past/before element of that array.
http://eel.is/c++draft/expr.add#4.2

This assertion would be undefined behavior if the size of the array pointed
by `src` would be smaller than `n`.

IMO this looks really ugly.
I think that no '*annotations*' should introduce UB even if that assumption
expressed via an annotation is turned out to be *invalid*.


What would be the right approach to guide (to constrain the size of a
memory region) the analyzer?
How can the analyzer inference such constraint?

Thanks Balazs.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200312/5333751d/attachment-0001.html>


More information about the cfe-dev mailing list