[cfe-dev] [analyzer] Constrain the size of unknown memory regions
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Sun Mar 15 19:08:29 PDT 2020
I think you're looking for SymbolExtent. It is *the* symbol that denotes
the [otherwise completely] unknown size of a region. The helper function
for obtaining either a known extent or a SymbolExtent is currently known
as getDynamicSize() (but I'd rather rename it back to "extent"; see
also https://reviews.llvm.org/D69726).
ArrayBoundChecker already has the code that you're looking for.
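
To make the idea concrete, here is a rough, self-contained toy model. It is
not analyzer code and uses none of the analyzer's APIs: Range,
unknownExtent, assumeAtLeast, classifyAccess and shouldWarn are all
hypothetical names made up for illustration. It assumes the extent of a
region can be treated as a symbol with an inclusive range constraint, and
it mirrors the decision described in the quoted message below: warn when
the access cannot be proven in bounds and the length is tainted.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Toy stand-in for a range-constrained symbol (hypothetical, not SymbolExtent).
struct Range {
  std::uint64_t lo, hi; // inclusive bounds
};

enum class Verdict { InBounds, OutOfBounds, Unknown };

// Extent of a region about which nothing is known yet.
inline Range unknownExtent() {
  return {0, std::numeric_limits<std::uint64_t>::max()};
}

// Narrow the extent as if the current path had assumed "extent >= minElems".
inline Range assumeAtLeast(Range extent, std::uint64_t minElems) {
  if (minElems > extent.lo)
    extent.lo = minElems;
  assert(extent.lo <= extent.hi && "assumption makes the path infeasible");
  return extent;
}

// Classify an access to elements [0, count - 1], where count lies in `count`.
inline Verdict classifyAccess(Range count, Range extent) {
  if (count.hi <= extent.lo)
    return Verdict::InBounds;    // fits for every possible extent
  if (count.lo > extent.hi)
    return Verdict::OutOfBounds; // overflows for every possible extent
  return Verdict::Unknown;       // both outcomes are still feasible
}

// Warn on a definite overflow, or on an undecided access with tainted length.
inline bool shouldWarn(Range count, bool countIsTainted, Range extent) {
  const Verdict v = classifyAccess(count, extent);
  return v == Verdict::OutOfBounds ||
         (v == Verdict::Unknown && countIsTainted);
}
```

With n constrained to [1, 9] as in the example, shouldWarn({1, 9}, true,
unknownExtent()) reports a warning, while narrowing the extent first with
assumeAtLeast(unknownExtent(), 10) makes the same access provably in bounds.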
On 3/12/20 3:20 PM, Balázs Benics via cfe-dev wrote:
> Hi, checker devs
>
> TLDR:
> How can we constrain the size of unknown memory regions, e.g. those
> pointed to by 'raw' char pointers?
>
> Longer version:
> While working on taint analysis I'm facing the following problem:
>
> void strncpy_bounded_tainted_buffer(char *src, char *dst) {
>   // assert(strlen(src) >= 10 && "src must have at least 10 elements");
>   int n;
>   scanf("%d", &n);
>   if (0 < n && n < 10) {
>     strncpy(dst, src, n); // Should we warn or not?
>   }
> }
>
> In this example we analyze a function taking two raw pointers, so we
> don't know how big those arrays are.
> We will check whether the `strncpy` call accesses (reads and writes)
> only valid memory.
> We will check the pointers (src and dst) separately by checking whether
> `&src[0]` and `&src[n-1]` would be in bounds of the memory region
> pointed to by the pointer. Since the analyzer doesn't know (both states
> are non-null), we should check whether the `length` parameter is tainted,
> and if so, we should still issue a warning saying that "String copy
> function might overflow the given buffer since untrusted data is used
> to specify the buffer size."
> Obviously, if the `length` parameter is not tainted, we will assume
> (conservatively) that the access is valid.
>
>
> How should we tell the analyzer that the array pointed to by the
> pointer holds at least/most N elements?
> For example, in the code above we could express something similar via
> an assertion, saying that `src` points to a c-string whose underlying
> storage has at least 10 + 1 elements.
> Although this assertion using `strlen` seems like a solution, it is
> unfortunately not applicable to, for example, the `dst` buffer, which
> is most likely uninitialized and so not a c-string; in other words,
> calling `strlen` on it would be undefined behavior.
>
> The only (hacky) option that came to my mind was to abuse the
> standard's rules on pointer arithmetic.
>
> assert(&src[n] - &src[-1]);
>
> The standard is clear that pointer arithmetic is only applicable
> to pointers pointing to elements of the same array OR to a
> hypothetical ONE past/before element of that array.
> http://eel.is/c++draft/expr.add#4.2
>
> This assertion would be undefined behavior if the array pointed to
> by `src` were smaller than `n`.
>
> IMO this looks really ugly.
> I think that no 'annotation' should introduce UB, even if the
> assumption expressed via that annotation turns out to be _invalid_.
>
>
> What would be the right approach to guide the analyzer, i.e. to
> constrain the size of a memory region?
> How can the analyzer infer such a constraint?
>
> Thanks, Balazs.
>