[cfe-dev] [analyzer] Constrain the size of unknown memory regions

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Sun Mar 15 19:08:29 PDT 2020


I think you're looking for SymbolExtent. It is *the* symbol that denotes 
the [otherwise completely] unknown size of a region. The helper function 
for obtaining either a known extent or a SymbolExtent is currently known 
as getDynamicSize() (but I'd rather rename it back to "extent" - see 
also https://reviews.llvm.org/D69726).

ArrayBoundChecker already has the code that you're looking for.
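
To make the idea concrete, here is a minimal toy model of that kind of bounds check. It is only a sketch and deliberately does not use the analyzer's real classes: Range, Extent, BoundCheck, checkIndex() and assumeAtLeast() are hypothetical names invented for illustration, and the model assumes the index and the extent are constrained independently by plain integer ranges.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a range constraint on a symbol (inclusive bounds).
struct Range {
  std::int64_t lo;
  std::int64_t hi;
};

// Hypothetical stand-in for a region's extent, counted in elements:
// either a concrete size or an otherwise unknown symbol with a range.
struct Extent {
  Range range;
  static Extent known(std::int64_t n) { return Extent{Range{n, n}}; }
  static Extent unknown() {
    return Extent{Range{0, std::numeric_limits<std::int64_t>::max()}};
  }
};

// Which of the two outcomes are still feasible ("both states are non-null"
// corresponds to both flags being true).
struct BoundCheck {
  bool canBeInBound;
  bool canBeOutOfBound;
};

// An access at index i is in bounds iff 0 <= i < extent.
BoundCheck checkIndex(Range idx, Extent ext) {
  // In-bound is feasible if the smallest non-negative index can be
  // below the largest possible extent.
  const std::int64_t smallest = std::max<std::int64_t>(idx.lo, 0);
  const bool in = smallest <= idx.hi && smallest < ext.range.hi;
  // Out-of-bound is feasible if the index can be negative, or if the
  // largest index can reach the smallest possible extent.
  const bool out = idx.lo < 0 || idx.hi >= ext.range.lo;
  return BoundCheck{in, out};
}

// Models recording the assumption "extent >= n" on the extent symbol.
Extent assumeAtLeast(Extent ext, std::int64_t n) {
  ext.range.lo = std::max(ext.range.lo, n);
  return ext;
}
```

For the quoted example, `&src[n-1]` with `0 < n && n < 10` gives an index range of [0, 8]. With Extent::unknown() both outcomes stay feasible, which is the ambiguous case described below. After assumeAtLeast(Extent::unknown(), 10) only the in-bound outcome remains, which is the effect a constraint on the extent symbol is meant to have.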

On 3/12/20 3:20 PM, Balázs Benics via cfe-dev wrote:
> Hi, checker devs
>
> TLDR:
> How can I constrain the size of unknown memory regions, e.g. those 
> pointed to by 'raw' char pointers?
>
> longer version:
> While working on taint analysis I ran into the following problem:
>
>     void strncpy_bounded_tainted_buffer(char *src, char *dst) {
>       // assert(strlen(src) >= 10 && "src must have at least 10 elements");
>       int n;
>       scanf("%d", &n);
>       if (0 < n && n < 10) {
>         strncpy(dst, src, n); // Should we warn or not?
>       }
>     }
>
>
>
> In this example we analyze a function taking two raw pointers, so we 
> don't know how big those arrays are.
> We will check the `strncpy` call to see whether it accesses (reads and 
> writes) only valid memory.
> We will check the pointers (src and dst) separately by checking whether 
> `&src[0]` and `&src[n-1]` would be within the bounds of the memory region 
> pointed to by the pointer. Since the analyzer doesn't know (both states 
> are non-null), we should check whether the `length` parameter is tainted, 
> and if so, we should still issue a warning saying that "String copy 
> function might overflow the given buffer since untrusted data is used 
> to specify the buffer size."
> Obviously, if the `length` parameter is not tainted, we will assume 
> (conservatively) that the access would be valid.
>
>
> How should I tell the analyzer that the array pointed to by the 
> pointer holds at least/most N elements?
> For example, in the code above, I would like to express something like 
> this via an assertion, saying that `src` points to a c-string which has 
> at least 10 + 1 elements of underlying storage.
> Although this assertion using `strlen` seems like a solution, it is 
> unfortunately not applicable to, for example, the `dst` buffer, which is 
> most likely uninitialized - so not a c-string; in other words, calling 
> `strlen` on it would be undefined behavior.
>
> The only (hacky) option that came to my mind was to abuse the 
> standard's rules on pointer arithmetic.
>
>     assert(&src[n] - &src[-1]);
>
> The standard is clear that pointer arithmetic is only applicable 
> to pointers pointing to elements of the same array OR to a 
> hypothetical ONE past/before element of that array.
> http://eel.is/c++draft/expr.add#4.2
>
> This assertion would be undefined behavior if the size of the array 
> pointed to by `src` were smaller than `n`.
>
> IMO this looks really ugly.
> I think that no annotation should introduce UB, even if the 
> assumption expressed via the annotation turns out to be _invalid_.
>
>
> What would be the right approach to guide the analyzer (to constrain 
> the size of a memory region)?
> How can the analyzer infer such a constraint?
>
> Thanks, Balazs.
>