<div dir="ltr"><div dir="ltr"><div>Hi Artem and Csaba,</div><div>Thank you for your response.</div><div><br></div></div><div>Sorry for the long description of the problem in my original email, but I wanted to provide as much context as I could.</div><div>When we analyze code like that example, the analyzer (correctly) assumes that the pointer points to an unknown memory region.</div><div>However, the user knows that the function will be called with a valid buffer capable of holding at least/most n elements. For now, we cannot communicate this assumption to the analyzer. There is no way in standard C++ to express such a notion precisely. Asserts are not powerful enough, as I stated earlier.</div><div>So we have to communicate this kind of assumption to the analyzer in a different way. But how?</div><div><br></div><div>Some sort of annotation or special analyzer assert? I'm not sure, but neither of these looks promising, really.<br></div><div><br></div><div>I think we should find a way to properly analyze the following two cases:<br></div><div><ul><li><span style="font-family:monospace">`int f_user_function(int *points_to_at_least_5_elements);`<br><span style="font-family:arial,sans-serif">The user knows that this function must be called with a pointer to an array capable of holding at least 5 elements.<br>We should be able to communicate this assumption to the analyzer, so that it analyzes the function body accordingly.<br></span></span></li></ul></div><div><ul><li><span style="font-family:monospace">`int g_user_function(int *buff, int size);`<br>
<font face="arial,sans-serif">There is a connection between <span style="font-family:monospace">`buff`<span style="font-family:arial,sans-serif"> and <span style="font-family:monospace">`size`</span> that expresses a property similar to the one described in the previous bullet point, but the analyzer does not know about it.</span></span></font></span><br></li></ul></div><div>Regards, Balazs.<br></div><div><div><br></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com">noqnoqneo@gmail.com</a>> wrote (on Mon, Mar 16, 2020, 3:08):<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I think you're looking for SymbolExtent. It is *the* symbol that denotes <br>
the [otherwise completely] unknown size of a region. The helper function <br>
for obtaining either a known extent or a SymbolExtent is currently known <br>
as getDynamicSize() (but I'd rather rename it back to "extent" - see <br>
also <a href="https://reviews.llvm.org/D69726" rel="noreferrer" target="_blank">https://reviews.llvm.org/D69726</a>).<br>
<br>
ArrayBoundChecker already has the code that you're looking for.<br>
<br>
On 3/12/20 3:20 PM, Balázs Benics via cfe-dev wrote:<br>
> Hi, checker devs<br>
><br>
> TLDR:<br>
> How can we constrain the size of unknown memory regions, e.g. those <br>
> pointed to by 'raw' char pointers?<br>
><br>
> Longer version:<br>
> While working on taint analysis, I'm facing the following problem:<br>
><br>
> void strncpy_bounded_tainted_buffer(char *src, char *dst) {<br>
> // assert(strlen(src) >= 10 && "src must have at least 10 elements");<br>
> int n;<br>
> scanf("%d", &n);<br>
> if (0 < n && n < 10) {<br>
> strncpy(dst, src, n); // Should we warn or not?<br>
> }<br>
> }<br>
><br>
><br>
><br>
> In this example we analyze a function taking two raw pointers, so we <br>
> don't know how big those arrays are.<br>
> We check whether the `strncpy` call accesses (reads and writes) <br>
> only valid memory.<br>
> We check the pointers (src and dst) separately by checking whether <br>
> `&src[0]` and `&src[n-1]` are within the bounds of the memory region <br>
> pointed to by the pointer. Since the analyzer doesn't know (both states are <br>
> non-null), we should check whether the `length` parameter is tainted, and <br>
> if so, we should still issue a warning saying that "String copy <br>
> function might overflow the given buffer since untrusted data is used <br>
> to specify the buffer size."<br>
> Obviously, if the `length` parameter is not tainted, we will assume <br>
> (conservatively) that the access is valid.<br>
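The decision procedure described above can be sketched as a toy model. This is not the analyzer's implementation: the real check reasons symbolically about `n`, whereas this sketch assumes a concrete worst-case element count, and the names `Verdict` and `checkAccess` are made up for illustration only.<br>
<pre>
```cpp
#include <cstddef>
#include <optional>

// Hypothetical outcome of checking one buffer argument.
enum class Verdict { NoWarning, OutOfBounds, TaintedLength };

// extent:           number of elements in the region, or std::nullopt if unknown
// accessedElements: worst-case number of elements the call touches (n)
// lengthIsTainted:  whether the length argument comes from untrusted input
Verdict checkAccess(std::optional<std::size_t> extent,
                    std::size_t accessedElements, bool lengthIsTainted) {
  if (extent) {
    // Known extent: the access is either provably fine or provably too large.
    return accessedElements <= *extent ? Verdict::NoWarning
                                       : Verdict::OutOfBounds;
  }
  // Unknown extent: both in-bound and out-of-bound are feasible, so only a
  // tainted length justifies a warning; otherwise assume the access is valid.
  return lengthIsTainted ? Verdict::TaintedLength : Verdict::NoWarning;
}
```
</pre>
In this model the `strncpy` example corresponds to an unknown extent with a tainted length, which is the only combination that yields the taint warning.<br>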
><br>
><br>
> How should we tell the analyzer that the array pointed to by the <br>
> pointer holds at least/most N elements?<br>
> For example, in the code above we could express something similar via an <br>
> assertion, saying that `src` points to a C string that has at <br>
> least 10 + 1 elements of underlying storage.<br>
> Although this assertion using `strlen` seems like a solution, it is <br>
> unfortunately not applicable to, for example, the `dst` buffer, which is <br>
> most likely uninitialized and therefore not a C string; in other words, calling <br>
> `strlen` on it would be undefined behavior.<br>
><br>
> The only (hacky) option that came to my mind was to abuse the <br>
> standard regarding pointer arithmetic.<br>
><br>
> assert(&src[n] - &src[-1]);<br>
><br>
> The standard is clear that pointer arithmetic is only applicable <br>
> for pointers pointing to elements of the same array OR to a <br>
> hypothetical ONE past/before element of that array.<br>
> <a href="http://eel.is/c++draft/expr.add#4.2" rel="noreferrer" target="_blank">http://eel.is/c++draft/expr.add#4.2</a><br>
><br>
> This assertion would have undefined behavior if the array <br>
> pointed to by `src` were smaller than `n`.<br>
><br>
> IMO this looks really ugly.<br>
> I think that no 'annotation' should introduce UB, even if the <br>
> assumption expressed via the annotation turns out to be invalid.<br>
><br>
><br>
> What would be the right approach to guide the analyzer (to constrain the <br>
> size of a memory region)?<br>
> How can the analyzer infer such a constraint?<br>
><br>
> Thanks Balazs.<br>
><br>
<br>
</blockquote></div></div></div>
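<div>To make the "annotation or special analyzer assert" idea from the top of this message more concrete, here is a minimal sketch of what such a hook might look like for the two cases. Everything analyzer-facing is an assumption: `hypothetical_analyzer_assume_min_elements` and `ASSUME_MIN_ELEMENTS` do not exist in the analyzer, and the function bodies are invented for illustration. The only real piece is `__clang_analyzer__`, the macro Clang predefines while the static analyzer is running.</div>
<pre>
```cpp
#include <cassert>
#include <cstddef>

// HYPOTHETICAL hook: the analyzer does not provide this function today.
// It is only declared while the analyzer runs, so it never needs a definition.
#ifdef __clang_analyzer__
void hypothetical_analyzer_assume_min_elements(const void *ptr,
                                               std::size_t count);
#define ASSUME_MIN_ELEMENTS(ptr, count) \
  hypothetical_analyzer_assume_min_elements((ptr), (count))
#else
// In a normal build the macro expands to nothing and evaluates no arguments,
// so a wrong assumption cannot introduce undefined behavior by itself.
#define ASSUME_MIN_ELEMENTS(ptr, count) ((void)0)
#endif

// Case 1: the caller must pass an array of at least 5 elements.
// The body is invented; the original only gives the declaration.
int f_user_function(int *points_to_at_least_5_elements) {
  ASSUME_MIN_ELEMENTS(points_to_at_least_5_elements, 5);
  int sum = 0;
  for (int i = 0; i < 5; ++i)
    sum += points_to_at_least_5_elements[i];
  return sum;
}

// Case 2: `size` is the number of elements that `buff` can hold.
int g_user_function(int *buff, int size) {
  assert(size >= 0 && "size must not be negative");
  ASSUME_MIN_ELEMENTS(buff, static_cast<std::size_t>(size));
  int sum = 0;
  for (int i = 0; i < size; ++i)
    sum += buff[i];
  return sum;
}
```
</pre>
<div>Unlike the pointer-arithmetic trick from the original email, a hook of this shape would stay well-defined even when the stated assumption is wrong. Whether the analyzer could actually consume it (for example by constraining the region's SymbolExtent) is exactly the open question.</div>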