[cfe-commits] strncpy checker - proposed patch

Fri Feb 11 20:46:17 PST 2011

Hi Lenny,

This is looking better.  The patch doesn't apply cleanly to TOT, so would it be possible to regenerate it?  Some of the patch doesn't really match with the current contents of the checker, so it's hard to evaluate.

A few comments:

- Could you add comments about what the 'IsPotential' flag is for?  This checker is really lacking in comments, and the logic is starting to look really complicated.

- While I can't quite tell because the patch doesn't apply correctly, the following code bothers me a bit:

> +  NonLoc * lenValNL;
> +  SVal lenVal;
> +  bool checkPotentialLen = false;
>    if (isStrncpy) {
>      // Get the max number of characters to copy
>      const Expr *lenExpr = CE->getArg(2);
> -    SVal lenVal = state->getSVal(lenExpr);
> -
> +    lenVal = state->getSVal(lenExpr);
> +    
>      NonLoc * strLengthNL = dyn_cast<NonLoc>(&strLength);
> -    NonLoc * lenValNL = dyn_cast<NonLoc>(&lenVal);
> +    lenValNL = dyn_cast<NonLoc>(&lenVal);
... <SNIP>
> +    // Max number to copy is greater than the length of the src buffer. So
> +    // also check that it is still <= length of destination buffer.
> +    if (checkPotentialLen) {
> +      SVal lastElement =
> +        C.getSValBuilder().evalBinOpLN(state, BO_Add, *dstRegVal,
> +                                       *lenValNL, Dst->getType());
> +      
> +   

I'm not a huge fan of declaring variables (e.g., lenValNL), conditionally initializing them on one branch, and then conditionally using them later on another branch.  I often feel that makes the logic of the checker poorly composed, difficult to follow, and error-prone.  I can't make more specific comments since the patch doesn't apply cleanly, but if the method could be factored further into additional methods, with the shared logic composed through calls to sub-functions rather than a bunch of branches, it would honestly be much easier to follow.
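
To make the factoring suggestion a bit more concrete, here is a rough sketch in plain C++ rather than real checker code.  Everything in it is hypothetical: the names kNoLimit, CopyDiagnosis, effectiveCopyLength, and diagnoseCopy are made up for illustration, and plain size_t values stand in for the SVal/NonLoc machinery.  It also ignores the null terminator and the padding strncpy() performs, so it only mirrors the "overflow" versus "potential overflow" distinction discussed in this thread.

```cpp
#include <cstddef>

// Hypothetical sentinel: "no length limit", i.e. the strcpy() case.
constexpr std::size_t kNoLimit = static_cast<std::size_t>(-1);

// Hypothetical result type; the real checker would emit bug reports instead.
enum class CopyDiagnosis { Ok, Overflow, PotentialOverflow };

// Step 1: how many characters are taken from the source.
// For strcpy() pass kNoLimit; for strncpy() pass the third argument.
constexpr std::size_t effectiveCopyLength(std::size_t srcLen,
                                          std::size_t maxLen) {
  return srcLen < maxLen ? srcLen : maxLen;
}

// Step 2: classify the copy. Each question is asked in exactly one place,
// and no variable is initialized on one branch and consumed on another.
constexpr CopyDiagnosis diagnoseCopy(std::size_t dstSize, std::size_t srcLen,
                                     std::size_t maxLen) {
  // The characters actually copied do not fit: a real overflow.
  if (effectiveCopyLength(srcLen, maxLen) > dstSize)
    return CopyDiagnosis::Overflow;
  // The copy fits today, but the stated limit exceeds the destination:
  // report that the 'n' argument is too large, not a buffer overflow.
  if (maxLen != kNoLimit && maxLen > dstSize)
    return CopyDiagnosis::PotentialOverflow;
  return CopyDiagnosis::Ok;
}
```

With this shape, diagnoseCopy(4, 2, 16) would yield PotentialOverflow (the source fits, but n is larger than the destination), while diagnoseCopy(4, 10, 16) would yield Overflow.  The point is only the structure: each sub-function computes its value unconditionally and the caller composes them.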

On Feb 4, 2011, at 4:39 PM, Lenny Maiorani wrote:

> 
> On Feb 3, 2011, at 3:13 PM, Lenny Maiorani wrote:
> 
>> 
>> On Dec 21, 2010, at 9:52 AM, Ted Kremenek wrote:
>> 
>>> Hi Lenny,
>>> 
>>> Thank you for your patience.  Overall the patch looks great, but I'm a little confused about the following section:
>>> 
>>>>    // Get the string length of the source.
>>>>    SVal strLength = getCStringLength(C, state, srcExpr, srcVal);
>>>>  
>>>> +  if (isStrncpy) {
>>>> +    // Check if the number of bytes to copy is less than the size of the src
>>>> +    const Expr *lenExpr = CE->getArg(2);
>>>> +    strLength = state->getSVal(lenExpr);
>>>> +  }
>>>> +
>>>>    // If the source isn't a valid C string, give up.
>>>>    if (strLength.isUndef())
>>>>      return;
>>> 
>>> This looks like an intermingling of logic that it's not clear should compose in this way.
>>> 
>>> At the beginning we (a) fetch a value for 'strLength', then (b) overwrite that value if 'isStrncpy' is true, and then (c) check if strLength is undefined.   Both (a) and (b) look like competing logic.  If they are truly mutually exclusive, I'd rather have one, but not both, get computed.  This logic also looks slightly pessimistic, as the length of the string can be smaller than the max number of bytes specified to strncpy().  If the value retrieved at (a) is less than the value retrieved at (b), should we use the strLength from (a) and not (b)?  I can see the argument to always use the most pessimistic value, but then our error reporting should probably reflect that the 'size_t n' argument to strncpy() is too large, and not necessarily that we have a buffer overflow.  That would make it clearer to the user what they actually need to fix in their code (i.e., while it might not be a buffer overflow, it is one waiting to happen, etc.).
>>> 
>>> Overall, this looks great.  I'd just like to iron these last details out a bit (and document the final design decision in the code itself with comments) so it's clear the checker is always doing what you intend and that the user understands why they are getting a warning for their code.
>>> 
>>> Cheers,
>>> Ted
>> 
>> After a hiatus, I am back. Ted, you were correct. My patch was pessimistic. I have modified it to accurately reflect whether or not there is a buffer overflow. Now, it compares the size of the src buffer and the value of the size_t and takes the smaller. See attached patch.
>> 
>> Maybe there should be an additional check to see if the size_t (3rd arg) is larger than the size of dst. This would be more of a potential logic error waiting to happen when the code changed sometime in the future. This patch does not contain that.
>> 
>> -Lenny
>> <strncpy-checker.diff>
>> 
>>        __o
>>      _`\<,_
>>     (*)/ (*)
>> ~~~~~~~~~~~~~~~~~~~~
>> 
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
> 
> This patch extends my previous patch to also check for a separate pessimistic case. It ensures that the size_t n (3rd arg to strncpy()) is less than the size of the destination buffer. It contains a different warning message than the other strict buffer overruns since this one is not actually a buffer overrun, only a chance of a buffer overrun in the future.
> 
> -Lenny
> 
> <strncpy-pessimistic-checker.diff>
> 
> 
