[cfe-commits] PATCH: Enhance array bounds checking

Kaelyn Uhrain rikka at google.com
Fri Aug 5 15:42:37 PDT 2011


Alas updating to current SVN was rather painful since the changes for
specially handling 1-element arrays that occur as the last field of a
structure had gone in since Monday. :'(

Once I finish the merging I'll try to add meaningful comments above the
calls to CheckArrayAccess() and submit the changes.

Cheers,
Kaelyn

On Fri, Aug 5, 2011 at 3:01 PM, Ted Kremenek <kremenek at apple.com> wrote:

> Thanks for the ping.  I thought about this a fair amount over the last few
> days, and your explanation seems reasonable.  I think the logic is good as
> is, and ready to go in.  The only think I would possibly "improve" is adding
> a few comments above the calls to CheckArrayAccess() that indicate what that
> specific call is checking.  That would take some of the mystery out of the
> choice of placement of these calls.
>
> Thanks for working on this!
>
>
> On Aug 5, 2011, at 2:35 PM, Kaelyn Uhrain wrote:
>
> ping?
>
> On Mon, Aug 1, 2011 at 3:33 PM, Kaelyn Uhrain <rikka at google.com> wrote:
>
>> Hi Ted,
>>
>> I've gone ahead and attached the full patch to facilitate seeing all of
>> the changes in better context.
>>
>>
>> On Mon, Aug 1, 2011 at 2:18 PM, Ted Kremenek <kremenek at apple.com> wrote:
>>
>>> Hi Kaelyn,
>>>
>>> Sorry for the delay in responding.  I'm trying to understand the
>>> differences proposed between your last patch and the new one.  When I look
>>> at the "delta from patch set" from 2 to 3, I see changes that appear
>>> unrelated to this feature, which I find confusing.
>>>
>>> What would help me is if I understood more of the high-level motivation
>>> of the changes.  What I see is calls to "CheckArrayAccess" moved to various
>>> places, for example when checking the comma operator:
>>>
>>>  7578   CheckArrayAccess(lhs.get());
>>>  7579   CheckArrayAccess(rhs.get());
>>>
>>> This feels a little kludgy.  I'm not saying it is wrong; I just don't
>>> understand the motivation.  Could you provide some insight behind these
>>> changes other than they got the particular test cases working?
>>>
>>
>> The changes to get everything working properly (from what I had committed
>> to svn and then reverted) are, except for the unary operator handling in
>> CheckArrayAccess, pretty much just shuffling around when CheckArrayAccess is
>> called. I agree that it feels a bit kludgy, but unfortunately it was the
>> only way I could get CheckArrayAccess called once and only once for each
>> array access for all the cases I could come up with (see the changes to
>> test/SemaCXX/array-bounds.cpp).
>>
>> For example, with my original patch that had to be reverted "p =
>> &u.a[3];", "++foo[5];", and "return --foo[6];" from array-bounds.cpp (lines
>> 44, 185, and 187) do not trigger any warning. Additionally, "p == &u.a[2]"
>> triggers a false warning (where "a" is a 2-element array). There are a few
>> places like that where the warnings are triggered incorrectly, but even more
>> places where CheckArrayAccess is never called--such as on the right hand
>> side of assignments. It not being called at all in some cases is why e.g.
>> line 43 "short *p = &u.a[2];" did not give a warning.
>>
>> With just the additional changes I made within CheckArrayAccess and
>> leaving the CheckArrayAccess call sites alone, "p = &u.a[3];", "++foo[5];",
>> and "return --foo[6];" from array-bounds.cpp (lines 44, 185, and 187) still
>> do not trigger any warning. The noticeable change is that "*(&u.a[2]) = 1;"
>> and "*(&u.a[3]) = 1;" (lines 45 and 46) generate the warning twice. Likewise
>> just changing where CheckArrayAccess is called, without the additional
>> changes to CheckArrayAccess, leads to quite a few situations where the
>> warning is never properly emitted because CheckArrayAccess was called with
>> an unrecognized type of statement (15 instances in
>> test/SemaCXX/array-bounds.cpp spread over the entire file--first case is on
>> line 9 and the last on line 189).
>>
>> Finally, without either my original or revised patches, there are quite a
>> few cases where the warning is never triggered, such as "int val = a[3];",
>> "p = &u.a[3];", "array[const_subscript] = 0;", "(*array_ptr)[3] = 1;", and
>> "++foo[5];" (lines 28, 44, 51, 67, and 185 of the version of
>> test/SemaCXX/array-bounds.cpp with my patch applied).
>>
>> It took a few days of trial, error, tracing through call stacks, digging
>> through code paths, and general frustration to figure out where to place the
>> calls to CheckArrayAccess to get as much coverage as possible while avoiding
>> emitting the same warnings multiple times. Hope that helps shed some light
>> on the nature of my additional changes.
>>
>> Cheers,
>> Kaelyn
>>
>>
>>> Thanks so much,
>>> Ted
>>>
>>> On Jul 27, 2011, at 4:18 PM, Kaelyn Uhrain wrote:
>>>
>>> Ted & Eric,
>>>
>>> After having to revert my array bounds enhancements (which had gone in as
>>> r136046) due to false positives in cases like &array[array_size], I've
>>> reworked the array bounds checking a bit and think I have it working
>>> correctly now. I've uploaded the full set of changes including my reverted
>>> changes to
>>> http://codereview.appspot.com/4675068 and attached a diff of the changes
>>> from what I'd committed and later rolled back.
>>>
>>> Cheers,
>>> Kaelyn
>>>
>>> On Mon, Jul 25, 2011 at 6:17 PM, Kaelyn Uhrain <rikka at google.com> wrote:
>>>
>>>> Just realized my patch missed the new unit test file. d'oh!
>>>>
>>>>
>>>> On Sat, Jul 23, 2011 at 11:35 AM, Ted Kremenek <kremenek at apple.com>wrote:
>>>>
>>>>> This all sounds good to me.  I think the patch can go in as is.
>>>>>
>>>>> On Jul 21, 2011, at 4:04 PM, Kaelyn Uhrain wrote:
>>>>>
>>>>> Hi Ted,
>>>>>
>>>>> I've attached a new version of my patch that moves the pointer
>>>>> arithmetic portion of the -Warray-bounds improvements into a separate flag,
>>>>> -Warray-bounds-pointer-arithmetic. Since the remaining noise in the pointer
>>>>> arithmetic check is more a case of undefined behavior that does the right
>>>>> thing with most (all?) compilers than a case of being true false positives,
>>>>> the plan is to have -Warray-bounds-pointer-arithmetic initially be disabled
>>>>> by default and in later patches to enhance the warning by e.g. providing a
>>>>> fixit note to add parens to exprs like "ptr + sizeof("foo") - 1" so they
>>>>> become "ptr + (sizeof("foo")) - 1".
>>>>>
>>>>> Cheers,
>>>>> Kaelyn
>>>>>
>>>>> On Thu, Jul 21, 2011 at 1:18 PM, Ted Kremenek <kremenek at apple.com>wrote:
>>>>>
>>>>>> Hi Kaelyn,
>>>>>>
>>>>>> While I think there will be some contention here, I think this is
>>>>>> still too high of a false positive rate for this warning to be on by default
>>>>>> in the compiler.  Users simply aren't going to tolerate it.  I think it is
>>>>>> reasonable for a codebase to adopt a strict policy for this warning, but I
>>>>>> don't think a 17% false positive rate (or possibly higher) is acceptable for
>>>>>> a default warning for all users of Clang.
>>>>>>
>>>>>> Unless you believe you think there are additional heuristics to drop
>>>>>> the false positive rate down below 5-10%, I think I'm fine with proceeding
>>>>>> as putting this in as an opt-in warning, and then refine from there.
>>>>>>
>>>>>> On Jul 20, 2011, at 11:39 AM, Kaelyn Uhrain wrote:
>>>>>>
>>>>>> I've attached an updated version of my patch that better handles cases
>>>>>> where pointer arithmetic is done after casting a constant-size array to a
>>>>>> pointer for a smaller base type (e.g. casting an int array to char*). Of the
>>>>>> pointer arithmetic warnings, about 24% could be considered false positives;
>>>>>> however, the actual number of false positives is quite small and 2/3 of them
>>>>>> stem from the use of a single macro--if you count those as a single warning
>>>>>> & false positive, the rate drops to 17%. Of the false positives most are
>>>>>> from semi-questionable pointer arithmetic where a constant greater than the
>>>>>> length of the array/pointer is being added to the pointer and some int > 1
>>>>>> being subtracted from it, e.g.:
>>>>>>
>>>>>> void foo(int n) {
>>>>>>   char x[5];
>>>>>>   if (n > 0) bar(x + 6 - n);
>>>>>> }
>>>>>>
>>>>>> Cheers,
>>>>>> Kaelyn
>>>>>>
>>>>>> On Tue, Jul 19, 2011 at 11:44 AM, Kaelyn Uhrain <rikka at google.com>wrote:
>>>>>>
>>>>>>> I'm still looking at the pointer arithmetic warnings to determine
>>>>>>> which are false positives, but I realized I screwed up my previous stats a
>>>>>>> bit as I forgot to account for duplicated/repeated warnings (e.g. the same
>>>>>>> header included in multiple compilation units and so generating the same
>>>>>>> warnings multiple times). For unique warnings, the stats are:
>>>>>>>
>>>>>>> - 16.5% increase in warnings from before my patch (originally
>>>>>>> reported a 24% increase)
>>>>>>> - 47% of those new warnings being about pointer arithmetic
>>>>>>> - 6.7% of all of the warnings emitted with my patch applied are
>>>>>>> concerning pointer arithmetic. (originally reported 8.3%)
>>>>>>> - The new pointer arithmetic warnings represent a 7.7% increase in
>>>>>>> warnings from before my patch, not 10%.
>>>>>>>
>>>>>>> I'll send another email once I have a feel for how noisy the pointer
>>>>>>> arithmetic warnings are.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>  Kaelyn
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 18, 2011 at 5:54 PM, Kaelyn Uhrain <rikka at google.com>wrote:
>>>>>>>
>>>>>>>> Ted,
>>>>>>>>
>>>>>>>> You're welcome. I'll try to figure out what fraction of the pointer
>>>>>>>> arithmetic warnings are false positives (requires a bit of manual digging on
>>>>>>>> my part to determine if the code is indeed buggy or if it is valid /
>>>>>>>> intended). For the overall 24% increase in warnings, keep in mind that over
>>>>>>>> half of that is the existing bounds checking now being applied to cases
>>>>>>>> where it wasn't before, i.e.:
>>>>>>>>
>>>>>>>> char *foo[5];
>>>>>>>> foo[77];  // -Warray-bounds already found
>>>>>>>> &foo[77];  // -Warray-bounds currently misses
>>>>>>>> *foo[77];  // -Warray-bounds currently misses
>>>>>>>>
>>>>>>>> The function that did the bounds checking would never catch the last
>>>>>>>> two cases because it would see a UnaryOperator (in the above cases for the
>>>>>>>> '&' and the '*') and skip the expression instead of looking inside the
>>>>>>>> UnaryOperator expression for the array subscripting.
>>>>>>>>
>>>>>>>> The new pointer arithmetic bounds checking only represents a 10%
>>>>>>>> increase in warnings--and IMHO that is the only part where the number of
>>>>>>>> false positives introduced might be an issue.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Kaelyn
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jul 18, 2011 at 5:36 PM, Ted Kremenek <kremenek at apple.com>wrote:
>>>>>>>>
>>>>>>>>> Hi Kaelyn,
>>>>>>>>>
>>>>>>>>> Thanks for the statistics.  What would be good to know is what
>>>>>>>>> fraction of these are false positives (i.e., are these all real bugs)?  A
>>>>>>>>> small random sample might be helpful.  A 24% increase in warnings is fairly
>>>>>>>>> substantial, and we don't want to do that unless there is a real benefit.
>>>>>>>>>
>>>>>>>>> Ted
>>>>>>>>>
>>>>>>>>> On Jul 18, 2011, at 5:18 PM, Kaelyn Uhrain wrote:
>>>>>>>>>
>>>>>>>>> Ted,
>>>>>>>>>
>>>>>>>>> On Thu, Jul 14, 2011 at 5:04 PM, Ted Kremenek <kremenek at apple.com>wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The only other issue: should this be controlled under a separate
>>>>>>>>>> warning flag, at least initially so we can experiment with this new warning
>>>>>>>>>> and see how noisy it is?  E.g. "-Warray-bounds-pointer-arithmetic".
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I've tested the -Warray-bounds changes against the Google codebase
>>>>>>>>> and my patch increases the number of warnings from -Warray-bounds by 24%. Of
>>>>>>>>> the new warnings, 57.33% are for array indexes that most likely weren't
>>>>>>>>> picked up before because of unary operators like & or * (11.1% of all the
>>>>>>>>> warnings now emitted), and the remaining 42.67% are from out-of-bounds
>>>>>>>>> pointer arithmetic (8.3% of all the warnings from -Warray-bounds).
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Kaelyn
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>> <array-bounds-enhancement3.diff>
>>>>>>
>>>>>>
>>>>>>
>>>>> <array-bounds-enhancement4.diff>
>>>>>
>>>>>
>>>>>
>>>>
>>> <array-bounds-enhancement-changes.diff>
>>> _______________________________________________
>>>
>>> cfe-commits mailing list
>>> cfe-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>
>>>
>>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20110805/5c305ef1/attachment.html>


More information about the cfe-commits mailing list