[cfe-dev] improving the ArrayBoundChecker.

Ted Kremenek kremenek at apple.com
Wed Feb 12 12:36:41 PST 2014


Hi Per,

Both versions of the array bound checker try to tackle the same problem differently.  Fundamentally both of them are limited, however, but limits in the constraint solver.  That’s why they’ve both been in a holding pattern.

Fundamentally, buffer overflow detection reduces to reasoning about the following equation:

  $index * sizeof(element) < $length * sizeof(element)

The problem here is that both $index and $length may be symbolic values that our constraint solver needs to reason about.  Reasoning about this equation runs up against two limitations in the constraint solver:

(1) The constraint solver can only reason about constraints involving a single symbolic value and constant values.  For example, $a < 10, $a > 5, etc.  It cannot reason about constraints between two symbols, in this case $index < $length.  If $length is fixed, the constraint solver can possibly reason about this.

(2) The constraint solver has limitations in reasoning about constraints involving multiplying a symbolic value by a constant, and then comparing that against another value.  In this case, "$index * sizeof(element) < some value”.

In general, the problem here is that the constraint solver cannot reason about affine constraints between symbolic values.  If it could, then writing the array bounds checker would be quite doable.

 ArrayBoundChecker and ArrayBoundCheckerV2 approach these limitations in different ways.  ArrayBoundCheckerV2 tries to do all of its bounds checking in terms of “raw offsets”, which means that all computations are in terms of bytes.  This lowering is really useful if you want to reason about array bounds checking where an array with a given type is casted to an array with a different type (e.g., int[] to char[]).  ArrayBoundCheckerV2 can conceptually reason about buffer overflows in the presence of such type punning, but it then hits up against more of the limitations of the constraint solver because it reduces to solving the equation above (whereas ArrayBoundChecker needs to reason about $index < $length, but that only works if you haven’t casted the array to a different type).

To summarize, ArrayBoundCheckerV2 is probably the right long term approach, but because it hits more of the limitations of the constraint solver it finds fewer issues than ArrayBoundChecker.

In the short term, making ArrayBoundChecker more useful is probably the better investment.  If it could be used to find buffer overflows in just fixed size arrays, it would still be useful, and likely would hit fewer limitations in the constraint solver. That would not be wasted effort, as once the constraint solver evolves we could pull the work done on ArrayBoundChecker into ArrayBoundCheckerV2 (or whatever successor we come up with once the constraint solver is farther along).

Ted

On Feb 11, 2014, at 7:19 AM, Per Viberg <Per.Viberg at evidente.se> wrote:

> // oops, sent this by mistake to the commits list..
> 
> 
> Hi,
> 
> I am going to look into the ArrayBoundChecker. Currently it doesn't handle symbolic constraints on indexing with '>' or '<' properly. 
> 
> There are two implementation of the ArrayBoundChecker, and some errors are not detected with the V2, (see below), so my main question is: Which of the ArrayBoundV2 or ArrayBound is the one that should be improved on? Also, any specific class/interfaces that should/shouldn't be used?. Any other useful hints on where to fix this are also highly appreciated.
> 
> cheers!,
> 
> Per
> 
> 
> 
> The ArrayBound check doesn't find this error:
> 
> void writeOutOfBoundsGEConstraint(int x) {
>   int a[10];
>   if (x >= 10)
>      a[x] = 5;  // ← 1. write access GE does not generate error
> }
> 
> It does however find the read access with '>', but only using ArrayBound, _not_ ArrayBoundV2.
> 
> int readOutOfBoundsGEConstraint(int x) {
>   int a[10];
>   int g=5;
>   if (x >= 10)
>     g = a[x];   // ← 2. read access GE generate error on V0
>   return g;
> }
> 
> out-of-bounds_2.c:13:9: warning: Access out-of-bound array element (buffer overflow)
>     g = a[x];   // ← 2. read access GE generate error on V0
>         ^~~~
> 
> The rest below are found in both ArrayBound and ArrayBoundV2:
> 
> void writeOutOfBoundsEQ(int x) {
>   int a[10];
>   if (x == 10)
>      a[x] = 0;  // ← 3. write access EQ generate error on V0 and V2
> }
> 
> out-of-bounds_2.c:20:11: warning: Access out-of-bound array element (buffer overflow)
>      a[x] = 0;  // ← 3. write access EQ generate error on V0 and V2
>      ~~~~ ^
> 
> int readOutOfBoundsEQ(int x) {
>   int a[10];
>   int g=5;
>   if (x == 10)
>     g = a[x];   // ← 4. read access EQ generate error on V0 and V2
>   return g;
> }
> 
> out-of-bounds_2.c:26:9: warning: Access out-of-bound array element (buffer overflow)
>     g = a[x];   // ← 4. read access EQ generate error on V0 and V2
>         ^~~~
> 
> 
> .......................................................................................................................
> Per Viberg Senior Engineer
> Evidente ES East AB  Warfvinges väg 34  SE-112 51 Stockholm  Sweden
> Phone:    +46 (0)8 402 79 00
> Mobile:    +46 (0)70 912 42 52
> E-mail:     Per.Viberg at evidente.se 
> 
> www.evidente.se
> This e-mail, which might contain confidential information, is addressed to the above stated person/company. If you are not the correct addressee, employee or in any other way the person concerned, please notify the sender immediately. At the same time, please delete this e-mail and destroy any prints. Thank You.
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140212/ea1818e7/attachment.html>


More information about the cfe-dev mailing list