[PATCH] Teach the analyzer about multi-dimensional VLAs

Thu Oct 16 09:21:56 PDT 2014

On Oct 14, 2014, at 15:12 , Daniel Fahlgren <daniel at fahlgren.se> wrote:

> Hi Jordan,
> 
> On ons, 2014-10-08 at 20:05 -0700, Jordan Rose wrote:
> 
>> +  // First only check for errors. That way we will find problems even
>> if one of
>> +  // the dimensions is unknown.
>> +  const Expr *SE;
>> +  QualType QT;
>> +  do {
>> +    // Check size expression.
>> +    SE = VLA->getSizeExpr();
>> +    State = checkSizeExpr(SE, State, C);
>> +    if (!State)
>> +      return;
> 
>> Seems like a worthy goal. If we're finding problems even if one
>> dimension is unknown, though, is it worth finding problems in every
>> size expression, rather than exiting early when we find a problem in
>> the first one?
> 
> Good point. I'm not sure I solved this the correct way. It seems like
> it only is possible to create one sink node so I had to call
> getPredecessor(). Is that the right way or how should I do to emit
> multiple errors? 

It's more just the normal uniquing of identical nodes kicking in—if we've somehow already gotten to this state, that's supposed to mean we've already emitted any errors. I think that means you should collect all the errors up front and then emit them all on the same sink node. Emitting them on the predecessor would be valid but not really what we want, because it could have the wrong location.

>> +
>> +    // Check next dimension.
>> +    QT = VLA->getElementType();
>> +    VLA = Ctx.getAsVariableArrayType(QT);
>> +  } while(VLA);
> 
>> Nitpick: can you put a space after the "while"? (here and below)
> 
> Ok.
> 
>> +  // Record the state so far, in case one of the lengths is unknown.
>> +  C.addTransition(State);
> 
>> Unfortunately we can't actually do this—you end up recording
>> two transitions here, rather than replacing the first one. Maybe the
>> size calculation should be factored out into a helper function as
>> well?
> 
>> +    // Multiply the current size by the element size.
>> +    ArraySizeVal = SvalBuilder.evalBinOpNN(State, BO_Mul,
>> ArrayLength,
>> +      ArraySizeVal.castAs<NonLoc>(), SizeTy);
> 
>> Can you add a FIXME in here to check for overflow? That's a security
>> risk, really.
> 
> I've broken out the calculation and added the FIXME, but that got me
> thinking. What is a too large size? If the size is larger than a size_t
> can represent we surely got a problem. We will most likely run out of
> stack space way before that, however.

On platforms where RSIZE_MAX exists and is smaller than SIZE_MAX the "most correct" answer is probably RSIZE_MAX. If not that, then we could perhaps make a policy of using SSIZE_MAX (or SIZE_MAX >> 1, which is how RSIZE_MAX is defined on my system anyway). But even just the simple case of overflowing a size_t would probably be good enough.

Again, though, let's worry about this later. :-)

>> As for the tests, can you add some assertions that we really do
>> multiply the size out correctly? For example, using a too-large index
>> should warn, casting to char * and assigning outside the buffer should
>> warn, and sizeof should give the right answer.
> 
> That turned out to be easier said than done. The analyzer doesn't seem
> to handle sizeof on things that can't be resolved compile time (like
> VLAs)? I've also added some FIXMEs for ArrayBoundV2. Casting to char *
> works as expected.

Huh, I thought we handled sizeof for a VLA by using the extent, but I guess not. Out-of-bounds checks it is.

Jordan