[cfe-dev] Static Analyzer Rocks Hard

Thu Jun 19 14:54:51 PDT 2008

On Jun 19, 2008, at 1:03 PM, Matthew Jimenez wrote:

>>
>> In reality, this leak probably wouldn't happen.  If "sql" was  
>> assigned a NULL value because "alloc" failed, chances are that the  
>> allocation to "args" also failed because memory allocation failures  
>> are correlated.  Thus at the end of the basic block for the true  
>> branch of "if (!sql)" we really should have the following  
>> possibilities:
>>
>> (a) both sql and args are null  (allocation failed first for sql,  
>> then args)
>> (b) sql is not null, but args is null (allocation succeeded for  
>> sql, but failed for args)
>> (c) neither sql or args are null (allocation succeeded in both cases)
>>
>> If we assume that these are the only possibilities (and not a  
>> fourth possibility, where allocation failed for sql but succeeded  
>> for args), then there is no leak.
>>
>> Is this the false positive that you were thinking about?  If so,  
>> this extra reasoning about correlated allocation failures is  
>> something that could be implemented fairly easily in the static  
>> analyzer (and I would be happy to do it).
>>
>
> Yes, this was what I was thinking, however I am unsure how I feel  
> about it.
> Obviously any case where the allocation or initialization fail here  
> would cause the code to misbehave, and this is something I had not  
> taken into full consideration.
> Asserting the values after allocating them causes the analyzer to  
> not report any problems, but that might be overkill.

Assertions are funny things.  If you know that the allocation could  
never fail then the assertion properly silences the warning.  If the  
allocation *could* fail, it merely turns an error that could be caught  
with static analysis into a run-time error (i.e., the assertion fires,  
causing the program to halt).

> Perhaps what I'm more interested in are flags for the analyzer for  
> assumptions it may be allowed to make.
> Normal behavior assuming allocation is always working and a flag to  
> assume allocation could fail (or perhaps the reverse).
> Although, too many such flags might also be quite bad.

Most code is not engineered to handle the case when memory allocation  
fails.  When memory allocation starts to fail, programs tend to do  
weird things since they don't know how to recover.  For such programs,  
flagging leaks (or any other kind of bug) that occur when memory  
allocation fails are likely going to be deemed as false positives.

That said, there is code out there that does care when allocations  
fail.  Such code should be designed to handle and recover from  
failure.  For such programs, the most robust test of the code is to  
allow any arbitrary allocation to fail.  For such code, users can  
silence particular warnings that they are 100% certain wouldn't fail  
using assertions (which would result in run-time errors if they  
actually did occur).  Writing code, however, that assumes that any  
arbitrary allocation can fail is difficult at best.

In order to be pragmatic, the analyzer assumes that an allocation has  
succeeded *unless* you check for failure (i.e., a pointer is tested  
for null).  This is exactly what happens in the test case; since "sql"  
is tested against null on every iteration of the loop, the analyzer  
assumes that the allocation could fail.  I believe that your test  
case, however, intended the "if (!sql)" to really test to see if "sql"  
had not been assigned a new value after its initial declaration.  We  
could potentially engineer the analyzer to handle such cases.  For  
example, sql is initialized with the value 0, so we can prove that the  
branch would unconditionally be taken on the first branch of the loop,  
and thus we shouldn't treat that particular branch like it was testing  
for the success of the allocation.  Such heuristics go into the realm  
of black art, which is fine if they are the right thing to do in 99%  
of the cases, but in this case it may silence more real bugs than  
intended.

So, I really see two options.  We can add a flag indicating that  
allocations are assumed not to fail, or we just keep the current  
behavior.  Assuming the checker is run using the current behavior, it  
looks like it is doing exactly the right thing.  From my perspective  
it probably makes the most sense to add an assertion after the  
allocation to "sql" since that really is the assumption you are  
making.  In this way the code itself documents your assumptions.   
Alternatively, the test case could be restructured, etc., but an  
assertion seems like an easy answer and would make your code more  
documented.

What do you think?  We're trying to make the tool as useful as  
possible.  Different people will want different operating models for  
the tool, but another benefit of the tool is that it makes one think  
about their assumptions and how their code could be improved,  
especially since some assumptions are not always correct.

Ted