[cfe-dev] Static Analyzer Rocks Hard

Thu Jun 19 22:36:37 PDT 2008

On Jun 19, 2008, at 4:54 PM, Ted Kremenek wrote:

>
> On Jun 19, 2008, at 1:03 PM, Matthew Jimenez wrote:
>>
>> Yes, this was what I was thinking, however I am unsure how I feel  
>> about it.
>> Obviously any case where the allocation or initialization fail here  
>> would cause the code to misbehave, and this is something I had not  
>> taken into full consideration.
>> Asserting the values after allocating them causes the analyzer to  
>> not report any problems, but that might be overkill.
>
> Assertions are funny things.

I think that's my new favorite quote ;-)

> If you know that the allocation could never fail then the assertion  
> properly silences the warning.  If the allocation *could* fail, it  
> merely turns an error that could be caught with static analysis into  
> a run-time error (i.e., the assertion fires, causing the program to  
> halt).
>
>> Perhaps what I'm more interested in are flags for the analyzer for  
>> assumptions it may be allowed to make.
>> Normal behavior assuming allocation is always working and a flag to  
>> assume allocation could fail (or perhaps the reverse).
>> Although, too many such flags might also be quite bad.
>
> Most code is not engineered to handle the case when memory  
> allocation fails.  When memory allocation starts to fail, programs  
> tend to do weird things since they don't know how to recover.  For  
> such programs, flagging leaks (or any other kind of bug) that occur  
> when memory allocation fails are likely going to be deemed as false  
> positives.
>
> That said, there is code out there that does care when allocations  
> fail.  Such code should be designed to handle and recover from  
> failure.  For such programs, the most robust test of the code is to  
> allow any arbitrary allocation to fail.  For such code, users can  
> silence particular warnings that they are 100% certain wouldn't fail  
> using assertions (which would result in run-time errors if they  
> actually did occur).  Writing code, however, that assumes that any  
> arbitrary allocation can fail is difficult at best.
>
> In order to be pragmatic, the analyzer assumes that an allocation  
> has succeeded *unless* you check for failure (i.e., a pointer is  
> tested for null).  This is exactly what happens in the test case;  
> since "sql" is tested against null on every iteration of the loop,  
> the analyzer assumes that the allocation could fail.

This behavior does make sense, but should it really extend past the  
first usage of the of the variable or after the first pass of a loop?  
There may be cases where extending it might be a good idea, but I'd  
imagine those cases too be very rare in code.

> I believe that your test case, however, intended the "if (!sql)" to  
> really test to see if "sql" had not been assigned a new value after  
> its initial declaration.  We could potentially engineer the analyzer  
> to handle such cases.  For example, sql is initialized with the  
> value 0, so we can prove that the branch would unconditionally be  
> taken on the first branch of the loop, and thus we shouldn't treat  
> that particular branch like it was testing for the success of the  
> allocation.  Such heuristics go into the realm of black art, which  
> is fine if they are the right thing to do in 99% of the cases, but  
> in this case it may silence more real bugs than intended.

I'm inclined to agree with the above - it is tricky and probably not  
needed at this point.
At the moment, I can't think of additional real bugs that could be  
silenced, but I'm probably being too short-sighted.

> So, I really see two options.  We can add a flag indicating that  
> allocations are assumed not to fail, or we just keep the current  
> behavior.

After reading the description of the behavior above, I'm thinking a  
flag would not be a good idea.

> Assuming the checker is run using the current behavior, it looks  
> like it is doing exactly the right thing.  From my perspective it  
> probably makes the most sense to add an assertion after the  
> allocation to "sql" since that really is the assumption you are  
> making.  In this way the code itself documents your assumptions.   
> Alternatively, the test case could be restructured, etc., but an  
> assertion seems like an easy answer and would make your code more  
> documented.

Adding the asserts would probably be the best case in the projects I  
work on, but there are going to be many cases of this, and the  
original case was a little confusing at first glance.

> What do you think?  We're trying to make the tool as useful as  
> possible.  Different people will want different operating models for  
> the tool, but another benefit of the tool is that it makes one think  
> about their assumptions and how their code could be improved,  
> especially since some assumptions are not always correct.

It's definitely made me think about the way I code, but it would make  
me think the best option would be to always assert against all  
allocations - which would be madness in a large pre-existing code  
base :-)
Seriously though, my case may still be rare enough outside of the code  
I work on that it could be successfully ignored. I'll gladly ignore  
the bug if it comes to that. I'm already ignoring some cases of dead  
stores and memory leaks that are in code that was not relevant  
(obvious because the original author named the function  
"never_called_but_needed"), and I'm not aiming for complete  
coverage... at least not yet.

Thanks,
-Matthew