[cfe-commits] [Patch]LocationContext Support for CheckedReturn

Ted Kremenek kremenek at apple.com
Thu Apr 14 23:17:41 PDT 2011


Hi Lei,

Just to follow up, I know this conversation has been a bit roundabout.  I'm trying to get a clear design for these kind of checkers, and I'm gradually understanding the requirements here.

I'm a bit nervous about caching data in the checker itself, since we've recently been changing the lifetime of a Checker object.  Previously the lifetime of a Checker object was bound to the lifetime of an ExprEngine, but now Checkers are managed by CheckerManager, and can conceptually be used by multiple ExprEngines.

Keeping data in LocationContexts is a both hit and miss because they were intended to model context-sensitivity.  I can completely buy the argument that checkers may wish to record information in LocationContexts that are, well, tied to a given location context.  Your statistical checker, however, will be making assumptions about the LocationContext that doesn't seem true in all cases.

I'm also a bit skeptical that all statistical checks would run in a strict "textual" (or one-function-at-a-time, no inling) versus the other extreme of whole program IPA.  It seems like different statistical checkers will want the freedom to collect, and possibly merge, different kinds of statistics, possibly from different LocationContexts.  That said, how statistics are collected is strictly a policy decision of the individual checker, but I'd be much happier that a statistical checker was fine with working in IPA mode as well; it just was smart about what statistics to collect.

In a nutshell, I see three kinds of data the checker may wish to collect:

1) "global" data, that spans the entirety of all code analyzed by a Checker instance

2) "location context" specific data, that spans the analysis of a given *call* to a function

3) per-path/per-state data, which is stored in the GDM of a GRState

I think your statistical checker will likely one to use (1) or (2).  You outlined the restriction of using (2) and assuming that a single LocationContext would be made for each analyzed function.  I think that's a fairly brittle restriction that's likely to bite us later.  It seems to me that if a Checker wants to keep statistics for just a given function (regardless of LocationContexts), then that data isn't context-sensitive, and should be stored in the Checker instance.  Essentially, the Checker object would have a DenseMap mapping from FunctionDecls to a statistics map.  If a Checker saw that it already had statistics for a given FunctionDecl, it could either collect/merge additional statistics, or decide not to collect any additional statistics.  This would allow a statistical checker to work fine in the case of multiple LocationContexts per FunctionDecl.

What do you think?

Ted

On Apr 13, 2011, at 7:39 PM, Ted Kremenek wrote:

> Hi Lei,
> 
> I think there are multiple possible approaches here.
> 
> First, I think it's reasonable for us to add a "Generic Data Map" mechanism, similar to what we have in GRState, for keeping checker-specific data in LocationContexts.  It needs to be a general mechanism, however, so that the LocationContext API doesn't have anything relating to a specific checker; just a generic way to hold data.  That data also needs to be deallocated once the LocationContext object is freed.
> 
> That said, a simpler mechanism may just be to keep a DenseMap in the statistical checker from LocationContexts -> statistics, and then notify a checker when the LocationContext object gets destroyed.  Actually, we can possibly implement the Generic Data Map mechanism in this way.
> 
> Concerning "inlining", what you call "textual" analysis versus WPA mode reduces down to us having a "policy" object that dictates when we inline and when we don't.  With a policy object, we could change when and how IPA is done, and reuse the same analysis engine.  Different checkers could then articulate that they require a specific IPA policy in place (in the case of this checker, no IPA).  This idea is only half-formed, but I think there are multiple places in the analyzer engine where various "policies" could be used to help guide how analysis is done (e.g., how are loops handled, etc.).
> 
> If your checker truly won't work in IPA, then it is probably okay to just use a DenseMap in the Checker itself, and just invalidate that DenseMap when we start analyzing a function.  That seems like the least complicated solution, although it prohibits a single instance of the Checker from being used simultaneously by multiple ExprEngines.  After thinking about this some, I think this is a reasonable approach, just as long as you understand the restrictions here and that the statistics really will only work when analyzing a single function at a time.
> 
> On Apr 11, 2011, at 11:10 PM, 章磊 wrote:
> 
>> Hi Ted, 
>>  
>> I think statistical checker is some different from the checkers we have. 
>> 
>> IMO, when doing statistical check, the static analyzer should not work in the "whole-program-analysis" mode, but in the "textual" statistic mode. 
>> 
>> In this mode, we treat every top level functiondecl (also the inline function) has a body as a cell. Then we do symbolic execution in every cell to check CheckedReturn state for calls (i.e, a call to foo()), and textually count a function decl's (i.e, foo()) CheckedReturn state from all the occurrences (calls). 
>> 
>> So we need a new action to activate this "textual" mode, and in this mode the inline IPA should be deactivated (i think the function summary IPA also works). Then every cell will have only one LocationContext, right? 
>>  
>> Here is a example to make it clearer. 
>>  
>> int f1(); 
>>  
>> int main() { 
>>   f2(); 
>>   f3(); 
>> } 
>>  
>> void f2() { 
>>   f1(); 
>> } 
>>  
>> void f3() { 
>>   f2(); 
>>   if (f1()) 
>>     return; 
>> } 
>>  
>> There are two "textual" calls to f1(), the one in f3() is checked and the one in f2() is unchecked. But if we do the check in WPA Mode(IMO, as viewed from "execute"), there are 3 calls to f1(), one of them is checked and the other two are not. 
>>  
>> We should do it in the "textual" way, right?
>> 
>> What you say about this?
>> 
>> 2011/4/12 Ted Kremenek <kremenek at apple.com>
>> Lei,
>> 
>> Just to make this a little clearer, suppose we had:
>> 
>> unsigned factorial(unsigned n) {
>>  if (n == 0)
>>   return 1;
>>  return n * factorial(n - 1);
>> }
>> 
>> Currently when we analyze this function, we create only one LocationContext.  However, one way to do IPA is to due inlining of function calls.  In this case, we might create a second LocationContext during the first recursive call to factorial.  If we wanted to further inline, we might create a second LocationContext, etc., each one capturing the "context" of the recursive call and simulating an abstract stack frame (obviously bounding the amount of LocationContexts we create).
>> 
>> On Apr 11, 2011, at 10:53 AM, Ted Kremenek wrote:
>> 
>> > I thought about this some more, and using the LocationContext might be a reasonable place to put checker-specific data that represents "global" information (i.e not specific to a GRState) but also limited in scope (i.e. the data is limited to the scope of analyzing a given *call* to a function).  Once we support analysis inlining, it will be possible for multiple LocationContext objects to be around at the once for the same function.  Are the statistics you are interested in specific to a given LocationContext?
>> >
>> > On Apr 11, 2011, at 10:19 AM, Ted Kremenek <kremenek at apple.com> wrote:
>> >
>> >> Hi Lei,
>> >>
>> >> LocationContext should not contain any checker-specific data.  It is only intended to model context-sensitivity (i.e., it simulates an abstract stack frame, or "location" in an abstract call chain).
>> >>
>> >> What are you trying to do?
>> >>
>> >> Ted
>> >>
>> >> On Apr 10, 2011, at 8:26 PM, 章磊 wrote:
>> >>
>> >>> Hi Clang,
>> >>>
>> >>> This patch add a DenseMap to record the CheckedReturn count for a functiondecl(fielddecl, vardecl as function pointer) in LocationContext.
>> >>>
>> >>> ImmutableMap seems not alright here, is DenseMap ok?
>> >>>
>> >>> And how to make it checker-specific?
>> >>>
>> >>> This patch is preparation for statistical UncheckedRenturn checker.
>> >>> I'll appreciate it if there are any advice about this patch.
>> >>>
>> >>> --
>> >>> Best regards!
>> >>>
>> >>> Lei Zhang
>> >>> <CRResultMap.patch>_______________________________________________
>> >>> cfe-commits mailing list
>> >>> cfe-commits at cs.uiuc.edu
>> >>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>> >>
>> >>
>> >> _______________________________________________
>> >> cfe-commits mailing list
>> >> cfe-commits at cs.uiuc.edu
>> >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>> 
>> 
>> 
>> 
>> -- 
>> Best regards!
>> 
>> Lei Zhang
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20110414/264455ca/attachment.html>


More information about the cfe-commits mailing list