[cfe-dev] Analyzer: Extract region-walking behavior from CallAndMessageChecker

Ted Kremenek kremenek at apple.com
Fri Jun 25 23:27:16 PDT 2010


On Jun 23, 2010, at 6:52 PM, Jordy Rose wrote:

> 
> On Wed, 23 Jun 2010 16:30:29 -0700, Ted Kremenek <kremenek at apple.com>
> wrote:
>> On Jun 23, 2010, at 12:05 AM, Jordy Rose wrote:
>> 
>>> CallAndMessageChecker currently has a simple check to see if any of the
>>> members of a pass-by-value struct are uninitialized. It handles nested
>>> structs, but not structs containing arrays.
>>> 
>>> I propose to extract this region-walking behavior out into a new class,
>>> RegionWalker. This would then be the basis for the current
> pass-by-value
>>> check and a possible new one (see below). My implementation of
>>> RegionWalker
>>> is loosely modeled after RecursiveASTVisitor (see attached patch), and
>>> supports both structs and arrays, and both nested.
>> 
>> Hi Jordy,
>> 
>> I think in principle the addition of RegionWalker is a nice refactoring,
>> but I'm really concerned about the following:
>> 
>> +    const ElementRegion *ER = MemMgr.getElementRegion(EleTy, IndexVal,
> R,
>> Ctx);
>> 
>> This call means that we will create a new ElementRegion for accessing
>> every single element of an array.  This is really expensive.  Not only
> will
>> this be really slow for large constant-sized arrays, it will cause a
> bunch
>> of ElementRegion objects to get generated that will stay around for the
>> entire lifetime of the GRExprEngine object.
>> 
>> As is, I don't think this can go in.
> 
> I was concerned about this too, but couldn't think of another way to
> actually visit every scalar element of a complicated nesting of structs and
> arrays. On the face of it, this could be solved with stack-based
> ElementRegion and FieldRegion objects...apart from being the only such case
> of stack-based regions, and a slightly distasteful idea. *grin*
> 
> But due to your other objections, perhaps it is not a useful finding.
> Perhaps if a pass-by-value struct had /no/ initialized members, it'd be
> worth warning...but again, there, the effort to actually walk all the
> subregions might not be worth it.
> 
> 
>> My high-level question is what is the purpose of this API?  Is to
> iterate
>> over "direct" bindings, or all the values in memory?  The StoreManager
> does
>> support an API to iterate over bindings, but the behavior will be highly
>> dependent on the implementation of the StoreManager.  We really need to
>> design these APIs with algorithmic efficiency in mind. If we need to add
>> more high-level APIs to StoreManager to make the kind of queries we want
> to
>> do more efficient than I prefer we do that.  RegionWalker right now
> defeats
>> all the laziness of RegionStore by instantiation regions on-the-fly. 
> This
>> is going to be a huge performance killer.
> 
> Huh. I had originally thought that RegionStore's subregion map was flat,
> that /any/ subregion showed up under the main one. Which wouldn't allow for
> the nice "field chain" in CallAndMessageChecker. But this isn't the case.
> 
> So perhaps I could rewrite RegionWalker to make use of this, though your
> points make it seem much less useful than I had thought.
> 
> The original motivation was to avoid duplicating code between
> CallAndMessageChecker, and this new const buffer checker idea. None of the
> other checkers yet have the same behavior. I'm feeling cautious about the
> new check as well -- there /are/ cases where a const pointer is never
> dereferenced, and so it doesn't matter if its pointee is completely
> uninitialized.
> 
> Perhaps this should be put on hold in favor of work with a clearer payoff.
> Thanks for pointing out the problems with this.

Yeah, there are a variety of issues here.  To achieve scalability with our symbolic stores, we will need to make more things implicit or lazy.  I've realized through my own experimentation that explicit walking of regions and their bindings, while conceptually simple, doesn't always mesh well with this goal.  This probably means that we need to gradually evolve the APIs for querying the store to be more declarative, and then let the store figure out the most efficient way to execute that query based on its own internal implementation.

I'm mixed enough about the pass-by-value field analysis in CallAndMessageChecker that I'm steadily getting more inclined to remove it.  I don't think that particular feature of the check has found any bugs that I'm aware of, but I have seen plenty of false positives.

Thanks for thinking about this though, and please keep your code on the back burner.  We may decide to still use, albeit in possibly a different form.



More information about the cfe-dev mailing list