[clang] [analyzer] Avoid use of `CallEvent`s with obsolete state (PR #160707)

Artem Dergachev via cfe-commits cfe-commits at lists.llvm.org
Sat Oct 4 16:50:34 PDT 2025


haoNoQ wrote:

> (Tangentially related remark: it's a bit annoying that `ProgramState::getSVal` has overloads that get SVals from very different sources -- reading code that uses them would be easier if they had different names.)

Yes I completely agree 😅 The dream of "hey just give me, like, some value" hasn't quite materialized. It's way more complicated than that.

> `getState()->getStateManager().getContext()`

This is indeed a bit silly but legal. They could probably also do a `LocationContext->getAnalysisDeclContext()->getASTContext()` instead. You can probably give them a direct accessor method for `ASTContext`.

> The method `CXXInstanceCall::getDeclForDynamicType()` calls `getDynamicTypeInfo(getState(), R)` where `R = getCXXThisVal().getAsRegion()`. This seems to be a suspicious call where an obsolete state could perhaps cause problems (although I'm not sure about this -- I just presume that the dynamic type info can change dynamically). If needed, we can modify this method to take the State as an explicit parameter.

Yes, the dynamic type can change over time. It usually acts as a constraint, i.e. "what we know about the runtime type behind the pointer". Our knowledge can improve over time but it can't evolve in contradictory ways. (At least not until the program performs a placement-new or something of that nature.)

So this is one of the payoffs for tracking dynamic types. It helps us resolve virtual calls as part of `getRuntimeDefinition()`. The thing about `getRuntimeDefinition()` is that it's a complex piece of imperative code that arguably needs to be invoked exactly once per function call, by the engine, and then the value should probably be simply stored somewhere and communicated through other means. Eg., the chosen `LocationContext` is a good way to keep track of it.

Given that it'd be invoked exactly once and very carefully, the question of multiple out-of-sync states hopefully doesn't come up. Maybe it should be moved out of `CallEvent` to somewhere else, have the virtual part of it minimized somehow, so that it only extracted the relevant bits and pieces of information from the sub-class but didn't try to load stuff from the Store? And then the rest of the implementation could use `CallEvent` through its public interface?

> Similar use of dynamic type info also happens in `ObjCMethodCall::getRuntimeDefinition()`

This one's a bit worse because ObjC `self`, unlike `this`, is an actual variable. It can be, and often is, overwritten in the middle of the method. Like for example in a constructor they often reassign `self` to the return value of the manually-invoked superclass constructor, and then they check `self` for null to see if the superclass constructor failed. If it failed, they sometimes try to construct a completely different object and put it back into `self` and then proceed with the subclass constructor normally. So in this case there are two points of failure: the contents of the variable `self` and the dynamic type information may both mutate.

But the rough idea is probably still the same: it should be a one-off thing that lives its own life somewhere in the engine, outside of our little utility class.

> The method `CallEvent::invalidateRegions(unsigned BlockCount, ProgramStateRef Orig)` starts with `Result = (Orig ? Orig : getState())`

Same story as `getRuntimeDefinition()` imho. I think this one needs to be invoked exactly once per function call. It's virtual because it picks up additional regions and/or per-region invalidation modes from subclasses. Maybe that's the only thing `CallEvent` should be doing, and the rest should be done by someone else.

The slightly annoying part with this one is that it also needs to be invoked by some of the `evalCall` checkers when they're giving up on modeling the call precisely and need to model it conservatively instead. Maybe they should be given a different API that does exactly that and comes with a guarantee that it's exactly equivalent to conservative evaluation.

> The method `ObjCMethodCall::getExtraInvalidatedValues` calls `getState()->getLValue(PropIvar, getReceiverSVal())` which looks up a value from the region store IIUC.

Yeah `getReceiverSVal()`/`getSelfSVal()` have the same problem as the other ObjC bullet point: the `self` value may change over time.  The `getLValue()` method is of course non-rotting. But yeah, this is just part of the `invalidateRegions()` thing.

https://github.com/llvm/llvm-project/pull/160707


More information about the cfe-commits mailing list