[clang] [analyzer] Avoid use of `CallEvent`s with obsolete state (PR #160707)

Wed Oct 1 06:35:13 PDT 2025

NagyDonat wrote:

Thanks for all the valuable feedback!

> We should separate out `State` from `CallEvent` or have some layer that would ensure that the State attached with a Call is always up to date.

@steakhal In an ideal world I agree with this, but I fear that both suggestions have difficulties:
- There are lots of methods that rely on the `State` within the `CallEvent` – even seemingly innocent methods like `SimpleFunctionCall::getDecl()` can depend on the state (when a function is called through a pointer). Before checking this I thought that it would be nice to separate out `State` from the `CallEvent`, but based on this I fear that this wouldn't be practical.
- Ensuring that the Call always refers to the most recent state is practically impossible, because the "most recent state" is only tracked informally: the developer knows which state variable is the most recent, but this is not represented in a machine-readable way. (In some functional languages there are so called "uniqueness types" or the [`State` monad](https://hackage.haskell.org/package/mtl-2.3.1/docs/Control-Monad-State-Lazy.html) which can express invariants like "always use the most recent state" but C++ doesn't provide a suitable abstraction for this goal.)

Based on this right now I feel that the best approach would be gradually improving the situation by smaller changes.

> > I will create a test with a new debug checker which modifies the state in its `PreCall` callback and validates that the modification is visible through the `CallEvent` objects received by the `EvalCall` and `PointerEscape` callbacks. Is this sufficient, or would you prefer a different kind of test?
> 
> I think the best way to go about this is a unittest with a custom bug report visitor that prints a special trait along the path. Like a logger. And the visitor would print these values alongside the ProgramPoint it is attached to.

I don't understand how could a bug reporter visitor help in this situation. The bug (which is fixed by this PR) is that checker callbacks receive `CallEvent` objects whose attached state is old (and differs from the state available through the `CheckerContext` which is also passed to the same callback). Testing this requires two  components:
- Checker callbacks that validate that the `CallEvent` and the `CheckerContext` have (references to) the same state.
- Checker callbacks that modify the state (at suitable points e.g. the PreCall callback) to ensure that states that _may_ be different are actually different. It wouldn't be too difficult to find "real" checkers that cause suitable state changes, but to make the test simple and self-contained I'm leaning towards a "synthetic" state change from a debug checker (the same one that also does the validation).
A bug report visitor doesn't help with either of these components, because it cannot observe the state attached to the `CallEvent` (it is just passed to the checker callback, but not persisted in the `ExplodedGraph` for later use by e.g. visitors) and it (obviously) cannot update the state.  

> From this perspective, yeah, it may be slightly useful to keep the original pre-call state around. Say, if you're modeling a function `void foo(int *arg)` that effectively does something like `if (*arg == 0) *arg = 1;`, and your checker callback is sufficiently complex, then at some point you may need to ask "Uhh can you please remind me what the original value was?" - and then `CallEvent` comes in helpful. But I don't think this minor benefit outweighs the confusion caused by calling that very specific program state "the state". I think it's much easier to handle that by deliberately remembering the original state in a local variable, and then pass that state around. It's probably not a popular situation in the first place.

I didn't even think about this usecase of the "archived" `State` attached to the call event. I agree that this could be useful within checker code, and I also agree that this is probably a rare situation and should be handled by explicitly saving the original state in a local variable. (Note: this "state attached to call becomes obsolete within the checker" situation is distinct from the "checker callback receives a CallEvent with a state that is already obsolete at the beginning of the call" bug which is fixed by this PR.)

> First, yeah, like you said, we could have a static or dynamical typestate that indicates whether `getReturnValue()` makes sense. (It only ever makes sense in `checkPostCall`.)

I support this idea -- writing a dynamic check would be trivial, and even a static check is not too difficult.

> Then, it might be a good idea to simply make `CallEvent.getState()` _private_. It wasn't our goal to provide convenient access to the entire state. We just wanted a nice `getArgSVal()` and such. Maybe let's provide only the actually-useful, non-confusing methods?

Making `CallEvent.getState()` private would be a very easy "harm prevention" change that would reduce the damage potential of this footgun. A quick search reveals that outside of `CallEvent.{cpp,h}` this method is only used twice, and in both of those locations it would be very simple to get the state from the readily available `CheckerContext` instead. (This is very fortunate -- I would have guessed that there is more of this...) In fact I'll soon create a trivial separate PR for this proposal, because it's a trivial change and I don't see any downside.

Unfortunately this is just "sweeping the problems under the rug" and not a complete solution -- even if we cannot directly access the obsolete old state object, the analyzer may still run into logically wrong behavior if methods like `getArgSVal` produce obsolete results from the old state. There are probably situations where the old state is not _too_ old and the values that are looked up from it happen to be correct (e.g. a PreCall callback can't rebind the value of an expression in the `Environment`), but reasoning about the partial correctness of old states is not a safe long-term solution.

> Alternatively, it may be a good idea to implement these methods in `CheckerContext` itself, and then stop showing `CallEvent` to the user. So instead of `CallEvent.getArgSVal()` you'd need to type `CheckerContext.getCallArgSVal()`. Unfortunately this disconnects the user from the entire polymorphism of the `CallEvent` class. There's actually a lot of methods that would need to be reimplemented (like `AnyCXXConstructorCall.getCXXThisVal()` or `CXXAllocatorCall.getArraySizeVal()) and most of them wouldn't make sense most of the time.

WDYT about a change that keeps the `CallEvent` type hierarchy but ensures that `CallEvent`s can only be constructed by the `CheckerContext`? I can envision a situation where:
- There are "empty"/"stateless"/"template" `CallEvent` objects that are passed around in the engine e.g. in `ExprEngine::evalCall`.
- The constructor of `CheckerContext` can (optionally) own a `CallEvent` (of this empty stateless kind).
- Checkers can use a new method called `CheckerContext::getCallEvent()` which combines the stateless call event with the state of the `CheckerContext` to return a "regular" `CallEvent`.

> I think the assertion I propose is the easiest way to test this patch. I.e., the fact that the newly added assertion used to crash without the rest of the patch but doesn't crash anymore, is a sufficient defense against regressions in and of itself.

@haoNoQ Are you speaking about the assertion to validate that `CheckerContext.getState() == CallEvent.getState()`? I agree that this assertion enforces an invariant that should be held, but unfortunately I don't see a "natural place" for it.
- Placing this on the engine side of the engine/checker boundary (just before calling the callback) seems natural, but that's very close to the location where the `CheckerContext` and the (updated) `CallEvent` are created -- so we would just write a tautologically passing check that almost looks like `foo = bar; assert(foo == bar);`.
- Placing this assertion into every checker callback (that takes a `CallEvent`) would be too much "pollution" IMO.
- Finally, we could place this assertion in a debug checker that is designed for this purpose and activated in a test that ensures that its callbacks are triggered (with frequent state changes to reveal the use of obsolete states). This is essentially equivalent to the testing approach that I suggested.

https://github.com/llvm/llvm-project/pull/160707