[clang] [analyzer] Widen loops: Augment assignment note when values invalidated at the start of a loop (PR #122398)

Sat Jan 11 02:47:10 PST 2025

steakhal wrote:

> I agree that it would make more sense to generalize this beyond loop widening. However, I'm having a hard time understanding how your proposal would fit in to the augmentation of the assignment message like I do in this PR.
> 
> The part I'm most confused about is how would `SymbolInvalidationArtifact` be set/retrieved after invalidation occurs? Would this somehow be tied to symbols created as part of invalidation (like created conjured symbols in `InvalidateRegionsWorker::VisitCluster`) which could then be retrieved via `SI.Value.getAsSymbol()` and casting (speaking from the PoV of `showBRDiagnostics`)? I think I would better understand if you could please provide a small example in the context of diagnostic output. I'm fairly certain I'm lacking understanding of some static analysis design concept that's not making this click in my head...
> 
> Regardless, once I have a better understanding of how your proposal fits in, I'd be happy to give your approach a try with your guidance :).

Let me explain.
So far, in the visitor you want to know if the value of something *before* and *after* some invalidation event. In your case you are interested in loop-widening invalidations.
So, once you have all these, you formulate a decision if you should add an extra note in the BRVisitor.

This is nice and all, but there are a couple of problems with it:
 - Tightly couples the specific invalidation event (loop widening) because you had to add a special tag for the node. For other kinds of invalidations we may not have a tag, such as manual invalidations using `ProgramState::invalidateRegions`. This makes piggibacking on an ProgramPoint tag a less elegant solution.
 - You need to do the digging yourself for the values before and after the invalidation.

What I proposed in the linked RFC was to have SymExpr kind for representing the value of a symbol after some invalidation. It would have everything inside what we might need. The value before and after the invalidation (possibly even the states themselves), the kind of the invalidation, such as loop-widening or anything else.

In the visitor, it would simplify your code to:
```c++
if (const auto *Invalidation = dyn_cast_or_null<InvalidationArtifact>(SI.Value.getAsSymbol())) {
  if (isa<LoopWidening>(Invalidation->getCause())) {
    OS << " (loop-widening invalidated this symbol here)";
  }
}
```

To be transparent, what you do makes sense. Given that this would only affect configs explicitly enabling the experimental loop widening, I'm also okay with some less elegant solutions of course.

https://github.com/llvm/llvm-project/pull/122398