[clang] [analyzer] Limit Store by region-store-binding-limit (PR #127602)

DonĂ¡t Nagy via cfe-commits cfe-commits at lists.llvm.org
Tue Feb 18 08:29:31 PST 2025


https://github.com/NagyDonat commented:

I'm really happy that you managed to track down and fix these slow corner cases, and overall I like the code changes, but I there were a few places where the code was difficult to understand for me (although this may be also related to the fact that I didn't sleep enough last night... :wink:).

I added a few inline comments to mark some confusing parts. Moreover, I also felt that it would be possible to implement the exhaustion "event" more elegantly with a different approach -- and I'll discuss in the rest of this "main" review comment.

If I understand correctly, your approach does the following steps:
1. The constructor of `RegionManagerStore` (usually) initializes `RegionStoreMaxBindingFanOut`  to `getOptions().RegionStoreMaxBindingFanOut + 1` -- I think this off-by-one difference between identically named variables is a serious drawback of your approach.
2. `RegionManagerStore::getRegionBindings()` uses this adjusted fan-out value to construct the `RegionBindingsRef` object and initialize its field `BindingsLeft`.
3. `BindingsLeft` is decremented by the `RegionBindingsRef::add` method (which adds _a cluster_).
   - if  `BindingsLeft` is already zero, it's not decremented (so it won't overflow)
     - is this branch actually used?
     - if not, then it should be replaced by an assertion
   - there is an exceptional case where `removeSubRegionBindings` uses the 3-argument form of the `add` method to avoid decrementing the `BindingsLeft` counter.
4. If `BindingsLeft == 1`, then `RegionBindingsRef::addBinding` hijacks the call and adds a default binding instead of the one that's specified by its argument.
5. If `BindingsLeft == 0`, then (as far as I see) early return branches with `if (B.hasExhaustedBindingLimit())` stop most (all?) code paths that would add bindings or clusters.
6. On these early return branches `escapeValue()` or `escapeValues()` calls ensure that the values that we ignored are marked as escaped.

 In this model there are three phases of the binding creation process:
 1. `BindingsLeft > 1`, bindings are created normally
 2. `BindingsLeft == 1`, intermediate state:
     - `add` works, can add clusters and can decrement `BindingsLeft` to zero
       - if `BindingsLeft` is decremented to zero here, then I don't think that the "add a default binding" step happens
     - `addBindings` adds a default binding (which decrements `BindingsLeft`) and records that the `SVal` specified in its argument has escaped 
 3. `BindingsLeft == 0`, exhausted state:
     - `hasExhaustedBindingLimit()` early returns prevent most (all?) addition of bindings
     -  but e.g. the three-argument `add` within `removeSubRegionBindings` still works
     - and perhaps there are some other code paths...
(I spent hours on trying to understand this, but I'm still not completely confident that I got every corner case right.) I didn't find anything that is outright buggy, but sometimes I felt that the code tries to preserve an invariant (e.g. "default binding is always added just before exhaustion"), but there is another branch where I don't see why is it preserved.

To simplify the picture a bit, I'd suggest the following alternative approach:
1. The constructor of `RegionManagerStore` always initializes `RegionStoreMaxBindingFanOut`  to `getOptions().RegionStoreMaxBindingFanOut` -- to avoid discrepancies and confusion between them.
2. If this fan-out value is nonzero, then `RegionManagerStore::getRegionBindings()` uses it directly to initialize `RegionBindingsRef::BindingsLeft` (i.e. it does not add one).
    - If the fan-out value is zero (which means unlimited), `BindingsLeft` may be initialized to `-1u`.
    - Alternatively, perhaps it would be more modern (but slightly more verbose) to say that `BindingsLeft` is an `std::optional<unsigned>` and it's initialized to `std:nullopt` when there is no limit (i.e. when using `getRegionBindingsWithUnboundedLimit` or when `RegionStoreMaxBindingFanOut == 0`).
3. **(unchanged)** `BindingsLeft` is decremented by the `RegionBindingsRef::add` method (which adds _a cluster_).
   - if  `BindingsLeft` is already zero, it's not decremented (so it won't overflow)
     - is this branch actually used? -- if not, then it should be replaced by an assertion
   - there is an exceptional case where `removeSubRegionBindings` uses the 3-argument form of the `add` method to avoid decrementing the `BindingsLeft` counter.
4. _There is no highjacking logic and `BindingsLeft == 1` has no special meaning._
5. **(unchanged)** If `BindingsLeft == 0`, then `B.hasExhaustedBindingLimit()` is true, which triggers early return on most (all?) code paths that would add bindings or clusters.
6. `escapeValue` and `escapeValues` are replaced by more general functions that
    - still mark the "leftover" values as escaped
    - but also add a default binding (with unknown value) to mark that the `RecordBindingsRef` stores incomplete information.

WDYT about this alternative architecture?

https://github.com/llvm/llvm-project/pull/127602


More information about the cfe-commits mailing list