[clang] [clang-tools-extra] RFC: [clang-tidy] [analyzer] Nondeterministic pointer usage improvements (PR #110471)

Mon Sep 30 06:48:00 PDT 2024

https://github.com/steakhal commented:

I wish we had something like this. It might be possible, but really really challenging to implement.
To me the problem is that the iteration or lookups are not necessarily bad. And to determine if it's bad one needs to understand how the result of the lookup or the individual elements of the iteration are used.

For example, this is completely fine and efficient:
```c++
int numInterestingExprs = 0; // TODO: Use accumulate.
for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) ++numInterestingExprs;
}
```
But this is not fine, because we apply a sideffect that is dependent on the nondeterministic iteration order. In some context, this could be still fine btw. It depends on what your constraints are:
```c++
for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) llvm::errs() << E->printPretty() << "\n"; // eeeh, the lines may get reshuffled from run-to-run...
}
```

Only looking at the for loop here is not enough to decide this.

---

Limiting the check to only trigger if it sees the iteration AND also the use-site - like in the case you match std algorithms, like `is_sorted` or `partition` is a much more careful approach. It should work unless the algorithm takes a callback/functor that we can't really reason about and we could end up with a FP.

Remember that `begin` might be an ADL free-function, and not a memer function, such as for `int *arr[10]`. C++ also has `std::ranges::*`, which we may wanna cover too in the future.

---

You mentioned that you evaluated the checker at scale. What was the result of the evaluation? I assume you have seen FPs, and TPs as well.

https://github.com/llvm/llvm-project/pull/110471