[PATCH] D153059: [-Wunsafe-buffer-usage] Group parameter fix-its

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jun 20 18:27:45 PDT 2023


NoQ added inline comments.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:2077
+
   for (const auto &[VD, Fixables] : FixablesForUnsafeVars.byVar) {
     FixItsForVariable[VD] =
----------------
There's a bug in variable naming: `FixablesForUnsafeVars`actually contains fixables for *all* variables. Rashmi correctly renames it in D150489 to reflect its actual behavior. So IIUC you're generating fixits for all variables here, which is probably not what you want.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:2278-2281
+          if (isParameterOf(Impl.first, D))
+            ParmsNeedFix.insert(Impl.first);
+          if (isParameterOf(Impl.second, D))
+            ParmsNeedFix.insert(Impl.second);
----------------
Does this really go in both directions? Shouldn't it be just one direction?

```lang=c++
void foo(int *p) {
  int *q = new int[10];
  p = q;
  q[5] = 7;
}
```
In this case `q` is an unsafe local buffer, but `p` is a (mostly) safe single-object pointer that doesn't need fixing, despite implication from `q` to `p`. Of course this example is somewhat unrealistic (people don't usually overwrite their parameters), but it is also the only situation where the other direction matters.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:2293
+    //      search of connected components.
+    if (!ParmsNeedFix.empty()) {
+      auto First = ParmsNeedFix.begin(), Last = First;
----------------
What about the situation where params aren't seen as unsafe yet, but they're discovered to be unsafe later? Eg.
```
void foo(int *p, int *q) {
  int *p2 = p;
  p2[5] = 7;
  int *q2 = q;
  q2[5] = 7;
}
```
Will we notice that `p` and `q` need to be fixed simultaneously?

---

I suspect that the problem of parameter grouping can't be solved by pre-populating strategy implications between all parameters. It'll either cause you to implicate all parameters (even the ones that will never need fixing), or cause you to miss connections between parameters that do need fixing (as the connection is discovered later).

Connection between parameters needs to be added *after* the implications algorithm has finished. And once it's there (might be already there if I missed something), then this part of code probably becomes unnecessary.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:2374-2376
   FixItsForVariableGroup =
       getFixIts(FixablesForAllVars, NaiveStrategy, D->getASTContext(), D,
                 Tracker, Handler, VariableGroupsMap);
----------------
Note: This is why `FixablesForUnsafeVars` is actually `FixablesForAllVars`.


================
Comment at: clang/test/SemaCXX/warn-unsafe-buffer-usage-fixits-multi-parm-span.cpp:13-15
+   expected-note{{change type of 'p' to 'std::span' to preserve bounds information, and change 'q' and 'a' to 'std::span' to propagate bounds information between them}} \
+   expected-note{{change type of 'q' to 'std::span' to preserve bounds information, and change 'p' and 'a' to 'std::span' to propagate bounds information between them}} \
+   expected-note{{change type of 'a' to 'std::span' to preserve bounds information, and change 'p' and 'q' to 'std::span' to propagate bounds information between them}}
----------------
QoL thing: //"to propagate bounds information between them"// isn't the correct reason in this case. Bounds information doesn't propagate between `p` and `q` and `a`. Each of them carries its own, completely unrelated bounds information.

We need to come up with a different reason. Or it might be better to combine the warnings into one warning here, so that we didn't need a reason:
```
warning: 'p', 'q' and 'a' are unsafe pointers used for buffer access
note: change type of 'p', 'q' and 'a' to 'std::span' to preserve bounds information
```

This gets a bit worse when parameters are implicated indirectly, but it still mostly works fine, for example:
```
void foo(int *p, int *q) {
  int *p2 = p;
  p2[5] = 7;
  int *q2 = q;
  q2[5] = 7;
}
```
would produce:
```
warning: 'p2' and 'q2' are unsafe pointers used for buffer access
note: change type of 'p2' and 'q2' to 'std::span' to preserve bounds information, and
      change 'p' and 'q' to 'std::span' to propagate bounds information between them
```

---

In any case, this shows that the relationship between parameters (grouped together simply for being parameters) is very different from the relationship within groups of variables implicated by assignments (even if two or more of them are parameters). All the way down to the warning/note printer, we probably need to continue giving the parameter relationship a special treatment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153059/new/

https://reviews.llvm.org/D153059



More information about the cfe-commits mailing list