[PATCH] D139737: [-Wunsafe-buffer-usage] Initiate Fix-it generation for local variable declarations

Artem Dergachev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 24 16:07:45 PST 2023


NoQ added a comment.

Awesome, we're getting closer to producing our first fix!

I think the most important thing right now is to add a lot of `SourceLocation::isMacroID()` checks everywhere, so that to *never* try to fix code that's fully or partially expanded from macros. I suspect that we can do that "after the fact": just scan the list of emitted fixits for any source ranges that start or end in macros, and if even one such range is found, abandon the fix. Might be worth double-checking that the compiler doesn't already do that automatically for all emitted fixits.

I see that the patch doesn't touch `handleFixableVariable()`. Do we still attach fixits to the warning, or did we already change it to be attached to a note associated with the warning? Because fixits with placeholders aren't allowed on warnings, we should make sure it's attached to note before landing this patch.



================
Comment at: clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h:42
+  /// Returns the text indicating that the user needs to provide input there:
+  static std::string
+  getUserFillPlaceHolder(const StringRef &HintTextToUser = "placeholder") {
----------------
Do we really want this method in the public interface? If the idea is to let people the user override it, maybe let's actually allow that by dropping `static` and adding `virtual`? I think this is actually a good idea, maybe if somebody uses our analysis outside of the `clang` binary, they may want a different placeholder.


================
Comment at: clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h:43
+  static std::string
+  getUserFillPlaceHolder(const StringRef &HintTextToUser = "placeholder") {
+    std::string s = std::string("<# ");
----------------
`const StringRef &` seems redundant given that `StringRef` is already a "Ref". Just pass by value.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:408
+    auto BaseIsArrayOrPtrDRE =
+        hasBase(ignoringParenImpCasts(allOf(declRefExpr(), ArrayOrPtr)));
+    auto Target =
----------------
A bit more concise.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:637
 
-static Strategy
-getNaiveStrategy(const llvm::SmallVectorImpl<const VarDecl *> &UnsafeVars) {
----------------
Hmm, did this need to be moved? I don't think you're calling this function from the new code.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:726
+template <typename NodeTy>
+static std::optional<SourceLocation> getPassedLoc(const NodeTy *Node,
+                                                  const SourceManager &SM,
----------------
(https://www.oxfordinternationalenglish.com/passed-vs-past-whats-the-difference/, looks like "Passed` is always a verb)


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:768
+    if (const Expr *Ext = CxxNew->getArraySize().value_or(nullptr)) {
+      if (!Ext->HasSideEffects(Ctx))
+        ExtentText =
----------------
Excellent observation!


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:771
+            getExprTextOr(Ext, SM, LangOpts, StringRef(DefaultExtentTxt));
+    } else if (!CxxNew->isArray())
+      ExtentText = One;
----------------
I wonder how do we end up in such situation. If it's not array new, this means it's also not a buffer of multiple elements. But I guess the pointer can be reassigned later. Yeah I think you're right, this is the correct solution. Maybe let's leave a comment explaining how this could happen?


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:773-774
+      ExtentText = One;
+  } else if (auto CArrTy = dyn_cast<ConstantArrayType>(
+                 Init->IgnoreImpCasts()->getType().getTypePtr())) {
+    // In cases `Init` is of an array type after stripping off implicit casts,
----------------
That's another case where `.getTypePtr()` is unnecessary: `dyn_cast<T>(QT.getTypePtr())` is equivalent to`QT->getAs<T>()` where `->` is `QualType`'s overloaded operator and `getAs` is `Type::getAs`.

Additionally, array types are special - see doxygen comments for `ASTContext::getAsArrayType()`. For some reason you're suppposed to consult ASTContext when casting them, which I did in the suggested fix. I'm not sure if it matters in this case when we just want to extract the size.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:780-781
+  } else {
+    // In cases `Init` is of the form `&Var` after stripping of implicit
+    // casts, where `&` is the built-in operator, the extent is 1.
+    if (auto AddrOfExpr = dyn_cast<UnaryOperator>(Init->IgnoreImpCasts()))
----------------
```lang=c
int x = 1;
char *ptr = &x; // std::span<char> ptr { &x, 4 };
```
This is valid code. I suspect we want to check types as well, to see that type sizes match.

Most of the time code like this violates strict aliasing, but `char` is exceptional, and even if it did violate strict aliasing, people can compile with `-fno-strict-aliasing` to define away the UB, so we have to respect that.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:818
+static FixItList fixVarDeclWithSpan(const VarDecl *D, const ASTContext &Ctx) {
+  const QualType &T = D->getType().getDesugaredType(Ctx);
+  const QualType &SpanEltT = T->getPointeeType();
----------------
`getDesugaredType()` looks redundant, `getPointeeType()` probably already does that.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:856-858
+  assert(DS && "Fixing non-local variables not implemented yet!");
+  assert(DS->getSingleDecl() == VD &&
+         "Fixing non-single declarations not implemented yet!");
----------------
I suspect this code will crash whenever such declarations are encountered, so this is probably not something we want to commit.

Typically `assert(..., "not implemented yet")` makes sense only when the crashing codepath is also unreachable because no other facility in the compiler exercises it.

It probably make sense to `return {}` in these cases: the function is still responsible for them, we just haven't implemented them yet.

Let's also add a FIXME test to demonstrate that there's no fix yet in such situations.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:892
+    // If we fail to produce Fix-It for the declaration we have to skip the variable entirely.
+    if (FixItsForVariable[VD].empty()) {
+      FixItsForVariable.erase(VD);
----------------
Interesting, so you don't need to discriminate between situation "fix is impossible" and situation "fix is not necessary" because the latter is impossible because the variable type definitely needs fixing? Which is different from the behavior of `FixableGadget::getFixits()` that returns an empty optional when the fix is impossible and a non-empty optional with zero fixes when the fix is not necessary. I guess let's leave a comment at `fixVariable()` documenting this behavior?

I also suspect that eventually we'll want variable declarations to simply act as another fixable gadget. In this case we might have to undo this solution and go back to optionals.


================
Comment at: clang/test/SemaCXX/warn-unsafe-buffer-usage-fixits-local-var-span.cpp:55
+  int *p = new int[n];
+  // If the extent expression does not have a constant value, we cannot fill the extent for users...
+  int *q = new int[n++];
----------------
In the `new int[n]` case it also technically doesn't have a constant value. The code correctly checks for side effects, the comment should also probably say something about side effects.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139737/new/

https://reviews.llvm.org/D139737



More information about the cfe-commits mailing list