[PATCH] D143971: [clang-tidy] Flag more buggy string constructor cases

Chris Cotter via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat Mar 25 14:39:37 PDT 2023


ccotter marked an inline comment as done.
ccotter added inline comments.


================
Comment at: clang-tools-extra/clang-tidy/bugprone/StringConstructorCheck.cpp:70-74
+  const auto CharExpr = expr(anyOf(
+      ignoringParenImpCasts(characterLiteral()),
+      declRefExpr(hasDeclaration(varDecl(hasType(qualType(isAnyCharacter()))))),
+      ignoringParens(callExpr(
+          callee(functionDecl(returns(qualType(isAnyCharacter()))))))));
----------------
PiotrZSL wrote:
> add a testcase with:
> ```
> char& chRef = ch;
> std::string swapped2(chRef, i);
> 
> char* chPtr = &ch;
> std::string swapped2(*chPtr, i);
> 
> using Character = char;
> Character  c;
> std::string swapped2(c, i);
> ```
> 
> I fear that it may not match those, because here you are trying to match specific use cases, when you should simply match a type of expression. In such case this should be just:
> ```
> const auto CharExpr = expr(hasType(qualType(hasCanonicalType(anyOf(isAnyCharacter(), references(isAnyCharacter()))))));
> ```
> 
> Only issue would be implicit cast, you can try deal with them by using ignoringImplicit, or in better way just by using:
> 
> ```
> traverse(TK_IgnoreUnlessSpelledInSource,  hasArgument(0, CharExpr.bind("swapped-parameter"))),
> ```
> TK_IgnoreUnlessSpelledInSource will cut of all implicit operations, so you could check if called constructor is actually the wrong one, and check argument types. Because now macher in line 125 may not always work if somehow we would have for example 2 implicit casts, for example from char reference to char, or some other crap.
> 
> It's up to you, but consider more generic approach instead of matching specific use cases like references to variables, or calls.
> 
Thinking about this more, I think we have to keep the "args swapped" case very narrow to the existing patterns of the first arg being a char literal, and the second being an integer literal. Consider

```
char buffer[] = "abcdef";
char& chRef = ch;
std::string not_swapped(chRef, 4); // Forgot '&' in front of 'chRef'

std::string also_not_swapped(buffer[2], 2); // Forgot '&' in front of 'buffer[2]'
```

Neither of these are swapped arg bugs, so I don't think we can have the tool consider this a swapped arg bug and apply fixits. The `also_not_swapped` is the exact bug I wrote a few weeks ago.

I think there are two types of bugs the tool has to be aware of with respect to the string fill constructor
 1. swapped args
 2. "confusing" args - where the arguments are both implicitly cast from int to char and vice versa.

For swapped args, the existing logic will match when the first arg has `isInteger()` type, and the second arg is `characterLiteral()`. At most, I think we can only expand the logic to include constructor exprs when the second arg is a declRefExpr to a `char`, but not char ref (see the `not_swapped` example above), or when the second arg is the result of a function call that returns a char. In these cases, we can be sure that the the author did not mean to put a `&` in front of the second argument expression.

For "confusing" args, which is not implemented in the existing logic, but I am proposing in this change, this is the more general case where I think the logic should match any expression as long as both argument expressions are implicitly cast. This is "confusing" because int-to-char and char-to-int conversions are not obvious to the reader, and it should be rare for both conversions to take place when calling the fill constructor (confirmed anecdotally in the codebases I've checked). The tool could not offer fixits here since the fix might be to swap the args, or to add a `&` in front of the first arg (or even another change). At most, we can warn and alert the user to the problem. If this is what the author intended, they can use explicit casts to make the intent obvious to the reader.


================
Comment at: clang-tools-extra/clang-tidy/bugprone/StringConstructorCheck.cpp:95
+  const auto NonCharacterInteger =
+      qualType(isInteger(), unless(isAnyCharacter()));
+  const auto CharToIntCastExpr = implicitCastExpr(
----------------
PiotrZSL wrote:
> what about references to integer, or typedefs to integers ?
I'll add a test for reference to integer, but I'm matching on the type of the source expression that is being cast, which will be of integer type, even if the expression is `string str(x, y)` and `y` is an `int&` (the source expression in the cast will be of type `int`).


================
Comment at: clang-tools-extra/clang-tidy/bugprone/StringConstructorCheck.cpp:213
     }
+  } else if (const auto *E =
+                 Result.Nodes.getNodeAs<Expr>("implicit-cast-both-args")) {
----------------
PiotrZSL wrote:
> i thing that this case should be matched by "swapped-parameter", no need to add extra one, just first one should be fixed.
See earlier explanation on why we can't treat these all as swapped parameters cases.


================
Comment at: clang-tools-extra/docs/clang-tidy/checks/bugprone/string-constructor.rst:25
+  std::string str(buf[1], 5); // First arg should be '&buf[1]'?
+  std::string str2((int)buf[1], 5); // Ok - explicitly cast to express intent
 
----------------
carlosgalvezp wrote:
> carlosgalvezp wrote:
> > Should we use C++ casts?
> Should this be a char?
I'm not sure what you mean here - would you mind clarifying? Do you mean should this example cast the second arg '5' to char?


================
Comment at: clang-tools-extra/test/clang-tidy/checkers/bugprone/string-constructor.cpp:141
+  std::string s4((int)kText[1], i);
+  std::string s5(kText[1], (char)i);
 
----------------
carlosgalvezp wrote:
> Shouldn't this fail, since the constructor `std::string(char, char)` does not exist?
The new logic I wrote flags when both parameters are implicitly cast from int to char for the first arg, and char to int in the second argument. This specific situation is what I found to be most confusing and likely bug prone. I didn't include the case where only one argument is implicitly cast to be buggy, since I wanted to be more conservative in how many more expressions to flag as being buggy, and two implicit casts (which is a bug I wrote recently) is "more" bug prone than just one. We could expand it however - would you like to explore that option?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143971/new/

https://reviews.llvm.org/D143971



More information about the cfe-commits mailing list