[PATCH] D20689: [clang-tidy] Add 'readability-suspicious-call-argument' check

Whisperity via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 18 05:38:03 PDT 2021


whisperity added inline comments.


================
Comment at: clang-tools-extra/clang-tidy/readability/SuspiciousCallArgumentCheck.cpp:31-32
+
+  char DissimilarBelow;
+  char SimilarAbove;
+
----------------
aaron.ballman wrote:
> `signed char` since we're doing `> -1` below? Or better yet, `int8_t` because these aren't really characters?
Oh right, I always forget `char` isn't guaranteed. `int8_t` seems like a better idea anyways (until we have `int6_t`...)


================
Comment at: clang-tools-extra/clang-tidy/readability/SuspiciousCallArgumentCheck.cpp:51
+        {true, 40, 50}, // Substring.
+        {true, 50, 66}, // Levenshtein.
+        {true, 75, 85}, // Jaro-Winkler.
----------------
varjujan wrote:
> whisperity wrote:
> > aaron.ballman wrote:
> > > We should probably document where all these numbers come from, but `66` definitely jumps out at me as being a bit strange. :-D
> > Unfortunately, I have absolutely no idea. All these values are percentages between 0 and 100 (`-1` is just saying that //"This heuristic doesn't accept percentages"//), and this is written in the documentation now. However, the answer to //"**why** 66%?"//, unless @varjujan can say something, I think is lost to history...
> > 
> > I'll read over his thesis once again, maybe I can find anything with regards to this.
> > 
> > Either way, I've detailed from both the code and the thesis how the percentages are meant. In some cases, the % is calculated as //"% of the longer string's length"//. In the Leventhstein's case, it's actually inverted:
> > 
> > ```
> > Dist = (1 - Dist / LongerLength) * 100;
> > ```
> > 
> > So what this says is that if the current arg1-param1 arg2-param2 pairing has less than the inverse of 50% (which is more than 50%) of the longer string's edit distance, but the arg2-param1 and arg1-param2 (the suggested swapped order) has more than the inverse of 66% (which is less than 33%), then the swap will be suggested.
> > 
> > Originally these values were called `LowerBound` and `UpperBound`, respectively, which was saying **even less** about what they mean...
> Sadly, I think there isn't any scientific reason behind these numbers. They just looked ok after a couple of test runs. (Maybe they make sense for shorter arg names, like the 66 for 3 char long names.)
In [Rice2017], the inflexion point of the precision/recall plot is at around `0.55`-ish threshold. They express this threshold as the distance of distances, i.e. `0` would mean the (a1, p1) - (a2, p2) pair is good as it is, and `1` would mean that it should **definitely** be (a2, p1) - (a1, p2) instead. 


================
Comment at: clang-tools-extra/clang-tidy/readability/SuspiciousCallArgumentCheck.cpp:347
+// Checks whether ArgType converts implicitly to ParamType.
+static bool areTypesCompatible(QualType ArgType, QualType ParamType,
+                               const ASTContext &Ctx) {
----------------
whisperity wrote:
> aaron.ballman wrote:
> > It seems like we're doing an awful lot of the same work as `ASTContext::typesAreCompatible()` and type compatibility rules are pretty complex, so I worry about this implementation being different than the `ASTContext` implementation. Have you explored whether we can reuse more of the logic from `ASTContext` here, or are they doing fundamentally different kinds of type compatibility checks?
> No, I didn't know that function even existed. This check must be older than that function.
Actually, no, that function is pretty old... However, that function, and all the function it subsequently calls, require a **non-const** `ASTContext`. I have changed `ASTContext`:

```
╰─ git diff --cached --stat
 clang/include/clang/AST/ASTContext.h                                     | 67 ++++++++++++++++++++++++++++++++++++-------------------------------
 clang/lib/AST/ASTContext.cpp                                             | 72 +++++++++++++++++++++++++++++++++++++-----------------------------------
 2 files changed, 73 insertions(+), 66 deletions(-)
```

making related member functions and internal static functions take `const ASTContext &`/`*`.

This, by itself, did not break any of the tests of `check-clang check-clang-unit check-clang-tools check-clang-extra-unit`!


================
Comment at: clang-tools-extra/clang-tidy/readability/SuspiciousCallArgumentCheck.cpp:347-349
+static bool areTypesCompatible(QualType ArgType, QualType ParamType,
+                               const ASTContext &Ctx) {
+  if (ArgType.isNull() || ParamType.isNull())
----------------
whisperity wrote:
> whisperity wrote:
> > aaron.ballman wrote:
> > > It seems like we're doing an awful lot of the same work as `ASTContext::typesAreCompatible()` and type compatibility rules are pretty complex, so I worry about this implementation being different than the `ASTContext` implementation. Have you explored whether we can reuse more of the logic from `ASTContext` here, or are they doing fundamentally different kinds of type compatibility checks?
> > No, I didn't know that function even existed. This check must be older than that function.
> Actually, no, that function is pretty old... However, that function, and all the function it subsequently calls, require a **non-const** `ASTContext`. I have changed `ASTContext`:
> 
> ```
> ╰─ git diff --cached --stat
>  clang/include/clang/AST/ASTContext.h                                     | 67 ++++++++++++++++++++++++++++++++++++-------------------------------
>  clang/lib/AST/ASTContext.cpp                                             | 72 +++++++++++++++++++++++++++++++++++++-----------------------------------
>  2 files changed, 73 insertions(+), 66 deletions(-)
> ```
> 
> making related member functions and internal static functions take `const ASTContext &`/`*`.
> 
> This, by itself, did not break any of the tests of `check-clang check-clang-unit check-clang-tools check-clang-extra-unit`!
@aaron.ballman Changing the function to be 

```
return Ctx.typesAreCompatible(ArgType, ParamType);
```

will make the checker miss the test case about `T`/`const T&` mixup.

```
void value_const_reference(int llllll, const int& kkkkkk);
void const_ref_value_swapped() {
  const int& kkkkkk = 42;
  const int& llllll = 42;
  value_const_reference(kkkkkk, llllll);
  // error: CHECK-MESSAGES: expected string not found in input:
  // warning: 1st argument 'kkkkkk' (passed to 'llllll') looks like it might be swapped with the 2nd, 'llllll' (passed to 'kkkkkk')
}
```

----

Setting `bool CompareUnqualified = true` (3rd argument to `typesAreCompatible`) doesn't help either.
Which is valid from `typesAreCompatible`'s perspective... that function answer the question, applied to the context of the above test: //"Is `llllll = kkkkkk;` valid?"//, which is obviously **false** as both are `const T&`s.


================
Comment at: clang-tools-extra/clang-tidy/readability/SuspiciousCallArgumentCheck.h:38
+  enum class Bound {
+    /// The percentage below which the heuristic deems the strings
+    /// dissimilar.
----------------
aaron.ballman wrote:
> The comment is somewhat confusing because the enumerators have the values 0 and 1, which are valid percentages but not likely what the comment means.
I'll rename it to `BoundKind` and update the comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D20689/new/

https://reviews.llvm.org/D20689



More information about the cfe-commits mailing list