[PATCH] D69560: [clang-tidy] Add 'bugprone-easily-swappable-parameters' check

Whisperity via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Feb 8 07:57:55 PST 2021


whisperity added inline comments.


================
Comment at: clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp:177
+                                       const std::size_t StartIndex) {
+  const std::size_t NumParams = FD->getNumParams();
+  assert(StartIndex < NumParams && "out of bounds for start");
----------------
aaron.ballman wrote:
> whisperity wrote:
> > aaron.ballman wrote:
> > > Some interesting test cases to consider: varargs functions and K&R C functions
> > > K&R C functions
> > 
> > Call me too young, but I had to look up what a "K&R C function" is, and I am absolutely baffled how this unholy construct is still supported! **But:** thanks to Clang supporting it properly in the AST, the checker works out of the box!
> > 
> > Given
> > 
> > ```
> > int foo(a, b)
> >   int a;
> >   int b;
> > {
> >   return a + b;
> > }
> > ```
> > 
> > We get the following output:
> > 
> > ```
> > /tmp/knr.c:2:3: warning: 2 adjacent parameters of 'foo' of similar type ('int') are easily swapped by mistake [bugprone-easily-swappable-parameters]
> >   int a;
> >   ^
> > /tmp/knr.c:2:7: note: the first parameter in the range is 'a'
> >   int a;
> >       ^
> > /tmp/knr.c:3:7: note: the last parameter in the range is 'b'
> >   int b;
> >       ^
> > ```
> > 
> > (even the locations are consistent!)
> > 
> > Should I add a test case for this? We could use a specifically C test case either way eventually...
> > 
> > -----
> > 
> > > varargs functions
> > 
> > This is a bit of terminology, but something tells me you meant the //variadic function// here, right? As opposed to type parameter packs.
> > 
> > Given
> > 
> > ```
> > int sum(int ints...) {
> >   return 0;
> > }
> > ```
> > 
> > the AST looks something like this:
> > 
> > ```
> > `-FunctionDecl 0x56372e29e258 <variadic.cpp:1:1, line:3:1> line:1:5 sum 'int (int, ...)'
> >   |-ParmVarDecl 0x56372e29e188 <col:9, col:13> col:13 ints 'int'
> > ```
> > 
> > Should we diagnose this? And if so, how? The variadic part is not represented (at least not at first glance?) in the AST. Understanding the insides of such a function would require either overapproximatic stuff and doing a looot of extra handling, or becoming flow sensitive... and we'd still need to understand all the `va_` standard functions' semantics either way.
> > Call me too young, but I had to look up what a "K&R C function" is, and I am absolutely baffled how this unholy construct is still supported!
> 
> Ah, to be innocent again, how I miss those days. :-D
> 
> > Should I add a test case for this? We could use a specifically C test case either way eventually...
> 
> I think it'd be a useful case, but the one I was specifically more concerned with is:
> ```
> // N.B.: this is C-specific and does not apply to C++.
> void f();
> 
> int main(void) {
>   f(1, 2, 3.4, "this is why we can't have nice things");
> }
> ```
> where the function has no prototype and so is treated as a varargs call.
> 
> > This is a bit of terminology, but something tells me you meant the variadic function here, right? As opposed to type parameter packs.
> 
> Yes, sorry for being unclear, I am talking about variadic functions.
> 
> > Should we diagnose this? And if so, how? The variadic part is not represented (at least not at first glance?) in the AST. Understanding the insides of such a function would require either overapproximatic stuff and doing a looot of extra handling, or becoming flow sensitive... and we'd still need to understand all the va_ standard functions' semantics either way.
> 
> Well, that's what I'm wondering, really. The arguments are certainly easy to swap because the type system can't help the user to identify swaps without further information (like format specifier strings). However, the checking code would be... highly unpleasant, I suspect. My intuition is to say that we don't support functions without prototypes at all (we just silently ignore them) and that we only check the typed parameters in a variadic function declaration (e.g., we'll diagnose `void foo(int i, int j, ...);` because of the sequential `int` parameters, but we won't diagnose `void foo(int i, ...);` even if call sites look like `foo(1, 2);`). WDYT?
It is definitely highly unpleasant, at one point I remember just glancing into the logic behind `printf()` related warnings in Sema and it was... odd, to say the least.

That is how the checker works right now. It will not diagnose `int f() {}` because from the looks of the function definition, it is a 0-parameter function. Same with the variadic `...`, it is a single parameter function (which has a rather //weird// parameter type).

But yes, I think I'll make this explicit in the documentation (and FWIW, the research paper, too). 

> However, the checking code would be... highly unpleasant, I suspect.

Incredibly, because this check purposefully targets the function definition nodes only, like a cruise missile. For that logic, we would need to gather call sites for variadic functions and start doing some more magic with it.

FYI, templates are also handled in a way that only what is clear from definitions (and not instantiations... I think we can kind of call a particular "overload" call to a variadic like a template instantiation, just for the sake of the argument here) only:

```
template <typename T, typename U>
void f(T t1, T t2, U u) {} // [t1, t2] same type, WARN.

void g() { f(1, 2, 3); }} // NO-WARN: Even though "U" in this call is also "int", this is not fixed "easily enough" in the definition, so to prevent more false positives, we shut up.

template <>
void f(int i, int j, int k) {} // [i, k] WARN.
```

Anything that is only known through template instantiations is context-sensitive (the subsequent patch about typedef extends the tests and diagnostics about this), and thus, we forego trying to find the way.

```
template <typename T> struct vector { using value_type = T; };

template <typename T>
void f(T t, typename vector<T>::element_type u) {} // NO-WARN, dependent.

template <>
void f(int i, vector<int>::element_type i2) {} // WARN, can explicitly unfold the typedef and see "int == int".

template <typename T>
void g(typename vector<T>::element_type e1, typename vector<T>::element_type e2) {} // WARN, not dependent.
```


================
Comment at: clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp:177
+                                       const std::size_t StartIndex) {
+  const std::size_t NumParams = FD->getNumParams();
+  assert(StartIndex < NumParams && "out of bounds for start");
----------------
whisperity wrote:
> aaron.ballman wrote:
> > whisperity wrote:
> > > aaron.ballman wrote:
> > > > Some interesting test cases to consider: varargs functions and K&R C functions
> > > > K&R C functions
> > > 
> > > Call me too young, but I had to look up what a "K&R C function" is, and I am absolutely baffled how this unholy construct is still supported! **But:** thanks to Clang supporting it properly in the AST, the checker works out of the box!
> > > 
> > > Given
> > > 
> > > ```
> > > int foo(a, b)
> > >   int a;
> > >   int b;
> > > {
> > >   return a + b;
> > > }
> > > ```
> > > 
> > > We get the following output:
> > > 
> > > ```
> > > /tmp/knr.c:2:3: warning: 2 adjacent parameters of 'foo' of similar type ('int') are easily swapped by mistake [bugprone-easily-swappable-parameters]
> > >   int a;
> > >   ^
> > > /tmp/knr.c:2:7: note: the first parameter in the range is 'a'
> > >   int a;
> > >       ^
> > > /tmp/knr.c:3:7: note: the last parameter in the range is 'b'
> > >   int b;
> > >       ^
> > > ```
> > > 
> > > (even the locations are consistent!)
> > > 
> > > Should I add a test case for this? We could use a specifically C test case either way eventually...
> > > 
> > > -----
> > > 
> > > > varargs functions
> > > 
> > > This is a bit of terminology, but something tells me you meant the //variadic function// here, right? As opposed to type parameter packs.
> > > 
> > > Given
> > > 
> > > ```
> > > int sum(int ints...) {
> > >   return 0;
> > > }
> > > ```
> > > 
> > > the AST looks something like this:
> > > 
> > > ```
> > > `-FunctionDecl 0x56372e29e258 <variadic.cpp:1:1, line:3:1> line:1:5 sum 'int (int, ...)'
> > >   |-ParmVarDecl 0x56372e29e188 <col:9, col:13> col:13 ints 'int'
> > > ```
> > > 
> > > Should we diagnose this? And if so, how? The variadic part is not represented (at least not at first glance?) in the AST. Understanding the insides of such a function would require either overapproximatic stuff and doing a looot of extra handling, or becoming flow sensitive... and we'd still need to understand all the `va_` standard functions' semantics either way.
> > > Call me too young, but I had to look up what a "K&R C function" is, and I am absolutely baffled how this unholy construct is still supported!
> > 
> > Ah, to be innocent again, how I miss those days. :-D
> > 
> > > Should I add a test case for this? We could use a specifically C test case either way eventually...
> > 
> > I think it'd be a useful case, but the one I was specifically more concerned with is:
> > ```
> > // N.B.: this is C-specific and does not apply to C++.
> > void f();
> > 
> > int main(void) {
> >   f(1, 2, 3.4, "this is why we can't have nice things");
> > }
> > ```
> > where the function has no prototype and so is treated as a varargs call.
> > 
> > > This is a bit of terminology, but something tells me you meant the variadic function here, right? As opposed to type parameter packs.
> > 
> > Yes, sorry for being unclear, I am talking about variadic functions.
> > 
> > > Should we diagnose this? And if so, how? The variadic part is not represented (at least not at first glance?) in the AST. Understanding the insides of such a function would require either overapproximatic stuff and doing a looot of extra handling, or becoming flow sensitive... and we'd still need to understand all the va_ standard functions' semantics either way.
> > 
> > Well, that's what I'm wondering, really. The arguments are certainly easy to swap because the type system can't help the user to identify swaps without further information (like format specifier strings). However, the checking code would be... highly unpleasant, I suspect. My intuition is to say that we don't support functions without prototypes at all (we just silently ignore them) and that we only check the typed parameters in a variadic function declaration (e.g., we'll diagnose `void foo(int i, int j, ...);` because of the sequential `int` parameters, but we won't diagnose `void foo(int i, ...);` even if call sites look like `foo(1, 2);`). WDYT?
> It is definitely highly unpleasant, at one point I remember just glancing into the logic behind `printf()` related warnings in Sema and it was... odd, to say the least.
> 
> That is how the checker works right now. It will not diagnose `int f() {}` because from the looks of the function definition, it is a 0-parameter function. Same with the variadic `...`, it is a single parameter function (which has a rather //weird// parameter type).
> 
> But yes, I think I'll make this explicit in the documentation (and FWIW, the research paper, too). 
> 
> > However, the checking code would be... highly unpleasant, I suspect.
> 
> Incredibly, because this check purposefully targets the function definition nodes only, like a cruise missile. For that logic, we would need to gather call sites for variadic functions and start doing some more magic with it.
> 
> FYI, templates are also handled in a way that only what is clear from definitions (and not instantiations... I think we can kind of call a particular "overload" call to a variadic like a template instantiation, just for the sake of the argument here) only:
> 
> ```
> template <typename T, typename U>
> void f(T t1, T t2, U u) {} // [t1, t2] same type, WARN.
> 
> void g() { f(1, 2, 3); }} // NO-WARN: Even though "U" in this call is also "int", this is not fixed "easily enough" in the definition, so to prevent more false positives, we shut up.
> 
> template <>
> void f(int i, int j, int k) {} // [i, k] WARN.
> ```
> 
> Anything that is only known through template instantiations is context-sensitive (the subsequent patch about typedef extends the tests and diagnostics about this), and thus, we forego trying to find the way.
> 
> ```
> template <typename T> struct vector { using value_type = T; };
> 
> template <typename T>
> void f(T t, typename vector<T>::element_type u) {} // NO-WARN, dependent.
> 
> template <>
> void f(int i, vector<int>::element_type i2) {} // WARN, can explicitly unfold the typedef and see "int == int".
> 
> template <typename T>
> void g(typename vector<T>::element_type e1, typename vector<T>::element_type e2) {} // WARN, not dependent.
> ```
I definitely shall create and add a C file with test cases like this, marking them `// NO-WARN`, and explaining the reasoning. If for nothing else, maybe to know in the far future what can be improved upon.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69560/new/

https://reviews.llvm.org/D69560



More information about the cfe-commits mailing list