[PATCH] D69560: [clang-tidy] Add 'bugprone-easily-swappable-parameters' check

Mon Jun 28 12:29:03 PDT 2021

alexfh added a comment.

I thought I sent this batch of comments loong ago =\

Better late than never.

================
Comment at: clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp:113-120
+#define FB(Name, K) MIX_##Name = (1ull << (K##ull - 1ull))
+
+  FB(None, 1),      //< Mix between the two parameters is not possible.
+  FB(Trivial, 2),   //< The two mix trivially, and are the exact same type.
+  FB(Canonical, 3), //< The two mix because the types refer to the same
+                    // CanonicalType, but we do not elaborate as to how.
+
----------------
whisperity wrote:
> alexfh wrote:
> > I'd switch to scoped enums (`enum class`), remove the macro, and use simpler code like `None = 0x1, Trivial = 0x2, Canonical = 0x4`, etc.
> `enum class` is a good idea!
> 
> Unfortunately, I'm not really sold on the hexa literals... It makes the code confusing the read, because you have to keep in mind which range the hexa literals mean, and when there is a shift in position, e.g. if you read `0x20` you have to count out that okay, that's 2nd position, so at least 16, so at least the 4th bit, but it is two, so you shift it again to the right, so in the bit pattern it turns on the 5th bit actually.
> It makes it very easy to misread or misinterpret things, especially when trying to work on the source code.
> (In the last extension patch, the number of elements in the bit mask go up to the 8th or the 9th bit, I don't remember exactly at the top of my head.)
> 
> Compared to this, the current text when read immediately tells you which "decimal" position you should expect the appropriate bit to be toggled on, if the flag is there.
> 
> Binary literals, on the other hand, would be amazing to use here, sadly we're only in C++14.
First of all, do we ever need to know the position of the set bit in these flag enumerators? Isn't it enough that each enumerator has a distinct bit set? If so, there's not much complexity in reading and expanding sequences like `0x0001, 0x0002, 0x0004, 0x0008, 0x0010, 0x0020, 0x0040, 0x0080, 0x0100, 0x0200, ...`.
However, if that's not to your taste, there's another frequently used style of defining flag enumerations in LLVM: `1 << 0, 1 << 1, 1 << 2, 1 << 3, ...`.

Either of these is fine, but using macros for this purpose is not necessary and doesn't make the code any easier to read or to modify.

> Compared to this, the current text when read immediately tells you which "decimal" position you should expect the appropriate bit to be toggled on, if the flag is there.
Do I understand correctly that you need this exclusively for the purpose of reading debug logs? If so, it seems to be better to decode the bits before logging them. See the comment on line 87 above.

================
Comment at: clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp:133
+
+  void sanitize() {
+    assert(Flags != MIX_Invalid && "sanitize() called on invalid bitvec");
----------------
whisperity wrote:
> alexfh wrote:
> > What should this method do and how it should be used? Same for Mix::sanitize()
> Once there are more flags in the bitmask (not just "None" and "Trivial"), this method deals with sanitising the flags that are toggled on and off by the recursive descent on the type tree.
> 
> For example, if there is a typedef involved (see D95736), it will toggle on the typedef flag, and start investigating the aliased type. This subsequent investigation might figure out that the aliased type is trivial, or it is distinct. In case it is distinct, that level of the recursion will turn on the "Not mixable" flag. In this case, when the investigation returns completely, we will have "Typedef" and "None" both turned on. This method will throw away everything but the "None" bit, so when the data is returned to the diagnostic-generating part of the check, it will know what to do.
Thanks for the explanation! But it would be very helpful to have this information in the code, not only in a code review comment.

================
Comment at: clang-tools-extra/test/clang-tidy/checkers/bugprone-easily-swappable-parameters-len2.cpp:129
+
+void qualified1(int I, const int CI) {} // NO-WARN: Not the same type.
+
----------------
whisperity wrote:
> alexfh wrote:
> > Does the top-level const change anything with regard to the "mixability"? It's a purely implementation-side difference, callers could still swap parameters.
> Unfortunately, coding styles differ whether people mark local variables const or not, across projects. Even with pointers: in some cases, a swapped call to `memcpy` might be caught if the user made one of the pointers passed already `const void*` instead of just `void*`.
> 
> D95736 deals with implementing the logic for this, it is exposed as a user-configurable option whether they want to consider const stuff the same as non-const stuff.
The `const` in `const void *` is not top-level, it applies to the pointed at type. A top-level const here would be `const void * const` (a constant pointer to a constant void) or `void * const` - a const pointer to a non-const void. Now getting back to the original question. Consider this function:
```
void qualified1(int I, const int CI) {}
```

The top-level `const` for the second parameter isn't a part of the function signature and doesn't help with prevention of the unintentionally swapped parameters. I guess, the same goes for top-level `volatile`.

Whatever arguments x and y are passed to `qualified1` they can be swapped without being noticed.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69560/new/

https://reviews.llvm.org/D69560