[PATCH] D69560: [clang-tidy] Add 'bugprone-easily-swappable-parameters' check

Whisperity via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Feb 2 06:55:54 PST 2021


whisperity marked 6 inline comments as done.
whisperity added inline comments.


================
Comment at: clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp:255
+  std::string NodeTypeName =
+      Node->getType().getAsString(Node->getASTContext().getPrintingPolicy());
+  StringRef CaptureName{NodeTypeName};
----------------
aaron.ballman wrote:
> whisperity wrote:
> > aaron.ballman wrote:
> > > Hmm, one downside to using the printing policy to get the node name is that there can be alternative spellings for the same type that the user might reasonably expect to be applied. e.g., if the user expects to ignore `bool` datatypes and the printing policy wants to spell the type as `_Bool`, this won't behave as expected.
> > But then the diagnostic is inconsistent with what the compiler is configured to output as diagnostics, isn't it? Can I stringify through Clang with some "default" printing policy?
> The situation I'm worried about is something like this in C code:
> ```
> #include <stdbool.h>
> 
> void func(bool b, bool c) {}
> ```
> because the `bool` in the function signature will expand to `_Bool` after the preprocessor gets to it (because of the contents of `<stdbool.h>`). If the user writes `bool` in their configuration file because that's what they've written in their function signatures, will this fail because the printing policy says it should be `_Bool`?
> 
> Along similar lines, I wonder about things like `struct S` vs `S` (in C), `signed char` vs `char`, types within anonymous namespaces (where we print `<anonymous namespace>`), etc. Basically, any time when the printing policy may differ from what the user lexically writes in code.
Meanwhile, I realised we are talking of two entirely distinct things. I will look into how this `PrintingPolicy` stuff works... I agree that the ignore rules (where the comment was originally posted) should respect the text written into the code as-is. The diagnostics printing type names could continue using the proper printing policy (to be in line with how Sema diagnoses, AFAIK).


================
Comment at: clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp:402-403
+
+    // Unfortunately, undersquiggly highlights are for some reason chewed up
+    // and not respected by diagnostics from Clang-Tidy. :(
+    diag(First->getLocation(), "the first parameter in the range is '%0'",
----------------
aaron.ballman wrote:
> whisperity wrote:
> > aaron.ballman wrote:
> > > Can you post a screenshot of what you mean?
> > Given
> > 
> > ```
> > Diag << SourceRange{First->getSourceRange().getBegin(), Last->getSourceRange().getEnd()};
> > ```
> > 
> > The diagnostic is still produced with only a `^` at the original `diag(Location)`, ignoring the fed `SourceRange`:
> > 
> > ```
> > /home/whisperity/LLVM/Build/../llvm-project/clang-tools-extra/test/clang-tidy/checkers/bugprone-easily-swappable-parameters-len2.cpp:21:18: warning: 3 adjacent parameters of 'redeclChain' of similar type ('int') are easily swapped by mistake [bugprone-easily-swappable-parameters]
> > void redeclChain(int I, int J, int K) {}
> >                  ^
> > ```
> > 
> > Instead of putting all the squigglies in as the range-location highlight, like how `Sema` can diagnose:
> > 
> > ```
> > /home/whisperity/LLVM/Build/../llvm-project/clang-tools-extra/test/clang-tidy/checkers/bugprone-easily-swappable-parameters-len2.cpp:21:18: warning: 3 adjacent parameters of 'redeclChain' of similar type ('int') are easily swapped by mistake [bugprone-easily-swappable-parameters]
> > void redeclChain(int I, int J, int K) {}
> >                  ^~~~~~~~~~~~~~~~~~~
> > ```
> > 
> > I would not want to put additional note tags in an otherwise already verbose output.
> > 
> > Back in 2019 I was investigating this issue specifically for another checker I was working on, but hit the wall... Somewhere deep inside where Tidy diagnostic stuff is translated and consumed to Clang stuff, it goes away. It shouldn't, because there are statements that seemingly copy the `Diag.Ranges` array over, but it still does not come out.
> > 
> > ... EDIT: I found [[ http://lists.llvm.org/pipermail/cfe-dev/2019-October/063676.html | the relevant mailing list post ]].
> > ... EDIT: I found the relevant mailing list post.
> 
> Oh wow, I had no idea! That's a rather unfortunate bug. :-(
Speaking of bugs, shall I put it up on Bugzilla, or see if anyone reported already? Would require me digging into that "where are the data structures copied" kind of thing to figure it out again.


================
Comment at: clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.h:30
+  void storeOptions(ClangTidyOptions::OptionMap &Opts) override;
+
+  /// The minimum length of an adjacent swappable parameter range required for
----------------
aaron.ballman wrote:
> whisperity wrote:
> > aaron.ballman wrote:
> > > Do these data members need to be public?
> > Yes. The modelling heuristics will have a bunch of recursive functions which all need the config variables (okay, not the `MinimumLength` one, but the other 4 or 5 that will be introduced) and giving them every time by copy would introduce its own adjacent swappable `bool, bool` parameter sequence issue.
> > 
> > The previous full version of the check which I'm incrementally reworking has signatures like:
> > 
> > ```
> > static MixupData HowPossibleToMixUpAtCallSite(const QualType LType,
> >                                               const QualType RType,
> >                                               const ASTContext &Ctx,
> >                                               const bool CVRMixPossible,
> >                                               const bool ImplicitConversion,
> >                                               const bool IsUserDefinedConvertersOtherwisePossible);
> > 
> > static MixupData RefBindsToSameType(const LValueReferenceType *LRef,
> >                                     const Type *T, const ASTContext &Ctx,
> >                                     const bool IsRefRightType,
> >                                     const bool CVRMixPossible,
> >                                     const bool ImplicitConversionEnabled);
> > ```
> That's understandable logic, but then we're exposing essentially what amount to be implementation details. So it sounds like we're introducing a new code maintenance issue in order to solve another code maintenance issue. :-(
> 
> That said, I don't think leaving these as `public` is a deal-breaker.
These variables are `const` and "directly" created from the user-facing configuration arguments. But in the end once we are done with the implementation and everything is good to go and we see how many functions and what configuration variables are actually needed where, we can look into moving them into some internal struct again. But it feels rather unlikely.


================
Comment at: clang-tools-extra/test/clang-tidy/checkers/bugprone-easily-swappable-parameters-len2.cpp:10
+
+void declaration(int Param, int Other); // NO-WARN: No chance to change this function.
+
----------------
aaron.ballman wrote:
> whisperity wrote:
> > aaron.ballman wrote:
> > > I think this is a case where we could warn when the declaration is outside of a system header (perhaps as an option).
> > > 
> > > Thinking about it a bit more, declarations and definitions provide a novel way to get another kind of swap:
> > > ```
> > > void func(int x, int y);
> > > void func(int y, int x) {} // Oops, the swap was probably not intentional
> > > ```
> > > which may or may not be interesting for a check (or its heuristics).
> > I gave this some thought. It is a very good idea, but I believe not for //this// check, but D20689. What do you think of that? Maybe simply saying "call site v. function node that was called" could be extended with a completely analogous, string distance function based "function definition node v. redecl chain" case.
> Hmmm, my impression is that this check is fundamentally about *parameter* ordering whereas D20689 is about *argument* ordering. Given that the example is about the ordering of parameters and doesn't have anything to do with call sites, I kind of think the functionality would be better in a parameter-oriented check.
> 
> That said, it doesn't feel entirely natural to add it to this check because this check is about parameters that are easily swappable *at call sites*, and this situation is not exactly that. However, it could probably be argued that it is appropriate to add it to this check given that swapped parameters between the declaration and the definition are likely to result in swapped arguments at call sites.
> 
> Don't feel obligated to add this functionality to this check (or invent a new check), though.
It just seems incredibly easier to put it into the other check, as that deals with names. But yes, then it would be a bit weird for the "suspicious call argument" check to say something about the definition... This check (and the relevant Core Guideline) was originally meant to consider **only** types (this is something I'll have to emphasise more in the research paper too!). Everything that considers //names// is only for making the check less noisy and make the actually emitted results more useful. However, I feel we should //specifically// not rely on the names and the logic too much.

But(!) your suggestion is otherwise a good idea. I am not sure if the previous research (with regards to names) consider the whole "forward declaration" situation, even where they did analysis for C projects, not just Java.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69560/new/

https://reviews.llvm.org/D69560



More information about the cfe-commits mailing list