[PATCH] D124446: [clang-tidy] Add the misc-discarded-return-value check

Mon May 16 06:44:40 PDT 2022

whisperity added inline comments.

================
Comment at: clang-tools-extra/clang-tidy/misc/DiscardedReturnValueCheck.cpp:83-86
+void DiscardedReturnValueCheck::onStartOfTranslationUnit() {
+  ConsumedCalls.clear();
+  CallMap.clear();
+}
----------------
aaron.ballman wrote:
> Is this code necessary?
Yes, if you have checks that store TU-specific data, and you run `clang-tidy a.cpp b.cpp` then if you do not clear the data structures, dangling pointers into already destroyed ASTs will remain.

================
Comment at: clang-tools-extra/clang-tidy/misc/DiscardedReturnValueCheck.cpp:181
+
+  static const auto Decltype = decltypeType(hasUnderlyingExpr(Call));
+  static const auto TemplateArg =
----------------
aaron.ballman wrote:
> So, I'm not super keen on this approach of having to try to identify every single place in which an expression is considered to be "used" -- this is going to be fragile because we'll miss places and it's going to be a maintenance burden because new places will be added as the languages evolve.
> 
> For example, if we're handling `decltype` as a use, why not `noexcept`? Or conditional `explicit`? What about a `co_return` statement?
> 
> I'm not certain what we can do to improve this, but I think it's worth trying to explore options to see if we can generalize what constitutes a use so that we can write a few custom matchers to do the heavy lifting instead of trying to play whack-a-mole.
I've been having other thoughts about this `decltype` here... Actually, neither `decltype` nor `noexcept` should be handled as a //"use"// at all, while `co_return` should be the same as a `return` -- however, I think it was due to lack of projects where such could be meaningfully measured as a missed case was why implementing that failed.

For `decltype`, `typedef`, and `noexcept` (and perhaps several others), the good solution would be having a third route: calls that //should not be counted//. Neither as a "consumed call", nor as a "bare call". Ignored, from both calculations. Maybe even for template arguments below.

================
Comment at: clang-tools-extra/clang-tidy/misc/DiscardedReturnValueCheck.cpp:181
+
+  static const auto Decltype = decltypeType(hasUnderlyingExpr(Call));
+  static const auto TemplateArg =
----------------
whisperity wrote:
> aaron.ballman wrote:
> > So, I'm not super keen on this approach of having to try to identify every single place in which an expression is considered to be "used" -- this is going to be fragile because we'll miss places and it's going to be a maintenance burden because new places will be added as the languages evolve.
> > 
> > For example, if we're handling `decltype` as a use, why not `noexcept`? Or conditional `explicit`? What about a `co_return` statement?
> > 
> > I'm not certain what we can do to improve this, but I think it's worth trying to explore options to see if we can generalize what constitutes a use so that we can write a few custom matchers to do the heavy lifting instead of trying to play whack-a-mole.
> I've been having other thoughts about this `decltype` here... Actually, neither `decltype` nor `noexcept` should be handled as a //"use"// at all, while `co_return` should be the same as a `return` -- however, I think it was due to lack of projects where such could be meaningfully measured as a missed case was why implementing that failed.
> 
> For `decltype`, `typedef`, and `noexcept` (and perhaps several others), the good solution would be having a third route: calls that //should not be counted//. Neither as a "consumed call", nor as a "bare call". Ignored, from both calculations. Maybe even for template arguments below.
As for better matching... Unfortunately, types in the AST are so varied and `hasDescendant` is too generic to express something like `stmt(anyOf(ifStmt(), forStmt(), switchStmt()), hasDescendant(Call))` to express in a single expression matching uses... The conditions are not always direct children of the outer node, while `hasDescendant` will match not just the condition but the entire tree... resulting in things like //both// functions in

```lang=cpp
if (foo())
  bar()
```

matching.

Well... generalisation... I can throw in a formal fluke:

> A **use** is a //context// for a specific `CallExpr C` in which we can reasonably assume that the value produced by evaluating `C` is loaded by another expression.

Now what I found is `-Wunused-result`, aka `SemaDiagnostics::warn_unused_expr`, which is triggered in the function `ExprResult Sema::ActOnFinishFullExpr(Expr* FE, SourceLocation CC, bool DiscardedValue, bool IsConstexpr);`. Now this function itself does //some// heuristics inside (with a **lot** of `FIXME`s as of rGdab5e10ea5dbc2e6314e0e7ce54a9c51fbcb44bd), but notably, `DiscardedValue` is a parameter. According to a quick search, this function (and its overloads) have **82** callsites within `Sema`, with many of them just tougher to decipher than others. Some of the other ways this function is called, e.g. `ActOnStmtExprResult`, have codes like this:

```lang=cpp
IsStmtExprResult = GetLookAheadToken(LookAhead).is(tok::r_brace) && GetLookAheadToken(LookAhead + 1).is(tok::r_paren);
```

So I would say most of the logic there is **very** parsing specific, and requires information that is only available during the parsing descent, and not later when someone tries to consume a `const AST`.

================
Comment at: clang-tools-extra/docs/clang-tidy/checks/misc-discarded-return-value.rst:6-7
+
+Flags function calls which return value is discarded if most of the other calls
+to the function consume the return value.
+
----------------
aaron.ballman wrote:
> You should also make it clear in the docs that this only considers the statistics of a single translation unit.
> 
> (I suspect you'll get far more clean results if the check was run over a whole program instead of just a single TU, but I'm not certain if we've made that situation sufficiently simple in clang-tidy yet to be worth trying to support.
Indeed, it seems splitting the work into separate matches missed a few cases here and there, like in the tests.

(However, D124447 and D124448 aims to deal with making the analysis project-level! 😉)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124446/new/

https://reviews.llvm.org/D124446