[PATCH] D102148: [clangd] Type hints for variables with 'auto' type

Sun May 30 22:41:49 PDT 2021

nridge added inline comments.

================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:33
+    TypeHintPolicy.SuppressScope = true; // keep type names short
+    TypeHintPolicy.AnonymousTagLocations =
+        false; // do not print lambda locations
----------------
sammccall wrote:
> For lambda values, the overwhelmingly common case is `auto X = <lambda-expr>`.
> 
> A few reasons we may want to not show these at all (and so omit type hints for lambdas as a first approximation):
>  - the fact that it's a lambda is pretty obvious
>  - lambdas very often have complex introducers (captures/params) that a hint will displace/add noise to
>  - we may want to add type hints to captures-with-initializers, which are very much like auto-typed variables, and the fewer hints we have to cram on a line the better
> 
> (fine to consider this stuff later, just wanted to mention it)
There is some value in showing `(lambda)` as a hint: it tells you the initializer is not an IIFE (immediately-invoked function expression, i.e. `auto var = [](...){ ... }();` -- note the parentheses at the end). For multi-line lambdas, this information is not otherwise present on the first line of the declaration.

Granted, how useful this is depends on the code style, i.e. whether IIFEs are common or even allowed.

I'm going to keep this as-is for now, but I'm definitely open to revise this based on usage experience.

================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:76
+  bool VisitVarDecl(VarDecl *D) {
+    // Structured binding - not handled yet.
+    if (isa<DecompositionDecl>(D))
----------------
sammccall wrote:
> FWIW I think hinting the individual bindings will be more useful and also more consistent with the other hints you have here.
Makes sense, I've updated the comment here and in the testcase to reflect this direction.

================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:81
+    if (auto *AT = D->getType()->getContainedAutoType()) {
+      if (!D->getType()->isDependentType()) {
+        addInlayHint(D->getLocation(), InlayHintKind::TypeHint,
----------------
sammccall wrote:
> why this check vs checking whether AT is deduced?
> (thus fixing the dependent `T` testcase)
> At first I assumed this was the usual "AutoType in AST doesn't store the deduced type" problem, but shouldn't it work properly if we're starting at the variable type?
Not sure if I'm understanding the suggestion correctly, but if I make the condition `AT->isDeduced()`, I get an unwanted `auto` hint for `var1` in `TypeHints.DependentType` as well (and moreover, the hint for `var2` is also `auto` instead of the desired `T`).

================
Comment at: clang-tools-extra/clangd/InlayHints.cpp:82
+      if (!D->getType()->isDependentType()) {
+        addInlayHint(D->getLocation(), InlayHintKind::TypeHint,
+                     ": " + D->getType().getAsString(TypeHintPolicy));
----------------
sammccall wrote:
> This places the hint on the variable rather than the `auto`. And fittingly (but subtly), it prints type rather than just the placeholder type (e.g. `const auto& x = 42` => `const int&`, not `int`)
> 
> This decision is worth a comment, and probably a little discussion :-)
> 
> The alternative of annotating `auto` has advantages:
>  - the deduced `auto` type is the most obvious question when reading the code
>  - that `auto` deduces to a particular type is the "correct" model, and may lead to better language understanding. Extreme unrealistic example: in `auto x = foo(), *y = bar();` I'd strongly prefer one hint.
>  - the hint will often be shorter, which is an important consideration since we're reflowing column-limited text
>  - the annotated code more closely resembles code that does not use `auto`
> 
> And annotating the variable has advantages too:
>  - the type of the variable is the most directly important fact
>  - generalizes better to cases where `auto` is not present (lambda captures, individual structured bindings)
>  - i suspect simpler implementation
> 
> My intuition is to expect annotating `auto` to be a little nicer, mostly because it will come closer to making code read the same way whether it uses `auto` or not. I'm not certain though. Interested if you have strong feelings about this!
> 
> To save a round-trip: a likely conclusion is "maybe annotating auto is better, but it's not clear and/or it's a bunch of work". That's 100% fine, but we should just leave a comment whether this is something decided against vs desirable but deferred vs unsure.
Very good points here.

The argument I find most convincing for annotating the `auto` is the space savings, which are significant especially in the presence of `const`.

On the other hand, the arguments I find most convincing for the current approach are:

 * the generalization to lambda captures and structured bindings, as you mention
 * the familiarity to people coming from other languages like Typescript or Rust which have `let var = expr;` and who get a hint after the `var`

(Not sure if you view that second one as a valid consideration, but thought I'd mention it :))

For now, I've kept the current approach and documented it, but I'm definitely happy to reconsider based on usage experience / user feedback.

================
Comment at: clang-tools-extra/clangd/unittests/InlayHintTests.cpp:445
+  assertTypeHints(R"cpp(
+    auto $L[[L]] = [](int a, int b) { return a + b; };
+  )cpp",
----------------
sammccall wrote:
> consider `= [b(1+1)](int a) { ... }` and we can consider adding type hints to b later
Good idea, and... turns out the existing code was already adding the hint for the init-capture!

================
Comment at: clang-tools-extra/clangd/unittests/InlayHintTests.cpp:487
+}
+
+// Annoyances:
----------------
sammccall wrote:
> It'd be useful to have cases showing how we handle "bulky" typenames, e.g. `ns::OuterClass::Template<int>` which I think we render as `Template<int>`.
Added a case like this to `NoNamespaces` (and renamed that test case `NoQualifiers`).

================
Comment at: clang-tools-extra/clangd/unittests/InlayHintTests.cpp:489
+// Annoyances:
+//  - auto x = dyn_cast<LongTypeName>(y);
+//  - stdlib algos return unwieldy __normal_iterator<X*, ...> type
----------------
sammccall wrote:
> sammccall wrote:
> > not sure if you meant dynamic_cast or dyn_cast here, both are important but distinct!
> maybe mention the other low-hanging fruit cases from the patch description too
I meant `llvm::dyn_cast`. It's not obvious to me what sort of heuristic would allow us to recognize things like that -- if you have ideas please feel free to mention.

Syntactic casts and built-in cast operators are indeed low-hanging fruit, I added mention of this in a FIXME here.

================
Comment at: clang-tools-extra/clangd/unittests/InlayHintTests.cpp:490
+//  - auto x = dyn_cast<LongTypeName>(y);
+//  - stdlib algos return unwieldy __normal_iterator<X*, ...> type
+
----------------
sammccall wrote:
> The use of a `__reserved` name might be a good hint this is a structural type whose name is not a meaningful hint. (I guess maybe that's already what you meant!)
I didn't really have a heuristic mind, though omitting the hint for `__reserved` names could be reasonable.

If we consider a case like:

```
  std::vector<Foo> vec;
  auto it2 = vec.erase(it1);
```

the hint is `__normal_iterator<Foo *, vector<Foo>>`. Ideally, I'd want just `Foo *`, but I guess we don't get to have that, so omitting it might be the next best thing.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102148/new/

https://reviews.llvm.org/D102148