[llvm-dev] [cfe-dev] RFC: Modernizing our use of auto

Sat Jan 5 15:30:07 PST 2019

On 05/01/2019 21:56, Mehdi AMINI wrote:
> You're discounting some aspects:

Not surprising! :)

Thanks for your contribution. Lots of things I hadn't considered here.

>
> - Code is written downstream without much consideration for the 
> guideline and then upstreamed

I hadn't considered this. Can you either quantify it, or point to some 
examples? I guess you're referring to some large commits which might not 
have been reviewed for particulars of style. Can you point me to some?

> - Review process isn't perfect: we let things slip, we don't always 
> want to nitpick on things like `auto`, etc.

Yes, even in that case, the submitter thinks the submitted code makes 
sense and the reviewer doesn't think it doesn't make sense. At least not 
so much that they'd block a commit because of it :).

> - Some developers just don't know the guidelines and because of the 
> point above, some cases slip.
> - People move back and forth between projects (i.e. LLVM is not their 
> main project) and thus aren't focused on every details of the code 
> guidelines (For instance I would review a patch today and not 
> necessarily catch on every style aspect).
>
> For the particular case of auto, there is an even more complex effect: 
> since the intention is that "types should be obvious when reading the 
> code", someone very familiar with some libraries will always 
> unconsciously infer the returned type from API calls (from 
> conventions, past experience, etc.), but someone who is less familiar 
> with this area of the codebase would be quickly confused.

Intermediate AST Matchers typically look like

  auto ConstrExpr = 
cxxConstructExpr(hasType(recordDecl(hasName(ClassName))));

`auto` is used exclusively when writing AST Matchers variables. No one 
complains about not reading the exact type of `ConstrExpr` in the code. 
And the people who work on those parts of the code tend to be the people 
most against all use of `auto`.

(You could write `StatementMatcher` instead of `auto` above if you 
wished, and `DeclarationMatcher`, `TypeMatcher` etc for other matchers. 
Unless you were familiar with all the implementation details of AST 
Matchers those type names wouldn't mean anything to you. In fact, 
they're not type names - they're just typedefs. Typedefs are what we 
used to use before we were able to use auto, when the actual type 
doesn't matter. You can look up the actual expansion of the typedef if 
you wish, but it won't make the above code clearer to you, because if 
you know what AST Matchers are, the exact type of `ConstrExpr` *does not 
matter*. Only the concept matters. It's an AST Matcher and it can be 
nested inside other AST Matchers and passed to Finder->addMatcher).

So, you don't see the exact type, and that doesn't affect your confusion 
about the above line. It could be `auto`, `StatementMatcher` or 
`internal::Matcher<Stmt>` without affecting your confusion. If you felt 
confusion about that line of code, it wasn't about the type.

> I claim that this example is less about `auto` itself but rather a 
> question about can you consider that `getAs` is such a "core" pattern 
> of this area of the code base that we can accept as "common knowledge" 
> that it always wrap the returned type in an optional.
> And if we do, then the current guideline is actually fulfilled: "the 
> type is already obvious from the context" (getAs being part of the 
> context at this point).

We probably don't need to dive into the details here. You will notice 
though that the conclusion from the same people who discussed that 
revisited it in https://reviews.llvm.org/D54877#inline-484380 and 
concluded retrospectively that `auto` was fine.

>
> There are other such example, for example I believe we can assume that 
> all the standard STL algorithm are known and so in this case:
>  llvm::all_of(Container, [] (const auto &Value) { ...})
> The use of auto in the lambda should be OK (assuming c++14 and the 
> type of Container is obvious).

If you require that something about the type (either the type itself or 
the concept - ie being a container) or the type passed to the lambda be 
obvious, you can choose either one. My proposal for the LLVM guideline 
is not to make both the container and the lambda arg `auto`. But you 
could choose one or the other. I think you're agreeing with that.

>
>
>     For me that means I'm not able to get my clang-query features
>     (https://steveire.wordpress.com/2018/11/11/future-developments-in-clang-query/)
>     upstream because I'm getting reviews which say "remove auto, here
>     and everywhere in this file." in
>
>
>     https://reviews.llvm.org/D54408?id=173770#inline-480255
>
>
>     That's a bit of a difficult review comment, given the ways it is
>     already used throughout the code.
>
>
>>     It seems like there is widespread enough acknowledgement that the current state of things is broken, but there is no concrete proposal for a coding standards change.  Please prepare a patch so we can discuss it.
>
>
>     I made a proposal in my initial mail in this thread. See the end
>     of the email:
>     https://lists.llvm.org/pipermail/llvm-dev/2018-November/127953.html
>
>     Phab is not well-suited to discussion like this, so we should
>     probably keep it on the mailing list for now.
>
>
>     But, here's some updated thinking:
>
>
>     New guidelines should
>
>     * Be easy to follow
>     * Have some consistency
>     * Be modern
>     * Be welcoming (or at least non-hostile) to newcomers
>     * Standardize existing practice in LLVM
>     * Achieve a consensus of support about the spirit of the guideline
>     (consensus is not unanimity) and be acceptable to people who
>     dislike auto
>
>     Can we agree on that much to start?
>
>
> It depends what you mean by "standardize existing practice in LLVM", 
> this seems like a "guideline" to define the new guideline, more than a 
> rule. I.e. if some area are "abusing" auto for example, this is not 
> automatically a reason to standardize, this may just be an indication 
> that some cleanup is needed there.

Certainly! I tried to quantify the existing use of `auto` a bit in a 
previous email. Can you do a more-useful quantification?

>
>     On the Phab review, some people expressed that they liked the
>     examples of when auto is acceptable. Here is an updated attempt at
>     guideline text for that section:
>
>
>
>     ```
>     Some are advocating a policy of "almost always ``auto``" in C++11,
>     however LLVM
>     uses a more moderate stance. Don't "almost always" use ``auto``,
>     but it may be used where either the Concept or the type is obvious
>     from the context. Here are
>     some cases where use of ``auto`` would make sense:
>
>     * Where the type appears elsewhere in the line (eg a dyn_cast)
>     * Where the variable name or initializing expression provides
>     enough information (`auto SR = getSourceRange()`)
>     * Where the context makes the *Concept* obvious, even if the type
>     is not obvious (eg, where the instance is only used as an
>     iterator, or in an algorithm as a container-like concept, or only
>     with a validity check, or an AST Matcher).
>
>
> The later point isn't clear to me: even if you're only using the 
> instance as an iterator, I may want to know what types the iterator is 
> actually iterating on.

Hmm, the existing guideline says that these iterators are ok as `auto`. 
I think you're saying that if there's a reason to have an exception for 
a particular iterator then you can have an exception for it. I agree 
with that! Are you thinking of some exceptional case?

> I'm not saying that the idea you have here is not desirable, just that 
> the language used does not help me visualize what is / isn't OK: it 
> does not fit your first criteria "Be easy to follow".
>
>
>     Exceptions may arise, but they should only arise in exceptional
>     cases. If the case is not exceptional, apply the guidelines in
>     review discussion.
>
>
> This last sentence seems general to the full document rather than this 
> section?

Yes, I'm just trying to satisfy people who might be worried :).

>     ```
>
>
>     The most important thing here is that it does not accept your
>     proposal that 'the type must be obvious'. Instead, it recognizes
>     that `auto` is really "an unspecified concept" - unspecified only
>     because we can't specify the concept in C++ yet.
>
>     However, the point/concern seems to be that readers of code should
>     know the instance may be used in its local scope.
>
>     That's why these guidelines would allow `auto Ctors` in
>
>
>     llvm::Optional<std::pair<std::string, MatcherCtor>>
>      getNodeConstructorType(ASTNodeKind targetType) {
>        const auto &Ctors = RegistryData->nodeConstructors();
>        auto It = llvm::find_if(
>            Ctors, [targetType](const NodeConstructorMap::value_type
>     &Ctor) {
>              return Ctor.first.isSame(targetType);
>            });
>        if (It == Ctors.end())
>          return llvm::None;
>        return It->second;
>      }
>
>     because `Ctors` is obviously the Container *concept*
>
> It is only obvious after your read the following, but more 
> importantly: why use auto here? It wouldn't hurt to write: 
>  const NodeConstructorMap &Ctors = RegistryData->nodeConstructors();

Why write `NodeConstructorMap`? That's not a type. If the reason for not 
using `auto` is that "the type must be obvious", then surely you're 
obliged to expand typedefs? `NodeConstructorMap` could be a std::vector 
for all you know...

The only thing you care about is that it's a container. It's a short 
function that uses the variable with `llvm::find_if`. Both the shortness 
and the use with `find_if` are relevant. This thing is a container, and 
I can't think of a container where the precise type matters in a small 
algorithmic function like this.

But the *Container-ness* of it does matter. And it is obvious. That's 
the concept that is obvious, so it should be allowed by the guidelines 
and should pass review.

> I.e. auto does not make the code more readable to me in this case.

`NodeConstructorMap` doesn't "make the type obvious" either.

> On the other hand, if the type of Ctors is explicit, the lambda 
> argument type could be auto to me (it becomes obvious from the local 
> context and the use of llvm::find_if).

Fair enough - I agree that this is a consequence of the guideline I 
propose, as I wrote above.

>     , and knowing exactly what type it is is not necessary in the
>     local context.
>
>     However, in the below code it would probably not be ok because
>     we're calling methods on the instance which opens up more
>     possibilities (is it a base interface? etc):
>
>
>     void SomeClass::foo(int input)
>     {
>         auto Ctors = getCtors(input);
>
>         m_widgets = Ctors->calculate();
>     }
>
>     Here, `Ctors` is definitely not a Container concept, we don't know
>     what kind of concept it is, so we should know the type by seeing
>     it typed in the code.
>
>     Another example from earlier in the thread:
>
>
>      template <typename BaseT, typename DerivedT>
>      void registerIfNodeMatcher(...) {
>        auto NodeKind = ASTNodeKind::getFromNodeKind<DerivedT>();
>        if (!NodeKind.isNone())
>     NodeCtors[NodeKind] = std::make_pair(MatcherName, Descriptor);
>      }
>
>     Here, `NodeKind` is used as an Optional (or Maybe) *concept*. All
>     we do is a validity check. So, `auto` should be allowed here.
>
>     This 'the concept must be obvious' guideline is also what allows
>     the use of `auto` for iterators.
>
>
>     What do people think of "Either the Concept or the type should be
>     obvious from the context" as a baseline guideline?
>
> You forgot to add "and knowing exactly what type it is is not 
> necessary in the local context" after "Concept", it seems that this is 
> necessary for your definition.

Correct! Thanks for the clarification. I mentioned earlier in the thread 
various things about use of `auto` being ok where the type is 
irrelevant, but indeed I didn't put that in the proposed guideline and 
your clarification is useful.

>
> I'm still fairly unconvinced, because the concept of "concept" seems 
> too fuzzy to be applicable in such a guideline. What is a "Concept" 
> other than a class that honor an API? How is your previous example " 
> m_widgets = Ctors->calculate();" not just obvious that Ctors is an 
> instance of a Concept that "can calculate a widget"?

Indeed, but if you go down that path, you arrive at AAA, and I don't 
want to propose that. I said that in the first line of this thread, but 
expressing the reason for that in terms of Concepts clarifies the reason 
for that in my mind. So, thanks for the discussion!

There are more general concepts, such as Iterator (ie if you see this 
being used in a raw for loop or to compare with the result of an 
algorithm it is not surprising), Container/Range (ie, if this is passed 
directly to a range-capable algorithm such as find_if, it is not 
surprising), Optional/Maybe (if all we do in the local context of the 
`auto var` is check for validity and then pass the variable to something 
else - a function, a container etc - that is not surprising).

Isn't there something more 'core' about those than 'has a foo() method 
returning int'? Don't they occur very often?

Thanks,

Stephen.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190105/fa203b4d/attachment-0001.html>