[llvm-dev] RFC: changing variable naming rules in LLVM codebase

Fri Feb 22 12:45:59 PST 2019

> -----Original Message-----
> From: David Greene [mailto:dag at cray.com]
> Sent: Friday, February 22, 2019 2:40 PM
> To: Robinson, Paul
> Cc: asb at lowrisc.org; clattner at nondot.org; llvm-dev at lists.llvm.org;
> nd at arm.com
> Subject: Re: [llvm-dev] RFC: changing variable naming rules in LLVM
> codebase
> 
> via llvm-dev <llvm-dev at lists.llvm.org> writes:
> 
> > Looking at names of "variables" there's reasonable support for making
> > them visually more distinct from other kinds of names.  Regarding
> > making data-member names distinct from local variables, some people
> > don't see why it should matter, others find it helpful; given this
> > neutral-to-helpful spectrum, going with the kind-of helpful convention
> > seems preferable.
> 
> There are drawbacks to the "m_" prefix notation, though.  It makes it
> more work to move entities between class and local scope.  It's extra
> typing, it's hard to pronounce, etc.

IMO these are pretty feeble objections. The "m_" is silent, for example. 
Moving entities between class and local scope isn't all that common, and 
the small bit of extra typing provides clarity to your colleagues who 
like to know what data is ephemeral and what is persistent.  It's equally
true that using real words instead of shorthand abbreviations is more
typing, but we accept that for the clarity it brings.

> 
> Now, these may be considered trivial complaints, but given that some of
> them have already been raised, I think we need some discussion about it.
> If there's overwhelming support for "m_" (and/or "g_" etc.) then fine,
> but if there's about even opinions either way, we ought to go with the
> status quo, at least for now.

I don't think "overwhelming" is an appropriate barrier to adoption. I
hear people say "yes, this would make it easier for me to read the code"
and on the other side "meh, I don't see the benefit"; however I do *not*
see anyone saying "No, this really would interfere with my code-reading."

> 
> > So:
> >
> > - Local variables and formal parameters should be lower_case, with
> >   one exception: Variables/parameters that have lambda/function
> >   type should follow the function-name spelling rules.
> 
> "lower_case" is a pretty drastic change from CamelCase and camelCase.
> So far the only argument for it I've seen is, "lldb uses it all over the
> place."  Having one subproject drive the standard for everything else
> seems backward.  It's certainly possible I missed other opinions on this
> though.

You have. For example, here's a "significant improvement" comment:
http://lists.llvm.org/pipermail/llvm-dev/2019-February/130214.html

The debate suffers from being spread across multiple dev threads plus
a Phabricator review.  Not optimal, but that's what it is.  

> 
> > - Initialisms and other abbreviations would be considered words for
> >   this purpose, so we have names such as:
> >     tli   // Local variable for TargetLoweringInfo
> >     m_cgm // Data member for CodeGenModule
> 
> I actually think we need something stronger.  Acronyms should be
> discouraged unless it's absolutely clear what it is.  As has been
> pointed out, "tli" means at least two common and very different things:
> "TargetLoweringInfo" and "TargetLibraryInfo."

That's a separate discussion and would be something to pursue regardless
of the other spelling conventions. Let's not drag that in here.  My
inclusion of this is based on "we find these things in names today, and
they aren't 'words' so what do we do with them" not on any assessment
that they are good or bad on their own merits.

> 
> > - I don't have a good suggestion for file-static/global variables.
> >   Some people have suggested a "g_" prefix for globals, or possibly
> >   an "s_" prefix for class-static data.
> 
> Is this really helpful or does it just make the code more difficult to
> work with?  I don't know since I've never used such conventions before.
> But I'm hesitant to introduce really strong binding of scopes to names
> because it make refactoring more tedious/difficult.

I have, but in a project that had rather more global data than is good
for anyone. I put it out there for discussion, and thanks for doing
exactly that!

> 
> > Some people have worried that the churn will cause blame issues.
> > I respectfully point out that in my own archaeology I have to deal
> > with lots of clang-format/indentation/other random semantically
> > meaningless refactoring, this is just one more.  Also the point is
> > not to optimize for git-blame but to optimize for reading what is
> > there at the moment.
> 
> I want to call out that we already have a policy on this in the current
> coding standards:
> 
>   Our long term goal is for the entire codebase to follow the
>   convention, but we explicitly do not want patches that do large-scale
>   reformatting of existing code.
> 
> I could argue things either way, but we should realize that a mechanical
> reformatting goes explicitly against current stated (and presumably
> previously-agreed-upon) policy.
> 
>                                -David

Right, but my impression is that people who lived through the last update
to the spelling conventions are less enthused about a slow transition,
this time around, and that bit of the convention would necessarily also
be up for revision.

--paulr