Patch for clang-format to allow references and pointers to bind differently

Daniel Jasper djasper at google.com
Mon Dec 2 00:18:44 PST 2013


On Fri, Nov 29, 2013 at 9:40 PM, Ben Longbons <brlongbons at gmail.com> wrote:

> On Thu, Nov 28, 2013 at 1:21 AM, Daniel Jasper <djasper at google.com> wrote:
> > Well, it is also quite rare that people declare multiple uninitialized
> > pointers in one line (and it would be a questionable practice).
> It is extremely common. I checked both LLVM (1 thousand such
> declarations, sloccount 2 million)


As Dmitri pointed out, there are quite a few false positives in there.
Furthermore, I specifically mentioned uninitialized pointers. Almost all
results of your query initialize the pointers (either by "=" or in a
constructor initializer). You could do just the same with references and in
fact there are several instances of multiple reference-type variables in
one declaration.

and Linux (10 thousand / 10
> million)


Isn't Linux a C codebase? That makes this is a non-issue for Linux and also
explains the usage pattern (see below).


> using the following regex, which gives a few false positives
> but not many:
> [^/]\*.*, \*
> The proportion of about 1/1000 seems to be relatively constant, and
> caused by the relative lack of *need* for multiple pointers even among
> the few lines of code that are declarations, rather than an aberrant
> style.
>
> > My key argument against this style would be that it suggests that "*" and
> > "&" have a different binding behavior, which I don't think they have.
> Sure, they don't have differences at the syntax level. But they *do*
> have significantly different usage patterns.
>

I have doubts about this. And if they have this difference in usage
patterns in your codebase, I'd personally consider that bad practice in
C++. Leaving uninitialized pointers lying around is just bound to lead to
bugs. Yes, back in the day (in C) when you needed to declare all variables
at the beginning of a function, this might been a "good" and "benign" thing
to do, but I don't think it is for C++.

> Well, it is not quite as easy:
> > 1. I am responsible for maintaining clang-format. Ensuring that all
> > combinations of options do the right thing gets significantly harder with
> > more options.
> I don't think this is an issue with the patch, since all of the
> operations that it does now were done already, only the option checked
> changed.
>

I estimate that less than 10% of the effort that this change will need are
done at this point. Aside from needing decent unit testing (your patch has
none), someone needs to attend to the bug reports that the "nastier cases"
will create (and of course there are edge cases, e.g. do you format
vector<int &> or vector<int&>?) and other formatting options get more
difficult to implement correctly (e.g. someone else is proposing to support
astyle's pointer style "middle").

> 2. Need to churn through pages and pages of flag documentation to
> > find the important flags makes it harder to use.
> I expect that most people will either start with one of the predefined
> styles, or try to go through every option...
>
> However, it would be useful if the documentation for the two bits
> linked to each other.
>

That is one thing. However, if you take a look at astyle's way of
configuring this is that by default reference variables are attached the
same way as pointers are (style equals "none"). I think this is a necessary
choice as the vast majority of users will want the same behavior for both.
So, by the end, we'll have two flags with several non-trivial values.


> That said - I found the documentation for clang-format rather lacking,
> since it just shows all the options in alphabetical order. It would be
> better if it were separated into logical areas (indent, mandatory
> lines, spaces, penalties and breaking lines) and *every* option had
> examples of each possible value.
>

I totally agree and patches to the documentation are more than welcome.
However, even with perfect documentation, fewer flags make usage easier.

 > 3. There are style choices that are bad or chosen for bad reasons. I
> think I
> > would put this one into this category. Now, if we support that style
> choice
> > in clang-format we ingrain it even further. If we don't (but clang-format
> > still delivers good value) we might convince a few people/projects to
> move a
> > way from this style.
> Yes, my reasons were relatively weak and if I were to do it again, I
> would probably do it differently - if only to avoid trouble with
> formatting tools.
>

So how about re-doing it now. clang-format can fix all instances for you
quite easily. Yes, I know that there are additional costs (code churn,
losing some of the blame history, ..). If clang-format provides enough of a
carrot for some projects to abandon unfortunate style choices, I consider
it a win.

But it's not just an issue of forcing people to abandon their chosen
> style - it's an issue of whether they use a formatter at all.
>

I am aware. However, if we get an additional 1% of users at the cost of
making clang-format 1% less usable for the rest of the users, this is not
an improvement. We don't have these numbers (I would think they are both
far below 1%), so we can't really make an educated decision.

In addition, you are not really convinced of the style yourself. And I
don't think we should carry forward every formatting choice that anyone has
ever done.

So, my opinion remains that this is not something we want to support in
clang-format. I am happy for others to weigh in, though.

 > Alp: Yes, I agree that "attach" or "align" (which astyle uses) would be
> > better names for this flag. But I am waiting to resolve a few questions
> > around this (e.g. there is also a request to support astyle's "middle"
> > alignment). We basically have to be backwards compatible, so I'd like to
> > change flags as seldomly as possible.
> There is one backward-incompatibility with my patch as written: if the
> user explicitly specified PointerBindsToType instead of getting it
> from the BasedOnStyle.
>
> This would actually be a good opportunity to change the name of the
> option, with PointerBindsToType setting both PointerAttachToType and
> ReferenceAttachToType and possibly giving a deprecation warning.
>
> That said ... since there *will* be other options that need to be
> split in the future, you need a general plan, not a specific one.
>
>
> For reference, these are the options that astyle supports:
> pointers: unchanged (default), type, center, value
> references: like pointers (default), unchanged, type, center, value
>
> I haven't yet tested to see what exactly astyle does for some of the
> nastier cases.
>
>
> Implementing infrastructure like "do everything unchanged where
> possible" or


This is fundamentally not how clang-format operates. It does this in few
easy cases, but even those are proving to be difficult over and over again..


> "generate the config file by --detect-style" is beyond
> me.
>

Yeah, this is a whole separate project, if you ask me ..


>
> -o11c
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20131202/81274f63/attachment.html>


More information about the cfe-commits mailing list