Patch for clang-format to allow references and pointers to bind differently

Ben Longbons brlongbons at gmail.com
Wed Dec 4 11:55:34 PST 2013


On Mon, Dec 2, 2013 at 12:18 AM, Daniel Jasper <djasper at google.com> wrote:
> On Fri, Nov 29, 2013 at 9:40 PM, Ben Longbons <brlongbons at gmail.com> wrote:
>> On Thu, Nov 28, 2013 at 1:21 AM, Daniel Jasper <djasper at google.com> wrote:
>> > Well, it is also quite rare that people declare multiple uninitialized
>> > pointers in one line (and it would be a questionable practice).
>> It is extremely common. I checked both LLVM (1 thousand such
>> declarations, sloccount 2 million)
>
>
> As Dmitri pointed out, there are quite a few false positives in there.
> Furthermore, I specifically mentioned uninitialized pointers.
Even filtering out all lines containing = or () or {}, there are still
about 367 (still a handful of false positives, but *very* few at this
point), which is the same order of magnitude.

I'm not sure how, especially since a lot of them are just  'int *a,
*b;' vs 'int *a = 0, *b = 0;'

> You could do just the same with references and in
> fact there are several instances of multiple reference-type variables in one
> declaration.
I can only find 7 uninitialized and 4 initialized in LLVM, which is
not comparable to the pointer case.

> Isn't Linux a C codebase? That makes this is a non-issue for Linux and also
> explains the usage pattern (see below).
C is very relevant. For one thing, pointers are the same even without
languages, and for another, a lot of C++ codebases are written in a
very C-like style.

>> Sure, they don't have differences at the syntax level. But they *do*
>> have significantly different usage patterns.
>
>
> I have doubts about this. And if they have this difference in usage patterns
> in your codebase, I'd personally consider that bad practice in C++.
All I'm saying is that references are not like pointers, even if they
have the same machine implementation.
There are very few cases where I have to ask "do I want a reference or
a pointer" - each case has an obvious answer.

> Leaving
> uninitialized pointers lying around is just bound to lead to bugs. Yes, back
> in the day (in C) when you needed to declare all variables at the beginning
> of a function, this might been a "good" and "benign" thing to do, but I
> don't think it is for C++.
That's why there's a warning for possible uses of uninitialized variables.

And a lot of times, there isn't a meaningful, unless you go the route
of NULL'ing everything, which I don't like because it implies that
NULL is a valid value. A lot of my code goes like:

T *p;
if (cond) {
  several steps including possible early returns;
  p = ...;
} else {
  several steps;
  p = ...;
}
several steps that require p not to be null;

> I estimate that less than 10% of the effort that this change will need are
> done at this point.
> Aside from needing decent unit testing (your patch has
> none),
Where are the existing tests for pointer attachment formatting? It
would be trivial for me to transform them, and extend them if I see
any weaknesses.

I did included all the sample outputs I could think of in the gist
version of the patch: https://gist.github.com/o11c/7506460

I just now updated it to include cases that involve templates

> someone needs to attend to the bug reports that the "nastier cases"
> will create (and of course there are edge cases, e.g. do you format
> vector<int &> or vector<int&>?) and other formatting options get more
> difficult to implement correctly
The short answer to "what does my patch do" for cases that only have a
pointer *or* a reference, but not both, is "whatever it already does".

And if you accept this patch, please feel free to CC me on any bugs
that come up.

> (e.g. someone else is proposing to support
> astyle's pointer style "middle").
I don't care for that style myself, but I'd try to implement it.

And the moment you go from 2 to 3 options, having "references default
to being like pointers" is no longer more difficult than the rest of
the necessary code.

> So how about re-doing it now. clang-format can fix all instances for you
> quite easily. Yes, I know that there are additional costs (code churn,
> losing some of the blame history, ..).
There are already two instances of churn fresh in my mind - one when
my predecessor ran GNU indent with nearly the default settings, and
once when I made the call to undo some of that styling. I'm not fond
of .

And clang-format has several other deficiencies (some of which I
mentioned earlier in this thread, some of which I'm sure I haven't
found yet because the diff was still too huge) that make it unsuitable
for my code-base yet. This was just the easiest one to fix.

> If clang-format provides enough of a
> carrot for some projects to abandon unfortunate style choices, I consider it
> a win.
The more I think about what programmers are like, I'm not sure whether
that will ever happen *at* *all*.

"My style is obviously right and if your tool doesn't support it (why
isn't it the default? Oh, must be some company/project's mandated
format that no one *really* likes), then your tool is just dumb."

> In addition, you are not really convinced of the style yourself.
I've seens convincing arguments for always-right, if I were starting a
new project.
But all the reasons I chose this style in the first place are still
valid, and in any case, churn is a big thing.

>> That said ... since there *will* be other options that need to be
>> split in the future, you need a general plan, not a specific one.
Do you have a reply to this?

>> Implementing infrastructure like "do everything unchanged where
>> possible" or
>
> This is fundamentally not how clang-format operates. It does this in few
> easy cases, but even those are proving to be difficult over and over again..
I don't disagree with that decision. But in that case, I recommend
that you immediately remove the current attempts, and then implement
--detect-style (since you can't argue that that doesn't belong in
clang-format, unlike people's rare style choices)

-o11c



More information about the cfe-commits mailing list