[PATCH] Prefer breaking in places other than '(' for default style

Thu Oct 24 11:05:23 PDT 2013

On Wed, Oct 23, 2013 at 1:49 PM, Manuel Klimek <klimek at google.com> wrote:

> On Wed, Oct 23, 2013 at 10:42 PM, Eric Christopher <echristo at gmail.com>wrote:
>
>> (Beware, somewhat opinionated crap below - it is formatting and
>> indentation after all.)
>>
>> On Tue, Oct 22, 2013 at 11:18 PM, Daniel Jasper <djasper at google.com>wrote:
>>
>>> My thoughts:
>>> - "more legible" is highly subjective. I for one like clang-format's
>>> choice as it uses fewer lines and (more importantly) fewer different
>>> indents. So if I had to manually choose between the two, I would choose the
>>> former.
>>>
>>
>> I suppose, I'd have considered it more legible because it helps avoid
>> what I like to call the "wall of text" problem (not actually picking on
>> dblaikie, but it came up yesterday):
>>
>>
>>          sh_link = SectionIndexMap.lookup(Asm.getContext().getELFSection(
>>           SecName.substr(sizeof(".ARM.exidx") - 1), ELF::SHT_PROGBITS,
>>           ELF::SHF_EXECINSTR | ELF::SHF_ALLOC, SectionKind::getText(), 0,
>>           GroupName));
>>
>
> a) how would you want to layout this
>

This? Actually I'd want to get some of the variables pulled out into
temporaries if possible. This one is just hard to do anything with anyhow
(as I mentioned earlier) it just gave a wall of text example.

Though some searching for better examples does give me an idea of what I
might like... brought up the next question.

> b) why is this bad? to me it's actually very quick to read... (I'd
> probably lay it out manually exactly like that)
>
>

For me it seems to be a lot of cognitive load to just read it. After a
couple of lines without some other sort of indenting it just starts to look
like badly written perl, or line noise (same thing). Since this is also
nested calls I keep having to count parens to see where the end of each
particular call, etc is versus it being obvious from scanning the code.

Sadly I can also come up with ways that this makes the code more difficult
to read:

return dwarf::PubIndexEntryDescriptor(
        dwarf::GIEK_TYPE, CU->getLanguage() != dwarf::DW_LANG_C_plus_plus
                              ? dwarf::GIEL_STATIC
                              : dwarf::GIEL_EXTERNAL);

versus my formatting of:

return dwarf::PubIndexEntryDescriptor(dwarf::GIEK_TYPE,
                                          CU->getLanguage() !=
                                                  dwarf::DW_LANG_C_plus_plus
                                              ? dwarf::GIEL_STATIC
                                              : dwarf::GIEL_EXTERNAL);

And while maybe this all just screams that if I see what, to me, is an
unreadable wall of text it just means that the code should be factored in a
more readable fashion, I do find my preferred formatting a bit easier in
the average and don't find the more difficult above to be onerously less so.

Enh, maybe we'll just go with "if it still sucks to look at then maybe
refactor the code to be less dense and use small temporary variable names".
I don't know if that's what you guys are going for though.

-eric

>  This sort of thing is what you get with the current indenting scheme
>> (though to be fair, my patch won't fix this particular one). There's no
>> real hierarchical look going on here, and the indenting isn't helping
>> readability. I.e. if we only optimize for "fewer lines and fewer indents"
>> then we get more and more walls of text - especially when we're looking at
>> lots of nested calls with their own breaks at '('. :)
>>
>> - We have explicitly changed this for Google style and we are not
>>> changing it back. So if this change to submit this change, we need to
>>> introduce an additional style option (and I have no idea what to call it
>>> ;-) ). Maybe the best choice would be to pull out the actual penalty into
>>> something that can be configured per style. E.g.: PenaltyWrapCallAfterParen.
>>>
>>>
>> Totally up for that.
>>
>>
>>> - There is an alternative to consider. In earlier days of clang-format
>>> we used hanging identation (we still do with all other binary operators).
>>> I.e. the snippet you mention would be formatted as:
>>>
>>> CharSourceRange LineRange = CharSourceRange::getTokenRange(
>>>                                 Line.Tokens.front().Tok.getLo(),
>>>                                 Line.Tokens.back().Tok.getLoc());
>>>
>>> Now, arguably, this is preferable as it is more structured than what
>>> clang-format currently does and at the same time does not need three
>>> different indentations (which makes it look 'untidy'). That option,
>>> however, did not fly in Google style and we had to special-case all
>>> assignment expressions. Again, that is something we could change
>>> specifically for LLVM style. However, from offline discussions I assume
>>> that breaking after the opening parenthesis bothers you in more places.
>>> Just wanted to bring this back onto the table as I know Chandler is a big
>>> fan :-).
>>>
>>
>> It's definitely preferable, though as you surmise, not my preference. I
>> prefer groupings of things that are easy to see and in the "prefer breaking
>> after the ( versus breaking after the =" argument the "type and name of the
>> value being assigned into" is always going to lose to "fewer lines" :)
>>
>> FWIW I've run this patch across quite a bit of code at this point and it
>> largely comes out neutral for code size - the heuristic aspects in the rest
>> of clang format stop it from getting out of control. I think the net across
>> a bunch of files was about 4 lines. Mostly it just changes where we break,
>> it takes some special arguments or longish chains of names to actually
>> increase the indent.
>>
>> Thanks :)
>>
>> -eric
>>
>>
>>>
>>> Cheers,
>>> Daniel
>>>
>>>
>>> On Wed, Oct 23, 2013 at 2:46 AM, Eric Christopher <echristo at gmail.com>wrote:
>>>
>>>> Bah. The formatting didn't go in the email as I'd expected.
>>>>
>>>> Before:
>>>>
>>>>
>>>>     1.  CharSourceRange LineRange = CharSourceRange::getTokenRange(
>>>>    2.  Line.Tokens.front().Tok.getLo(), Line.Tokens.back().Tok.getLoc
>>>>    ());
>>>>    3.
>>>>
>>>>
>>>> After:
>>>>
>>>>
>>>>    1.  CharSourceRange LineRange =
>>>>    2.  CharSourceRange::getTokenRange(Line.Tokens.front().Tok.getLo(),
>>>>    3.  Line.Tokens.back().Tok.getLoc());
>>>>
>>>>
>>>> Whee.
>>>>
>>>> -eric
>>>>
>>>>
>>>>
>>>> On Tue, Oct 22, 2013 at 5:37 PM, Eric Christopher <echristo at gmail.com>wrote:
>>>>
>>>>> Hi Daniel,
>>>>>
>>>>> Thought I'd send it out and get some discussion if people care and
>>>>> then I can update the tests and commit if we think it's a good idea.
>>>>>
>>>>> It'll change code like in (one of the) failing tests:
>>>>>
>>>>> Value of: format(messUp(Code), Style)
>>>>>   Actual: "CharSourceRange LineRange =\n
>>>>> CharSourceRange::getTokenRange(Line.Tokens.front().Tok.getLo(),\n
>>>>>                              Line.Tokens.back().Tok.getLoc());"
>>>>> Expected: Code.str()
>>>>> Which is: "CharSourceRange LineRange =
>>>>> CharSourceRange::getTokenRange(\n    Line.Tokens.front().Tok.getLo(),
>>>>> Line.Tokens.back().Tok.getLoc());"
>>>>> [  FAILED  ] FormatTest.BreaksAfterAssignments (5 ms)
>>>>> [----------] 1 test from FormatTest (5 ms total)
>>>>>
>>>>> this:
>>>>>
>>>>> <pre style='color:#000000;background:#ffffff;'><html><body
>>>>> style='color:#000000; background:#ffffff; '><pre>
>>>>> CharSourceRange LineRange <span style='color:#808030; '>=</span>
>>>>>       CharSourceRange<span style='color:#800080;
>>>>> '>::</span>getTokenRange<span style='color:#808030;
>>>>> '>(</span>Line<span style='color:#808030; '>.</span>Tokens<span
>>>>> style='color:#808030; '>.</span>front<span style='color:#808030;
>>>>> '>(</span><span style='color:#808030; '>)</span><span
>>>>> style='color:#808030; '>.</span>Tok<span style='color:#808030;
>>>>> '>.</span>getLo<span style='color:#808030; '>(</span><span
>>>>> style='color:#808030; '>)</span><span style='color:#808030; '>,</span>
>>>>>                                      Line<span style='color:#808030;
>>>>> '>.</span>Tokens<span style='color:#808030; '>.</span>back<span
>>>>> style='color:#808030; '>(</span><span style='color:#808030;
>>>>> '>)</span><span style='color:#808030; '>.</span>Tok<span
>>>>> style='color:#808030; '>.</span>getLoc<span style='color:#808030;
>>>>> '>(</span><span style='color:#808030; '>)</span><span
>>>>> style='color:#808030; '>)</span><span style='color:#800080; '>;</span>
>>>>> </pre>
>>>>>
>>>>> versus:
>>>>>
>>>>> <pre style='color:#000000;background:#ffffff;'><html><body
>>>>> style='color:#000000; background:#ffffff; '><pre>
>>>>> CharSourceRange LineRange <span style='color:#808030; '>=</span>
>>>>> CharSourceRange<span style='color:#800080;
>>>>> '>::</span>getTokenRange<span style='color:#808030; '>(</span>
>>>>>         Line<span style='color:#808030; '>.</span>Tokens<span
>>>>> style='color:#808030; '>.</span>front<span style='color:#808030;
>>>>> '>(</span><span style='color:#808030; '>)</span><span
>>>>> style='color:#808030; '>.</span>Tok<span style='color:#808030;
>>>>> '>.</span>getLo<span style='color:#808030; '>(</span><span
>>>>> style='color:#808030; '>)</span><span style='color:#808030; '>,</span>
>>>>> Line<span style='color:#808030; '>.</span>Tokens<span
>>>>> style='color:#808030; '>.</span>back<span style='color:#808030;
>>>>> '>(</span><span style='color:#808030; '>)</span><span
>>>>> style='color:#808030; '>.</span>Tok<span style='color:#808030;
>>>>> '>.</span>getLoc<span style='color:#808030; '>(</span><span
>>>>> style='color:#808030; '>)</span><span style='color:#808030;
>>>>> '>)</span><span style='color:#800080; '>;</span>
>>>>> </pre>
>>>>>
>>>>> which while the former is more lines I think it is a much more legible
>>>>> general formatting style.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> -eric
>>>>>
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20131024/9dac6170/attachment.html>