[PATCH] D33589: clang-format: consider not splitting tokens in optimization

Tue Jun 6 04:59:48 PDT 2017

Typz added a comment.

In https://reviews.llvm.org/D33589#769893, @krasimir wrote:

> I think that what you're trying to solve is not practically that important, is unlikely to improve the handling of comments, and will add a lot of complexity.

Not sure the 'approach' I have in this patch is nice, but it does solve my problem: it is quite simple, and avoids reflowing (and actually, sometimes breaking, since comments are not so structured...) the comments.
e.g. if I have a line followed by a relatively long comment, with my patch it will really consider the penaltys properly, and not split a line which is slightly longer [with ColumnLimit=30] :

  int a; //< a very long comment

vs

  int a; //< a very long
         //< comment

> From a usability perspective, I think that people are happy enough when their comments don't exceed the line limit. I personally wouldn't want the opposite to happen. I've even seen style guides that have 80 columns limit for comments and 100 for code.

That is a matter of taste and coding style: I prefer overflowing by a few characters instead of introducing an extra line... I see no reason to allow PenaltyExcessCharacter

> Comments are treated in a somewhat ad-hoc style in clang-format, which is because they are inherently free text and don't have a specific format. People just expect comments to be handled quite differently than source code. There are at least three separate parts in clang-format that make up the formatting of comments: the normal parsing and optimization pipeline, the BreakableLineCommentSection / BreakableBlockComment classes that are responsible for breaking and reflowing inside comment sections, and the consecutive trailing comment alignment in the WhitespaceManager. This patch takes into account the first aspect but not the consequences for the other aspects: for example it allows for the first line comment token in a multiline line comment section to get out of the column limit, but will reflow the rest of the lines. A WhitespaceManager-related issue is that because a trailing line comment for some declaration might not get split, it might not be aligned with the surrounding trailing line comments, which I think is a less desirable effect.

Not sure I understand the case you are speaking about, can you please provide an exemple?

(on a separate topic, I could indeed not find a penalty for breaking alignment; I think the optimizer should account for that as well... any hint on where to look for adding this?)

> Optimizing the layout in comment sections by using the optimizing formatter sounds good, but because comments mostly contain free text that can be structured in unexpected ways, I think that the ad-hoc approach works better. Think of not destroying ASCII art and supporting bulleted and numbered lists for example. We don't really want to leak lots of comment-related formatting tweaks into the general pipeline itself.

Good, one less choice :-)

> I see you point that PenaltyBreakComment and PenaltyExcessCharacter are not taken into account while comment placement, but they are taken into account at a higher level by treating the whole comment section as a unit. For example, if you have a long declaration that has multiple potential split points followed by a long trailing comment section, the penalty induced by the number of lines that are needed and the number of unbroken characters for the formatting of the comment section is taken into account while optimizing the layout of the whole code fragment.

If you reduce PenaltyExcessCharacter you see that comments can never overflow: the optimizer will consider splitting the comments or splitting the remaining of the code, but will (currently) not allow the comment to overflow just a bit.

> The formatted currently supports somewhat limited version of allowing comment lines exceeding the column limit, like long hyperlinks for example. I think that if there are some other examples which are both widespread and super annoying, we may consider them separately.

This patch does not try to address these special cases, and should not change the behavior in this case.

https://reviews.llvm.org/D33589