[PATCH] D33589: clang-format: consider not splitting tokens in optimization

Tue Nov 21 07:41:08 PST 2017

Typz added a comment.

In https://reviews.llvm.org/D33589#924716, @klimek wrote:

> One interesting trade-off I'm running into:
>  My gut feeling is that we really want to make local decisions about whether we want to break/reflow - this makes the code significantly simpler (IMO), and handles all tests in this patch correctly, but is fundamentally limiting the global optimizations we can do. Specifically, we would not correctly reflow this:
>
>   //       |< limit
>   // foo bar
>   // baz
>   // x
>
> to
>
>   // foo
>   // bar
>   // baz x
>
> when the excess character limit is low.

As I can see with your patch, local decision does not account for accumulated penalty on multi-line comment, and will thus give unexpected (e.g. no change) result when each line overlaps by a few characters, but not enough to trigger a break at this line.

> That would be a point for global optimization, but I find it really hard to understand exactly why it's ok to do it. Won't we get things like this wrong:
> 
>   Limit: 13
>   // foo  bar baz
>   // bab      bob
> 
> as we'll not compress whitespace?

Indeed, this patch would not trigger whitespace compression when not reflowing; it would compare "not doing anything" (no reflow, no whitespace compression) with the complete reflowing (including whitespace compression). I don't think that would break anything, but indeed we could possibly get even better result by trying to apply whitespace compression in the no-reflow case [which should be simple, just a bit more code at line 1376 in the version of the patch].

https://reviews.llvm.org/D33589