[PATCH] D40310: Restructure how we break tokens.

Manuel Klimek via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Nov 23 06:32:52 PST 2017


klimek added inline comments.


================
Comment at: lib/Format/BreakableToken.cpp:198
+         "Getting the length of a part of the string literal indicates that "
+         "the code tries to reflow it.");
+  return UnbreakableTailLength + Postfix.size() +
----------------
krasimir wrote:
> How about clients that explicitly pass `Length = Line.size() - Offset`?
That is different (I now also went and updated the comment for getRangeLength to explain that).

Generally, Length == Line.size() - Offset is still a portion of the content, as opposed to npos, which has a special meaning. I'm wondering whether I should just pull out a differently named method for it, now that I'm thinking about it.


================
Comment at: lib/Format/BreakableToken.h:100
+  /// \brief Returns the number of columns required to format the range at bytes
+  /// \p Offset to \p Offset \c + \p Length.
+  ///
----------------
krasimir wrote:
> Does this include the byte `Offset + Length`?
Reworded.


================
Comment at: lib/Format/BreakableToken.h:107
   ///
-  /// Note that previous breaks are not taken into account. \p TailOffset is
-  /// always specified from the start of the (original) line.
-  /// \p Length can be set to StringRef::npos, which means "to the end of line".
-  virtual unsigned
-  getLineLengthAfterSplit(unsigned LineIndex, unsigned TailOffset,
-                          StringRef::size_type Length) const = 0;
+  /// \p StartColumn is the column at which the text starts, needed to compute
+  ///    tab stops correctly.
----------------
krasimir wrote:
> `text` is ambiguous here: does it refer to the content of the line or to the range defined by the offset and length?
Introduced 'text' in the first paragraph and reworded.


================
Comment at: lib/Format/BreakableToken.h:120
   /// \p LineIndex, if previously broken at \p TailOffset. If possible, do not
   /// violate \p ColumnLimit.
   virtual Split getSplit(unsigned LineIndex, unsigned TailOffset,
----------------
krasimir wrote:
> What is `ReflownColumn` used for?
Argh, actually not called ReflownColumn in the implementations :)
Changed and commented.


================
Comment at: lib/Format/BreakableToken.h:142
   /// \brief Returns a whitespace range (offset, length) of the content at
-  /// \p LineIndex such that the content preceding this range needs to be
-  /// reformatted before any breaks are made to this line.
+  /// \p LineIndex such that the content of the current line is reflown to the
+  /// end of the previous one.
----------------
krasimir wrote:
> Does the current line refer to the line at LineIndex?
Reworded.


================
Comment at: lib/Format/ContinuationIndenter.cpp:1504
                                  : Style.PenaltyBreakComment;
-  unsigned RemainingSpace = ColumnLimit - Current.UnbreakableTailLength;
+  // Stores whether we introduce a break anywhere in the token.
   bool BreakInserted = Token->introducesBreakBeforeToken();
----------------
krasimir wrote:
> Does a reflow count as a break?
I do believe so (well, the break in the reflow counts, the reflow itself is not a break, but removing a break :)


================
Comment at: unittests/Format/FormatTest.cpp:10603
             format("\"一\t二 \t三 四 五\t六 \t七 八九十\tqq\"",
                    getLLVMStyleWithColumns(11)));
 
----------------
krasimir wrote:
> What happened here?
It was wrong, it's now correct :)
Those characters are printed in double-width, and llvm's encoding library correctly returns this. Reflowing previously did not correctly count that, but used the character count instead.
At offset 8 (after a \t), C " pokes one column over the limit (C == double-width character, space == 1, " == 1 -> 4 -> 12 columns.


================
Comment at: unittests/Format/FormatTestComments.cpp:1104
+            " * doesn't fit on\n"
+            " * one line.  */",
             format("/* "
----------------
krasimir wrote:
> I think this is desirable. The jsdoc folks really wanted to fix-up the `*/` at the end. I think it has to do with `DelimitersOnNewline`.
> If this already works in Java and js mode, that could be good enough.
I don't know whether it does - I only know I didn't break any tests :)


================
Comment at: unittests/Format/FormatTestComments.cpp:2108
+  // FIXME: This assumes we do not continue compressing whitespace once we are
+  // in reflow mode. Consider compressing whitespace.
+
----------------
krasimir wrote:
> I thinks that we should be compressing whitespace in reflow mode too.
Correct, but in a different patch - this patch is a strict improvement regarding whitespace compression, and adding more would make it more complex.


================
Comment at: unittests/Format/FormatTestComments.cpp:2149
+  // to keep this?
+  EXPECT_EQ("// some text\n"
+            "// that\n"
----------------
krasimir wrote:
> This is like this for cases like lists in comments:
> ```
> blah-blah-blah:
>   1. blah
>   2. blah-blah
> ```
> I think here the block comments behavior might be wrong.
Note that on the doc I shared you voted the reverse ;)


https://reviews.llvm.org/D40310





More information about the cfe-commits mailing list