[PATCH] D40310: Restructure how we break tokens.

Tue Nov 28 02:53:52 PST 2017

krasimir added inline comments.

================
Comment at: lib/Format/ContinuationIndenter.cpp:1525
+  if (!DryRun)
+    Token->adaptStartOfLine(0, Whitespaces);
+
----------------
If we indent here, shouldn't that also change ContentStartColumn?

================
Comment at: lib/Format/ContinuationIndenter.cpp:1557
         if (LineIndex < EndIndex - 1)
+          // The last line's penalty is handled in addNextStateToQueue().
           Penalty += Style.PenaltyExcessCharacter *
----------------
How does the last line's penalty get handled in addNextStateToQueue()?

================
Comment at: lib/Format/ContinuationIndenter.cpp:1565

-      // Check if compressing the whitespace range will bring the line length
-      // under the limit. If that is the case, we perform whitespace compression
-      // instead of inserting a line break.
-      unsigned RemainingTokenColumnsAfterCompression =
-          Token->getLineLengthAfterCompression(RemainingTokenColumns, Split);
-      if (RemainingTokenColumnsAfterCompression <= RemainingSpace) {
-        RemainingTokenColumns = RemainingTokenColumnsAfterCompression;
-        ReflowInProgress = true;
-        if (!DryRun)
-          Token->compressWhitespace(LineIndex, TailOffset, Split, Whitespaces);
-        DEBUG(llvm::dbgs() << "    Compressing below limit.\n");
-        break;
-      }
-
-      // Compute both the penalties for:
-      // - not breaking, and leaving excess characters
-      // - adding a new line break
-      assert(RemainingTokenColumnsAfterCompression > RemainingSpace);
-      unsigned ExcessCharactersPenalty =
-          (RemainingTokenColumnsAfterCompression - RemainingSpace) *
-          Style.PenaltyExcessCharacter;
-
-      unsigned BreakPenalty = NewBreakPenalty;
-      unsigned ColumnsUsed =
-          Token->getLineLengthAfterSplit(LineIndex, TailOffset, Split.first);
-      if (ColumnsUsed > ColumnLimit)
-        BreakPenalty +=
-            Style.PenaltyExcessCharacter * (ColumnsUsed - ColumnLimit);
-
-      DEBUG(llvm::dbgs() << "    Penalty excess: " << ExcessCharactersPenalty
-                         << "\n            break : " << BreakPenalty << "\n");
-      // Only continue to add the line break if the penalty of the excess
-      // characters is larger than the penalty of the line break.
-      // FIXME: This does not take into account when we can later remove the
-      // line break again due to a reflow.
-      if (ExcessCharactersPenalty < BreakPenalty) {
-        if (!DryRun)
-          Token->compressWhitespace(LineIndex, TailOffset, Split, Whitespaces);
-        // Do not set ReflowInProgress: we do not have any space left to
-        // reflow into.
-        Penalty += ExcessCharactersPenalty;
-        break;
+      if (!Current.isStringLiteral()) {
+        // Check whether the next natural split point after the current one
----------------
I really dislike this: we shouldn't have the reflow control flow depend on the specific type of the token. Better introduce a new virtual method that enables this branch and override it appropriately.

================
Comment at: lib/Format/ContinuationIndenter.cpp:1578
+            LineIndex, TailOffset + Split.first + Split.second, ColumnLimit,
+            ContentStartColumn + ToSplitColumns, CommentPragmasRegex);
+        // Compute the comlumns necessary to fit the next non-breakable sequence
----------------
nit: `BreakableToken::Split NextSplit = Token->getSplit(...)`

================
Comment at: lib/Format/ContinuationIndenter.cpp:1578
+            LineIndex, TailOffset + Split.first + Split.second, ColumnLimit,
+            ContentStartColumn + ToSplitColumns, CommentPragmasRegex);
+        // Compute the comlumns necessary to fit the next non-breakable sequence
----------------
krasimir wrote:
> nit: `BreakableToken::Split NextSplit = Token->getSplit(...)`
Hm, right now `ContentStartColumn + ToSplitColumns` points to the column where the character at offset `TailOffset + Split.first` is supposed to be. Then `NextSplit` is relative to the offset `TailOffset + Split.first + Split.second`, so IMO it shouldn't use `ContentStartColumn + ToSplitColumns` as a start column. I think that `ToSplitColumns` needs to be computed as follows:
```
unsigned ToSplitColumns = Token->getRangeLength(LineIndex, TailOffset, Split.first + Split.second, ContentStartColumn);
```
Also, a comment of the intended meaning of `ToSplitColumns` would be helpful.

================
Comment at: lib/Format/ContinuationIndenter.cpp:1579
+            ContentStartColumn + ToSplitColumns, CommentPragmasRegex);
+        // Compute the comlumns necessary to fit the next non-breakable sequence
+        // into the current line.
----------------
nit: `comlumns`

https://reviews.llvm.org/D40310