[PATCH] D40310: Restructure how we break tokens.
Manuel Klimek via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Nov 23 06:32:52 PST 2017
klimek added inline comments.
================
Comment at: lib/Format/BreakableToken.cpp:198
+ "Getting the length of a part of the string literal indicates that "
+ "the code tries to reflow it.");
+ return UnbreakableTailLength + Postfix.size() +
----------------
krasimir wrote:
> How about clients that explicitly pass `Length = Line.size() - Offset`?
That is different (I now also went and updated the comment for getRangeLength to explain that).
Generally, Length == Line.size() - Offset is still a portion of the content, as opposed to npos, which has a special meaning. I'm wondering whether I should just pull out a differently named method for it, now that I'm thinking about it.
================
Comment at: lib/Format/BreakableToken.h:100
+ /// \brief Returns the number of columns required to format the range at bytes
+ /// \p Offset to \p Offset \c + \p Length.
+ ///
----------------
krasimir wrote:
> Does this include the byte `Offset + Length`?
Reworded.
================
Comment at: lib/Format/BreakableToken.h:107
///
- /// Note that previous breaks are not taken into account. \p TailOffset is
- /// always specified from the start of the (original) line.
- /// \p Length can be set to StringRef::npos, which means "to the end of line".
- virtual unsigned
- getLineLengthAfterSplit(unsigned LineIndex, unsigned TailOffset,
- StringRef::size_type Length) const = 0;
+ /// \p StartColumn is the column at which the text starts, needed to compute
+ /// tab stops correctly.
----------------
krasimir wrote:
> `text` is ambiguous here: does it refer to the content of the line or to the range defined by the offset and length?
Introduced 'text' in the first paragraph and reworded.
================
Comment at: lib/Format/BreakableToken.h:120
/// \p LineIndex, if previously broken at \p TailOffset. If possible, do not
/// violate \p ColumnLimit.
virtual Split getSplit(unsigned LineIndex, unsigned TailOffset,
----------------
krasimir wrote:
> What is `ReflownColumn` used for?
Argh, actually not called ReflownColumn in the implementations :)
Changed and commented.
================
Comment at: lib/Format/BreakableToken.h:142
/// \brief Returns a whitespace range (offset, length) of the content at
- /// \p LineIndex such that the content preceding this range needs to be
- /// reformatted before any breaks are made to this line.
+ /// \p LineIndex such that the content of the current line is reflown to the
+ /// end of the previous one.
----------------
krasimir wrote:
> Does the current line refer to the line at LineIndex?
Reworded.
================
Comment at: lib/Format/ContinuationIndenter.cpp:1504
: Style.PenaltyBreakComment;
- unsigned RemainingSpace = ColumnLimit - Current.UnbreakableTailLength;
+ // Stores whether we introduce a break anywhere in the token.
bool BreakInserted = Token->introducesBreakBeforeToken();
----------------
krasimir wrote:
> Does a reflow count as a break?
I do believe so (well, the break in the reflow counts, the reflow itself is not a break, but removing a break :)
================
Comment at: unittests/Format/FormatTest.cpp:10603
format("\"一\t二 \t三 四 五\t六 \t七 八九十\tqq\"",
getLLVMStyleWithColumns(11)));
----------------
krasimir wrote:
> What happened here?
It was wrong, it's now correct :)
Those characters are printed in double-width, and llvm's encoding library correctly returns this. Reflowing previously did not correctly count that, but used the character count instead.
At offset 8 (after a \t), C " pokes one column over the limit (C == double-width character, space == 1, " == 1 -> 4 -> 12 columns.
================
Comment at: unittests/Format/FormatTestComments.cpp:1104
+ " * doesn't fit on\n"
+ " * one line. */",
format("/* "
----------------
krasimir wrote:
> I think this is desirable. The jsdoc folks really wanted to fix-up the `*/` at the end. I think it has to do with `DelimitersOnNewline`.
> If this already works in Java and js mode, that could be good enough.
I don't know whether it does - I only know I didn't break any tests :)
================
Comment at: unittests/Format/FormatTestComments.cpp:2108
+ // FIXME: This assumes we do not continue compressing whitespace once we are
+ // in reflow mode. Consider compressing whitespace.
+
----------------
krasimir wrote:
> I thinks that we should be compressing whitespace in reflow mode too.
Correct, but in a different patch - this patch is a strict improvement regarding whitespace compression, and adding more would make it more complex.
================
Comment at: unittests/Format/FormatTestComments.cpp:2149
+ // to keep this?
+ EXPECT_EQ("// some text\n"
+ "// that\n"
----------------
krasimir wrote:
> This is like this for cases like lists in comments:
> ```
> blah-blah-blah:
> 1. blah
> 2. blah-blah
> ```
> I think here the block comments behavior might be wrong.
Note that on the doc I shared you voted the reverse ;)
https://reviews.llvm.org/D40310
More information about the cfe-commits
mailing list