[PATCH] D33285: clang-format: do not reflow bullet lists

Francois Ferrand via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu May 18 05:54:00 PDT 2017


Typz marked an inline comment as done.
Typz added inline comments.


================
Comment at: lib/Format/BreakableToken.cpp:313
+  // Numbered lists may also start with a number followed by '.'
+  static const char *kNumberedListPattern = "^[0-9]+\\. ";
+  hasSpecialMeaningPrefix = hasSpecialMeaningPrefix ||
----------------
krasimir wrote:
> A problem with this is that sometimes you have a sentence ending with a number, like this one, in **2016.** If this sentence also happens to just go over the column width, its last part would be reflown and during subsequent passes it will be seen as a numbered list, which is super unfortunate. I'd like us to come up with a more refined strategy of handling this case. Maybe we should look at how others are doing it?
Looking at doxygen, it seems there is no extra protection: just a number followed by a dot...
So it means:
  * We should never break before a such a sequence, to avoid the issue.
  * We may also limit the expression to limit the size of the number: I am not sure there are cases where bullet lists with hundreds of items are used, esp. with explicit values (uses the auto-numbering -# would be much simpler in that case). Maybe a limit of 2 digits? The same limit would be applied to prevent breaking before a number followed by a dot.

What do you think?


================
Comment at: lib/Format/BreakableToken.cpp:315
+  hasSpecialMeaningPrefix = hasSpecialMeaningPrefix ||
+                            llvm::Regex(kNumberedListPattern).match(Content);
+
----------------
krasimir wrote:
> This builds an `llvm::Regex` on each invocation, which is wasteful.
I did this to keep the function re-entrant; but since the code is not multi-thread I can use a static variable instead.


https://reviews.llvm.org/D33285





More information about the cfe-commits mailing list