[PATCH] D68682: format::cleanupAroundReplacements removes whole line when Removals leave previously non-blank line blank

Conrad Poelman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sun Dec 22 20:51:35 PST 2019


poelmanc marked 13 inline comments as done.
poelmanc added inline comments.


================
Comment at: clang/lib/Format/Format.cpp:2385
+  const char *FileText = Code.data();
+  StringRef FilePath; // Must be the same for every Replacement
+  for (const auto &R : Replaces) {
----------------
kadircet wrote:
> maybe assign `FilePath = Replaces.begin()->getFilePath()` here, and early exit in the beginning if `Replaces` is empty?
> Then you can just `assert(FilePath == R.getFilePath())`
Thank you for the detailed review of the algorithm and the suggestions, sorry it took me so long to get back to this. Done.


================
Comment at: clang/lib/Format/Format.cpp:2391
+
+    int CurrentRLineStartPos = RStartPos;
+    while (CurrentRLineStartPos > 0 &&
----------------
kadircet wrote:
> it seems like this loop is trying to find line start, would be nice to clarify with a comment.
> 
> If so seems like the logic is a little buggy, for example:
> 
> `int a^;` if the offset is at `^`(namely at `a`) then this will return current position as line start, since `a` is preceded by a whitespace.
> Maybe change the logic to find the first `\n` before current offset, and skip any whitespaces after that `\n` (if really necessary)
> 
> btw, please do explain why you are skipping the preceding whitespaces in the comment.
> 
> could you also add a test case for this one?
Good point, I clarified by refactoring this code into a simple helper function `FindLineStart`.

I don't believe there's buggy logic but the term //vertical whitespace// may have caused confusion - it includes only newline characters (linefeed & carriage returns) whereas //whitespace// includes those plus spaces and tabs. Now that the function is called `FindLineStart` the logic is hopefully clear.


================
Comment at: clang/lib/Format/Format.cpp:2398
+    assert(CurrentRLineStartPos >= LineStartPos);
+    if (CurrentRLineStartPos != LineStartPos) {
+      // We've moved on to a new line. Wrap up the old one before moving on.
----------------
kadircet wrote:
> this logic might not work in cases like:
> 
> ```
> in^t a;
> int^ b;
> ```
> 
> let's assume you've got replacements at places marked with `^` again, then they would have same start position so you would miss the line change event.
> maybe count number of `\n`s in the previous loop to detect current line number.
> 
> Also please add a test for this case as well.
I'm not sure I understand this concern; if you could draft up a specific test case I'd be happy to ensure it works.


================
Comment at: clang/lib/Format/Format.cpp:2427
+  // (b) the original line was *not* blank.
+  for (const auto &LineCheckedThrough : PotentialWholeLineCuts) {
+    const int LineStartPos = LineCheckedThrough.first;
----------------
kadircet wrote:
> the second loop seems like an over-kill and complicates the code a lot.
> 
> IIUC, it is just checking whether any text after replacements is "whitespace/newline" and if so deletes the whole line.
> Instead of doing all of this, in the previous loop, whenever we see a line change could we scan until the EOL, and delete the EOL token if we didn't hit anything but whitespaces?
> 
> this would mean you won't need to store `PotentialWholeLineCuts`.
I agree and thanks for the feedback! `PotentialWholeLineCuts` felt a bit hacky so I've eliminated it by creating a new `MaybeRemoveBlankLine` helper function that gets called in two places, eliminating the need for the second pass and the `PotentialWholeLineCuts` variable. Hopefully the code is easier to follow now.


================
Comment at: clang/unittests/Format/CleanupTest.cpp:368
+TEST_F(CleanUpReplacementsTest, RemoveLineWhenAllNonWhitespaceRemoved) {
+  std::string Code = "namespace A { // Useless comment\n"
+                     "  int x\t = 0;\t\n"
----------------
kadircet wrote:
> could you replace these with raw string literals instead to get rid of `\n`s.
I like that modernization suggestion but as a fairly new llvm contributor I follow the style I see in the surrounding code. Perhaps we should submit a seperate patch to modernize whole files at a time to use raw string literals and see how the change is received by the community.


================
Comment at: clang/unittests/Format/CleanupTest.cpp:376
+  tooling::Replacements Replaces =
+      toReplacements({createReplacement(getOffset(Code, 1, 14), 19, ""),
+                      createReplacement(getOffset(Code, 2, 3), 3, ""),
----------------
kadircet wrote:
> can you make use of `Annotations` library in `llvm/include/llvm/Testing/Support/Annotations.h` for getting offsets into the code?
> 
> same for the following test cases
`Annotations` looks quite handy, but again I'd rather get this patch through consistent with the file's existing style and submit a separate patch to modernize a whole file at a time to use `Annotations`.


Repository:
  rCTE Clang Tools Extra

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68682/new/

https://reviews.llvm.org/D68682





More information about the cfe-commits mailing list