[clang] [clang-format] Fix an off-by-1 bug with -length option (PR #143302)

kadir çetinkaya via cfe-commits cfe-commits at lists.llvm.org
Wed Jun 25 01:33:59 PDT 2025


================
@@ -296,12 +296,16 @@ static bool fillRanges(MemoryBuffer *Code,
     }
     if (!EmptyLengths)
       Length = Lengths[I];
+    if (Length == 0) {
+      errs() << "error: length should be at least 1\n";
+      return true;
+    }
     if (Offset + Length > CodeSize) {
       errs() << "error: invalid length " << Length << ", offset + length ("
-             << Offset + Length << ") is outside the file.\n";
+             << Offset + Length << ") is outside the file\n";
       return true;
     }
-    Ranges.push_back(tooling::Range(Offset, Length));
+    Ranges.push_back(tooling::Range(Offset, Length - 1));
----------------
kadircet wrote:

hi folks, we're seeing some regressions that's possibly related to this change (working on a repro right now). I am having a hard time following why this is the right thing to do. `tooling::Range` and command line flags are both (offset, length) pairs. `offset` is 0-based for both, and `length` is again coming from the same domain for both representations (in bytes).

moreover, we're only performing this mapping here, and not just couple of lines above (when we convert `--lines` to ranges.

I do agree that there's an off by one somewhere, but I think it's more likely to be in https://github.com/llvm/llvm-project/blob/main/clang/lib/Format/AffectedRangeManager.cpp#L61-L71.
Logic in there accepts two ranges `[B1, E1)` and `[B2, E2)` as intersecting when `E1==B2`, but those ranges are half open (as they're constructed from `offset + length` pairs).

https://github.com/llvm/llvm-project/pull/143302


More information about the cfe-commits mailing list