[llvm] [SourceMgr] Clean up handling of line ending characters (PR #120605)

Fangrui Song via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 6 21:58:32 PST 2025


================
@@ -91,19 +91,32 @@ static std::vector<T> &GetOrCreateOffsetCache(void *&OffsetCache,
   size_t Sz = Buffer->getBufferSize();
   assert(Sz <= std::numeric_limits<T>::max());
   StringRef S = Buffer->getBuffer();
-  for (size_t N = 0; N < Sz; ++N) {
-    if (S[N] == '\n')
-      Offsets->push_back(static_cast<T>(N));
+
+  // The cache always includes 0 (for the start of the first line) and Sz (so
+  // that you can always index by N+1 to find the end of line N, even if the
+  // last line has no terminating newline).
+  Offsets->push_back(0);
+  for (size_t N = 0; N != Sz;) {
+    while (N != Sz && S[N] != '\n' && S[N] != '\r')
+      ++N;
+    if (N == Sz)
+      break;
+
+    // Skip over CR, LF, CRLF or LFCR.
+    ++N;
+    if (N != Sz && (S[N - 1] ^ S[N]) == ('\r' ^ '\n'))
+      ++N;
+    Offsets->push_back(static_cast<T>(N));
   }
+  Offsets->push_back(static_cast<T>(Sz));
----------------
MaskRay wrote:

I don't think this final `Offsets->push_back(static_cast<T>(Sz))` is necessary. If the file doesn't end with `\n`, so be it.

https://github.com/llvm/llvm-project/pull/120605
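
For readers following the hunk above, here is a minimal standalone sketch of the scan it introduces, so the effect of the leading 0 and the trailing Sz entries can be checked on small inputs. It is not the LLVM code: `buildLineOffsets` and `lineText` are hypothetical names, `std::string_view` stands in for `StringRef`, and `std::size_t` replaces the templated offset type `T`.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper mirroring the loop in the hunk: returns the start
// offset of every line, plus a final entry equal to the buffer size.
std::vector<std::size_t> buildLineOffsets(std::string_view S) {
  std::vector<std::size_t> Offsets;
  const std::size_t Sz = S.size();

  Offsets.push_back(0); // start of the first line
  for (std::size_t N = 0; N != Sz;) {
    // Advance to the next CR or LF, or to the end of the buffer.
    while (N != Sz && S[N] != '\n' && S[N] != '\r')
      ++N;
    if (N == Sz)
      break;

    // Skip over CR, LF, CRLF or LFCR. The XOR test is true only when the
    // two adjacent characters are one '\r' and one '\n', in either order.
    ++N;
    if (N != Sz && (S[N - 1] ^ S[N]) == ('\r' ^ '\n'))
      ++N;
    Offsets.push_back(N); // start of the following line
  }
  Offsets.push_back(Sz); // the trailing entry questioned in the review
  return Offsets;
}

// Hypothetical accessor: text of zero-based line Line, including its line
// terminator if it has one. Requires Line + 1 < Offsets.size().
std::string_view lineText(std::string_view S,
                          const std::vector<std::size_t> &Offsets,
                          std::size_t Line) {
  assert(Line + 1 < Offsets.size());
  return S.substr(Offsets[Line], Offsets[Line + 1] - Offsets[Line]);
}
```

In this sketch, a buffer whose last line has no terminator, such as "a\r\nb\nc", yields the offsets 0, 3, 5, 6, and the last line can be read as the range from Offsets[2] to Offsets[3]. A buffer that does end with a terminator, such as "a\n", yields 0, 2, 2, so the trailing entry then describes an empty final line.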

