[cfe-commits] r139116 - in /cfe/trunk: lib/Lex/Lexer.cpp test/Lexer/bcpl-escaped-newline.c

Benjamin Kramer benny.kra at googlemail.com
Mon Sep 5 00:19:40 PDT 2011


Author: d0k
Date: Mon Sep  5 02:19:39 2011
New Revision: 139116

URL: http://llvm.org/viewvc/llvm-project?rev=139116&view=rev
Log:
Speed up BCPL comment lexing by looking aggressively for newlines and then scanning backwards to see if the newline is escaped.

3% speedup in preprocessing all of clang with -Eonly. Also includes a small testcase for coverage.
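
As a rough illustration of the approach described above (not the actual Lexer.cpp code), the scan can be sketched as a standalone function. The names findLineCommentEnd and isHorizontalSpace are hypothetical, the buffer is a std::string rather than a raw pointer, and diagnostics and the getAndAdvanceChar fallback are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not Clang's API): space, tab, form feed, vertical tab.
static bool isHorizontalSpace(char C) {
  return C == ' ' || C == '\t' || C == '\f' || C == '\v';
}

// Sketch: return the index of the newline that really ends the line comment
// beginning at Start, or Text.size() if the buffer ends first.
std::size_t findLineCommentEnd(const std::string &Text, std::size_t Start,
                               bool Trigraphs) {
  assert(Start <= Text.size());
  std::size_t Pos = Start;
  while (true) {
    // Fast loop: only a newline stops the scan; '\\' and '?' are ignored.
    while (Pos < Text.size() && Text[Pos] != '\n' && Text[Pos] != '\r')
      ++Pos;
    if (Pos == Text.size())
      return Pos; // End of buffer.

    // Found a newline: scan backwards over trailing horizontal whitespace.
    std::size_t Esc = Pos;
    while (Esc > Start && isHorizontalSpace(Text[Esc - 1]))
      --Esc;

    bool Escaped = false;
    if (Esc > Start && Text[Esc - 1] == '\\')
      Escaped = true; // Backslash-escaped newline.
    else if (Trigraphs && Esc >= Start + 3 &&
             Text.compare(Esc - 3, 3, "??/") == 0)
      Escaped = true; // Trigraph-escaped newline.

    if (!Escaped)
      return Pos; // A real newline ends the comment.

    // Skip the escaped newline, treating "\r\n" or "\n\r" as one unit.
    char First = Text[Pos++];
    if (Pos < Text.size() && (Text[Pos] == '\n' || Text[Pos] == '\r') &&
        Text[Pos] != First)
      ++Pos;
  }
}
```

The point of this shape is that the common case, a comment with no backslash or '?' near its end, never leaves the tight inner loop until it reaches the newline; the backward check runs once per newline rather than once per suspicious character.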

Added:
    cfe/trunk/test/Lexer/bcpl-escaped-newline.c
Modified:
    cfe/trunk/lib/Lex/Lexer.cpp

Modified: cfe/trunk/lib/Lex/Lexer.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/Lexer.cpp?rev=139116&r1=139115&r2=139116&view=diff
==============================================================================
--- cfe/trunk/lib/Lex/Lexer.cpp (original)
+++ cfe/trunk/lib/Lex/Lexer.cpp Mon Sep  5 02:19:39 2011
@@ -1635,20 +1635,28 @@
   char C;
   do {
     C = *CurPtr;
-    // FIXME: Speedup BCPL comment lexing.  Just scan for a \n or \r character.
-    // If we find a \n character, scan backwards, checking to see if it's an
-    // escaped newline, like we do for block comments.
-
     // Skip over characters in the fast loop.
     while (C != 0 &&                // Potentially EOF.
-           C != '\\' &&             // Potentially escaped newline.
-           C != '?' &&              // Potentially trigraph.
            C != '\n' && C != '\r')  // Newline or DOS-style newline.
       C = *++CurPtr;
 
-    // If this is a newline, we're done.
-    if (C == '\n' || C == '\r')
-      break;  // Found the newline? Break out!
+    const char *NextLine = CurPtr;
+    if (C != 0) {
+      // We found a newline, see if it's escaped.
+      const char *EscapePtr = CurPtr-1;
+      while (isHorizontalWhitespace(*EscapePtr)) // Skip whitespace.
+        --EscapePtr;
+
+      if (*EscapePtr == '\\') // Escaped newline.
+        CurPtr = EscapePtr;
+      else if (EscapePtr[0] == '/' && EscapePtr[-1] == '?' &&
+               EscapePtr[-2] == '?') // Trigraph-escaped newline.
+        CurPtr = EscapePtr-2;
+      else
+        break; // This is a newline, we're done.
+
+      C = *CurPtr;
+    }
 
     // Otherwise, this is a hard case.  Fall back on getAndAdvanceChar to
     // properly decode the character.  Read it in raw mode to avoid emitting
@@ -1660,6 +1668,13 @@
     C = getAndAdvanceChar(CurPtr, Result);
     LexingRawMode = OldRawMode;
 
+    // If we read only one character, then no special handling is needed.
+    // We're done and can skip forward to the newline.
+    if (C != 0 && CurPtr == OldPtr+1) {
+      CurPtr = NextLine;
+      break;
+    }
+
     // If the char that we finally got was a \n, then we must have had something
     // like \<newline><newline>.  We don't want to have consumed the second
     // newline, we want CurPtr, to end up pointing to it down below.

Added: cfe/trunk/test/Lexer/bcpl-escaped-newline.c
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Lexer/bcpl-escaped-newline.c?rev=139116&view=auto
==============================================================================
--- cfe/trunk/test/Lexer/bcpl-escaped-newline.c (added)
+++ cfe/trunk/test/Lexer/bcpl-escaped-newline.c Mon Sep  5 02:19:39 2011
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 -Eonly -trigraphs %s
+// RUN: %clang_cc1 -Eonly -verify %s
+
+//\
+#error bar
+
+//??/
+#error qux // expected-error {{qux}}
+
+// Trailing whitespace!
+//\ 
+#error quux




