[clang] ff77071 - [clang][Lexer] Make raw and normal lexer behave the same for line comments

Kadir Cetinkaya via cfe-commits cfe-commits at lists.llvm.org
Mon Jan 31 07:15:22 PST 2022


Author: Kadir Cetinkaya
Date: 2022-01-31T16:15:16+01:00
New Revision: ff77071a4d672fab7c8b30bea8525b89be8596fc

URL: https://github.com/llvm/llvm-project/commit/ff77071a4d672fab7c8b30bea8525b89be8596fc
DIFF: https://github.com/llvm/llvm-project/commit/ff77071a4d672fab7c8b30bea8525b89be8596fc.diff

LOG: [clang][Lexer] Make raw and normal lexer behave the same for line comments

Normally the lexer uses heuristics to treat `//*` specially in language modes
that don't have line comments (it emits `/` instead of starting a comment).
Unfortunately, this only applies to the first occurrence of a line comment
inside the file, as subsequent line comments are treated as if the language
had support for them.

Moreover, this only holds in normal lexing mode: in raw mode, all occurrences
of line comments received this treatment, which created discrepancies when
comparing expanded and spelled tokens.

The proper fix would be to treat every line comment followed by a `*` the same
way, but that would break some code that clang accepts today. So instead we
introduce the same bug into raw lexing mode.
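
The behaviour described above can be sketched with a small toy model. This is
not clang's code: `TokKind`, `classifyLines` and the `LineComment` parameter
are hypothetical names, with `LineComment` standing in for
`LangOpts.LineComment`. It assumes every input line starts with `//` and only
models the decision made for those two characters.

```cpp
#include <string>
#include <vector>

// Toy model only: these names are hypothetical and not part of clang's API.
enum class TokKind { Comment, Slash };

// Classifies each line that starts with "//" the way the log describes.
// `LineComment` stands in for LangOpts.LineComment and is taken by value,
// so the caller's flag is untouched.
std::vector<TokKind> classifyLines(const std::vector<std::string> &Lines,
                                   bool LineComment) {
  std::vector<TokKind> Kinds;
  for (const std::string &Line : Lines) {
    // With line comments disabled, "//*" is lexed as "/" followed by the
    // start of a block comment; any other "//" still starts a comment.
    bool TreatAsComment = LineComment || Line.size() < 3 || Line[2] != '*';
    if (!TreatAsComment) {
      Kinds.push_back(TokKind::Slash);
      continue;
    }
    // Skipping a line comment marks line comments as enabled, so every
    // later "//*" is treated as a comment too.
    LineComment = true;
    Kinds.push_back(TokKind::Comment);
  }
  return Kinds;
}
```

In this model, the two lines `// First` and `//* Second` with the flag initially
off are both classified as comments, which mirrors the input used by the unit
test in this commit. Before this change, the raw lexer behaved as if the
`LineComment = true` assignment were missing, so it would have produced a `/`
for the second line.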

Fixes https://github.com/clangd/clangd/issues/1003.

Differential Revision: https://reviews.llvm.org/D118471

Added: 
    

Modified: 
    clang/lib/Lex/Lexer.cpp
    clang/unittests/Lex/LexerTest.cpp

Removed: 
    


################################################################################
diff  --git a/clang/lib/Lex/Lexer.cpp b/clang/lib/Lex/Lexer.cpp
index 89e89c7c1f17..a180bba365cf 100644
--- a/clang/lib/Lex/Lexer.cpp
+++ b/clang/lib/Lex/Lexer.cpp
@@ -2378,8 +2378,9 @@ bool Lexer::SkipLineComment(Token &Result, const char *CurPtr,
                             bool &TokAtPhysicalStartOfLine) {
   // If Line comments aren't explicitly enabled for this language, emit an
   // extension warning.
-  if (!LangOpts.LineComment && !isLexingRawMode()) {
-    Diag(BufferPtr, diag::ext_line_comment);
+  if (!LangOpts.LineComment) {
+    if (!isLexingRawMode()) // There's no PP in raw mode, so can't emit diags.
+      Diag(BufferPtr, diag::ext_line_comment);
 
     // Mark them enabled so we only emit one warning for this translation
     // unit.

diff  --git a/clang/unittests/Lex/LexerTest.cpp b/clang/unittests/Lex/LexerTest.cpp
index 319c63f6a50b..df22e775314a 100644
--- a/clang/unittests/Lex/LexerTest.cpp
+++ b/clang/unittests/Lex/LexerTest.cpp
@@ -23,6 +23,8 @@
 #include "clang/Lex/ModuleLoader.h"
 #include "clang/Lex/Preprocessor.h"
 #include "clang/Lex/PreprocessorOptions.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/StringRef.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 #include <memory>
@@ -632,4 +634,27 @@ TEST_F(LexerTest, CreatedFIDCountForPredefinedBuffer) {
   EXPECT_EQ(SourceMgr.getNumCreatedFIDsForFileID(PP->getPredefinesFileID()),
             1U);
 }
+
+TEST_F(LexerTest, RawAndNormalLexSameForLineComments) {
+  const llvm::StringLiteral Source = R"cpp(
+  // First line comment.
+  //* Second line comment which is ambigious.
+  )cpp";
+  LangOpts.LineComment = false;
+  auto Toks = Lex(Source);
+  auto &SM = PP->getSourceManager();
+  auto SrcBuffer = SM.getBufferData(SM.getMainFileID());
+  Lexer L(SM.getLocForStartOfFile(SM.getMainFileID()), PP->getLangOpts(),
+          SrcBuffer.data(), SrcBuffer.data(),
+          SrcBuffer.data() + SrcBuffer.size());
+
+  auto ToksView = llvm::makeArrayRef(Toks);
+  clang::Token T;
+  while (!L.LexFromRawLexer(T)) {
+    ASSERT_TRUE(!ToksView.empty());
+    EXPECT_EQ(T.getKind(), ToksView.front().getKind());
+    ToksView = ToksView.drop_front();
+  }
+  EXPECT_TRUE(ToksView.empty());
+}
 } // anonymous namespace


        

