[PATCH] D99889: [AsmParser][ms][X86] Fix possible misbehaviour in parsing of special tokens at start of string.

Jonathan Crowther via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 13 12:10:02 PDT 2021


Jonathan.Crowther added inline comments.


================
Comment at: llvm/lib/MC/MCParser/AsmLexer.cpp:771-772
   switch (CurChar) {
   default:
-    if (MAI.doesAllowSymbolAtNameStart()) {
-      // Handle Microsoft-style identifier: [a-zA-Z_$.@?][a-zA-Z0-9_$.@#?]*
-      if (!isDigit(CurChar) &&
-          isIdentifierChar(CurChar, MAI.doesAllowAtInName(),
-                           AllowHashInIdentifier))
-        return LexIdentifier();
-    } else {
-      // Handle identifier: [a-zA-Z_.][a-zA-Z0-9_$.@]*
-      if (isalpha(CurChar) || CurChar == '_' || CurChar == '.')
-        return LexIdentifier();
-    }
+    // Handle identifier: [a-zA-Z_.?][a-zA-Z0-9_$.@#?]*
+    if (isalpha(CurChar) || CurChar == '_' || CurChar == '.' ||
----------------
Previously, the Microsoft-style identifier allowed `#` (not as a starting character), but otherwise the identifier (`[a-zA-Z_.][a-zA-Z0-9_$.@]*`) did not. Should we be checking `AllowHashInIdentifier`?


================
Comment at: llvm/unittests/MC/SystemZ/SystemZAsmLexerTest.cpp:544-545
+}
+
+TEST_F(SystemZAsmLexerTest, CheckDefaultHashAtStartOfIdentifier) {
+  StringRef AsmStr = "##lj1?23";
----------------
Not sure what you're checking here that the `CheckStrictCommentString*` tests don't already check.


================
Comment at: llvm/unittests/MC/SystemZ/SystemZAsmLexerTest.cpp:557
+  lexAndCheckTokens(AsmStr, ExpectedTokens);
+}
 } // end anonymous namespace
----------------
I think there should be a test for `$` at the start of an identifier for completeness. I don't see one in `SystemZAsmLexerTest.cpp` but you may know of one somewhere else. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99889/new/

https://reviews.llvm.org/D99889



More information about the llvm-commits mailing list