[PATCH] D99889: [AsmParser][ms][X86] Fix possible misbehaviour in parsing of special tokens at start of string.
Jonathan Crowther via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 13 12:10:02 PDT 2021
Jonathan.Crowther added inline comments.
================
Comment at: llvm/lib/MC/MCParser/AsmLexer.cpp:771-772
switch (CurChar) {
default:
- if (MAI.doesAllowSymbolAtNameStart()) {
- // Handle Microsoft-style identifier: [a-zA-Z_$.@?][a-zA-Z0-9_$.@#?]*
- if (!isDigit(CurChar) &&
- isIdentifierChar(CurChar, MAI.doesAllowAtInName(),
- AllowHashInIdentifier))
- return LexIdentifier();
- } else {
- // Handle identifier: [a-zA-Z_.][a-zA-Z0-9_$.@]*
- if (isalpha(CurChar) || CurChar == '_' || CurChar == '.')
- return LexIdentifier();
- }
+ // Handle identifier: [a-zA-Z_.?][a-zA-Z0-9_$.@#?]*
+ if (isalpha(CurChar) || CurChar == '_' || CurChar == '.' ||
----------------
Previously, the Microsoft-style identifier allowed `#` (not as a starting character), but otherwise the identifier (`[a-zA-Z_.][a-zA-Z0-9_$.@]*`) did not. Should we be checking `AllowHashInIdentifier`?
================
Comment at: llvm/unittests/MC/SystemZ/SystemZAsmLexerTest.cpp:544-545
+}
+
+TEST_F(SystemZAsmLexerTest, CheckDefaultHashAtStartOfIdentifier) {
+ StringRef AsmStr = "##lj1?23";
----------------
Not sure what you're checking here that the `CheckStrictCommentString*` tests don't already check.
================
Comment at: llvm/unittests/MC/SystemZ/SystemZAsmLexerTest.cpp:557
+ lexAndCheckTokens(AsmStr, ExpectedTokens);
+}
} // end anonymous namespace
----------------
I think there should be a test for `$` at the start of an identifier for completeness. I don't see one in `SystemZAsmLexerTest.cpp` but you may know of one somewhere else.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99889/new/
https://reviews.llvm.org/D99889
More information about the llvm-commits
mailing list