[PATCH] D154274: [MC] Allow targets to control whether '?' can be used in identifiers

Sergei Barannikov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 30 17:36:36 PDT 2023


barannikov88 created this revision.
Herald added a subscriber: hiraditya.
Herald added a project: All.
barannikov88 requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

The GNU assembler does not allow '?' in unquoted identifiers. The MC
assembler allows it as an extension (D1978 <https://reviews.llvm.org/D1978>).
'?' can already be disallowed at the start of an identifier by setting
AllowQuestionAtStartOfIdentifier to false. This patch adds another knob,
AllowQuestionInName, that controls whether '?' can appear after the first
character of an identifier. It defaults to true, so existing targets are
unaffected.

With the knob turned off, input such as "a?b" is lexed as three distinct
tokens (a, ?, b) instead of a single identifier.
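
To illustrate the effect on tokenization, here is a minimal standalone
sketch. It is not the LLVM code: the predicate mirrors the patched
isIdentifierChar, but it uses std::isalnum in place of LLVM's isAlnum, and
splitTokens is a hypothetical helper written only for this example. It
emits every non-identifier character as its own one-character token, which
is much cruder than what AsmLexer does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the patched predicate (assumption: std::isalnum
// is an acceptable substitute for LLVM's isAlnum in this sketch).
static bool isIdentifierChar(char C, bool AllowAt, bool AllowHash,
                             bool AllowQuestion) {
  return std::isalnum(static_cast<unsigned char>(C)) || C == '_' ||
         C == '$' || C == '.' || (AllowAt && C == '@') ||
         (AllowHash && C == '#') || (AllowQuestion && C == '?');
}

// Hypothetical helper: splits Input into maximal runs of identifier
// characters; any other character becomes a token of length one.
static std::vector<std::string> splitTokens(const std::string &Input,
                                            bool AllowQuestion) {
  std::vector<std::string> Tokens;
  std::size_t I = 0;
  while (I < Input.size()) {
    std::size_t Start = I;
    while (I < Input.size() &&
           isIdentifierChar(Input[I], /*AllowAt=*/false,
                            /*AllowHash=*/false, AllowQuestion))
      ++I;
    if (I == Start)
      ++I; // Not an identifier character: emit it on its own.
    Tokens.push_back(Input.substr(Start, I - Start));
  }
  return Tokens;
}
```

With AllowQuestion set to true, splitTokens("a?b", true) yields the single
token "a?b". With it set to false, the same input yields "a", "?" and "b".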

No in-tree target needs the new behavior, so this patch does not add any
tests.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D154274

Files:
  llvm/include/llvm/MC/MCAsmInfo.h
  llvm/include/llvm/MC/MCParser/MCAsmLexer.h
  llvm/lib/MC/MCParser/AsmLexer.cpp


Index: llvm/lib/MC/MCParser/AsmLexer.cpp
===================================================================
--- llvm/lib/MC/MCParser/AsmLexer.cpp
+++ llvm/lib/MC/MCParser/AsmLexer.cpp
@@ -33,6 +33,7 @@
 
 AsmLexer::AsmLexer(const MCAsmInfo &MAI) : MAI(MAI) {
   AllowAtInIdentifier = !StringRef(MAI.getCommentString()).startswith("@");
+  AllowQuestionInIdentifier = MAI.doesAllowQuestionInName();
   LexMotorolaIntegers = MAI.shouldUseMotorolaIntegers();
 }
 
@@ -145,9 +146,11 @@
 }
 
 /// LexIdentifier: [a-zA-Z_$.@?][a-zA-Z0-9_$.@#?]*
-static bool isIdentifierChar(char C, bool AllowAt, bool AllowHash) {
-  return isAlnum(C) || C == '_' || C == '$' || C == '.' || C == '?' ||
-         (AllowAt && C == '@') || (AllowHash && C == '#');
+static bool isIdentifierChar(char C, bool AllowAt, bool AllowHash,
+                             bool AllowQuestion) {
+  return isAlnum(C) || C == '_' || C == '$' || C == '.' ||
+         (AllowAt && C == '@') || (AllowHash && C == '#') ||
+         (AllowQuestion && C == '?');
 }
 
 AsmToken AsmLexer::LexIdentifier() {
@@ -157,13 +160,14 @@
     while (isDigit(*CurPtr))
       ++CurPtr;
 
-    if (!isIdentifierChar(*CurPtr, AllowAtInIdentifier,
-                          AllowHashInIdentifier) ||
+    if (!isIdentifierChar(*CurPtr, AllowAtInIdentifier, AllowHashInIdentifier,
+                          AllowQuestionInIdentifier) ||
         *CurPtr == 'e' || *CurPtr == 'E')
       return LexFloatLiteral();
   }
 
-  while (isIdentifierChar(*CurPtr, AllowAtInIdentifier, AllowHashInIdentifier))
+  while (isIdentifierChar(*CurPtr, AllowAtInIdentifier, AllowHashInIdentifier,
+                          AllowQuestionInIdentifier))
     ++CurPtr;
 
   // Handle . as a special case.
Index: llvm/include/llvm/MC/MCParser/MCAsmLexer.h
===================================================================
--- llvm/include/llvm/MC/MCParser/MCAsmLexer.h
+++ llvm/include/llvm/MC/MCParser/MCAsmLexer.h
@@ -47,6 +47,7 @@
   bool SkipSpace = true;
   bool AllowAtInIdentifier = false;
   bool AllowHashInIdentifier = false;
+  bool AllowQuestionInIdentifier = false;
   bool IsAtStartOfStatement = true;
   bool LexMasmHexFloats = false;
   bool LexMasmIntegers = false;
Index: llvm/include/llvm/MC/MCAsmInfo.h
===================================================================
--- llvm/include/llvm/MC/MCAsmInfo.h
+++ llvm/include/llvm/MC/MCAsmInfo.h
@@ -196,6 +196,9 @@
   /// Defaults to false.
   bool AllowAtInName = false;
 
+  /// This is true if the assembler allows '?' characters in symbol names.
+  bool AllowQuestionInName = true;
+
   /// This is true if the assembler allows the "?" character at the start of
   /// of a string to be lexed as an AsmToken::Identifier.
   /// If the AsmLexer determines that the string can be lexed as a possible
@@ -686,6 +689,7 @@
   unsigned getAssemblerDialect() const { return AssemblerDialect; }
   bool doesAllowAtInName() const { return AllowAtInName; }
   void setAllowAtInName(bool V) { AllowAtInName = V; }
+  bool doesAllowQuestionInName() const { return AllowQuestionInName; }
   bool doesAllowQuestionAtStartOfIdentifier() const {
     return AllowQuestionAtStartOfIdentifier;
   }


