[clang] [clang-format] Support of TableGen identifiers beginning with a number. (PR #78571)

Emilia Kond via cfe-commits cfe-commits at lists.llvm.org
Thu Jan 18 05:31:15 PST 2024


================
@@ -804,6 +806,46 @@ void FormatTokenLexer::handleTableGenMultilineString() {
       FirstLineText, MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
 }
 
+void FormatTokenLexer::handleTableGenNumericLikeIdentifier() {
+  FormatToken *Tok = Tokens.back();
+  // TableGen identifiers can begin with digits. Such tokens are lexed as
+  // numeric_constant now.
+  if (Tok->isNot(tok::numeric_constant))
+    return;
+  StringRef Text = Tok->TokenText;
+  // Identifiers cannot begin with + or -.
+  if (Text.size() < 1 || Text[0] == '+' || Text[0] == '-')
+    return;
+  // The following check is based on llvm::TGLexer::LexToken.
+  if (isdigit(Text[0])) {
+    size_t I = 0;
+    char NextChar = (char)0;
+    // Identifiers in TableGen may begin with digits. Skip to first non-digit.
+    do {
+      NextChar = Text[I++];
+    } while (I < Text.size() && isdigit(NextChar));
+    // All the characters are digits.
+    if (I >= Text.size())
+      return;
+    // Base character. But it does not check the first 0 and that the base is
+    // the second character.
----------------
rymiel wrote:

Is the "it does not check" a behaviour of this implementation only, or also of the TableGen compiler's implementation?
I noticed that in the tests below you mention that the lexer errors on some cases. Is this logic meant to match that behaviour?
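
To make the question concrete, here is a minimal standalone sketch of the kind of classification the quoted comment seems to describe. It is not the code from this PR and not the TableGen lexer itself. It assumes the rule is: skip the leading digits; if the next character is `x` or `b` followed by a digit valid for that base, treat the token as a number; otherwise treat it as an identifier when the next character is a letter or `_`. The helper name `startsLikeIdentifier` is hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of clang-format or TableGen.
// Returns true if a token that starts with a digit would be treated as an
// identifier rather than a number, under the assumed rule described above.
bool startsLikeIdentifier(std::string_view Text) {
  if (Text.empty() || !std::isdigit(static_cast<unsigned char>(Text[0])))
    return false;
  // Skip the leading run of digits.
  std::size_t I = 0;
  while (I < Text.size() && std::isdigit(static_cast<unsigned char>(Text[I])))
    ++I;
  // Only digits: a plain decimal number.
  if (I == Text.size())
    return false;
  char Next = Text[I];
  // A base marker followed by a digit valid for that base looks like a number.
  // Deliberately NOT checked: that the digit run is exactly "0", or that the
  // base marker is the second character.
  if ((Next == 'x' || Next == 'b') && I + 1 < Text.size()) {
    unsigned char After = static_cast<unsigned char>(Text[I + 1]);
    if (Next == 'b' && (After == '0' || After == '1'))
      return false;
    if (Next == 'x' && std::isxdigit(After))
      return false;
  }
  // Anything else is an identifier only if it continues like one.
  return std::isalpha(static_cast<unsigned char>(Next)) || Next == '_';
}
```

Under these assumptions `8i` and `1x` classify as identifiers, while `0x1F`, `0b10`, and also `12x3` classify as numbers. The last case is the "does not check the first 0" situation the question is about: whether such input should be accepted as a number or rejected is exactly what would need to match the TableGen compiler's behaviour.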

https://github.com/llvm/llvm-project/pull/78571

