[clang] [clang-format] Support of TableGen tokens with unary operator like form, bang operators and numeric literals. (PR #78996)

Hirofumi Nakamura via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 23 05:04:23 PST 2024


================
@@ -276,13 +276,44 @@ void FormatTokenLexer::tryMergePreviousTokens() {
       return;
     }
   }
-  // TableGen's Multi line string starts with [{
-  if (Style.isTableGen() && tryMergeTokens({tok::l_square, tok::l_brace},
-                                           TT_TableGenMultiLineString)) {
-    // Set again with finalizing. This must never be annotated as other types.
-    Tokens.back()->setFinalizedType(TT_TableGenMultiLineString);
-    Tokens.back()->Tok.setKind(tok::string_literal);
-    return;
+  if (Style.isTableGen()) {
+    // TableGen's Multi line string starts with [{
+    if (tryMergeTokens({tok::l_square, tok::l_brace},
+                       TT_TableGenMultiLineString)) {
+      // Set again with finalizing. This must never be annotated as other types.
+      Tokens.back()->setFinalizedType(TT_TableGenMultiLineString);
+      Tokens.back()->Tok.setKind(tok::string_literal);
+      return;
+    }
+    // TableGen's bang operator is the form !<name>.
+    // !cond is a special case with specific syntax.
+    if (tryMergeTokens({tok::exclaim, tok::identifier},
+                       TT_TableGenBangOperator)) {
+      Tokens.back()->Tok.setKind(tok::identifier);
+      Tokens.back()->Tok.setIdentifierInfo(nullptr);
+      if (Tokens.back()->TokenText == "!cond")
+        Tokens.back()->setFinalizedType(TT_TableGenCondOperator);
+      else
+        Tokens.back()->setFinalizedType(TT_TableGenBangOperator);
+      return;
+    }
+    if (tryMergeTokens({tok::exclaim, tok::kw_if}, TT_TableGenBangOperator)) {
+      // Here, "! if" becomes "!if".  That is, ! captures if even when the space
+      // exists. That is only one possibility in TableGen's syntax.
+      Tokens.back()->Tok.setKind(tok::identifier);
+      Tokens.back()->Tok.setIdentifierInfo(nullptr);
+      Tokens.back()->setFinalizedType(TT_TableGenBangOperator);
+      return;
+    }
+    // +, - with numbers are literals. Not unary operators.
+    if (tryMergeTokens({tok::plus, tok::numeric_constant}, TT_Unknown)) {
+      Tokens.back()->Tok.setKind(tok::numeric_constant);
+      return;
----------------
hnakamura5 wrote:

https://llvm.org/docs/TableGen/ProgRef.html#values-and-expressions

As far as I can tell from the manual, TableGen does not have `+` as an infix binary operator.
And as the warning in the manual notes, `-` is lexed as part of the integer literal rather than as an infix operator in ranges and slices.
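
To illustrate the point, here is a minimal standalone sketch of the merge. It does not use clang-format's classes: `Kind`, `Token` and `mergeSignedNumbers` are hypothetical names made up for this example, and merging only when no whitespace separates the sign from the digits is a simplifying assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified token model; this is not clang-format's FormatToken.
enum class Kind { Plus, Minus, Number, Other };

struct Token {
  Kind K;
  std::string Text;
  bool SpaceBefore; // true if whitespace precedes this token
};

// Fold a '+' or '-' that is immediately followed by a number into a single
// numeric literal, so that later stages never see it as a unary operator.
// Assumption of this sketch: pairs separated by whitespace are left alone.
std::vector<Token> mergeSignedNumbers(const std::vector<Token> &In) {
  std::vector<Token> Out;
  for (std::size_t I = 0; I < In.size(); ++I) {
    const Token &T = In[I];
    bool IsSign = T.K == Kind::Plus || T.K == Kind::Minus;
    bool NextIsAdjacentNumber = I + 1 < In.size() &&
                                In[I + 1].K == Kind::Number &&
                                !In[I + 1].SpaceBefore;
    if (IsSign && NextIsAdjacentNumber) {
      // The merged token keeps the whitespace state of the sign.
      Out.push_back({Kind::Number, T.Text + In[I + 1].Text, T.SpaceBefore});
      ++I; // skip the number that was just consumed
      continue;
    }
    Out.push_back(T);
  }
  // Merging can only shrink the token stream.
  assert(Out.size() <= In.size());
  return Out;
}
```

With this sketch, the token sequence `1`, `-`, `3` (as in a range written `1-3`) comes out as the two numeric tokens `1` and `-3`, which matches how the manual describes the lexing of `-`.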

> could we build a better set of FormatTableGen unit tests to ensure we don't cause any regressions?

I agree, and there is actually a comprehensive set of unit tests for TableGen's syntax here. (Even this set may be missing real examples of TableGen usage, such as target definitions, MLIR and so on.)
https://github.com/llvm/llvm-project/pull/76059/files#diff-2ce45a84684fe19d813e79bab2f732809f3544d38f344e3d2cfe23aa9216a1c8

The current pull request was split out of that PR. I'm wondering when to add the tests, because for now it only recognizes tokens and cannot yet format many parts of that syntax.


https://github.com/llvm/llvm-project/pull/78996

