[clang] [clang-format] Support of TableGen formatting. (PR #76059)

via cfe-commits cfe-commits at lists.llvm.org
Wed Dec 20 06:02:24 PST 2023


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-clang-format

Author: H.Nakamura (hnakamura5)

<details>
<summary>Changes</summary>

Currently, TableGen has its language style but the it does not works well. This patch adds total support of TableGen formatting including the support for the code (multi line string), DAG args, bang operators, the cond operator, and the paste operators.

## Options

### TableGenAllowBreakBeforeInheritColon (Boolean)

Allows break before the colon of inheritance.

```
def Def
    : Parent {};
```

By default this is false.

### TableGenAllowBreakAfterInheritColon (Boolean)

Allows break after the colon of inheritance.

```
def Def :
    Parent {};
```

By default this is true.

### TableGenBreakInsideCondOperator (Boolean)

Insert the line break after each case of !cond operator.

```
  let CondOpe1 = !cond(!eq(size, 1): 1,
                       !eq(size, 16): 1,
                       true: 0);
```

By default this is true.

### TableGenBreakInsideDAGArgList (Boolean)

Insert the line break after each element of DAGArg list.

```
  let DAGArgIns = (ins
      i32:$src1,
      i32:$src2
  );
```

By default this is false.

### TableGenPreferBreakInsideSquareBracket (Boolean)

For whom likes such a style.

```
def Def : Parent<"Def",[
    a, b, c
]> {
  ...
}
```

By default this is false.

### TableGenSpaceAroundDAGArgColon (Boolean)

Insert the space around the colon inside a DAGArg list.

```
  let DAGArgIns = (ins i32 : $src1, i32 : $src2);
```

By default this is false.

### TableGenBreakingDAGArgOperators (List of Strings)

Works only when TableGenBreakInsideDAGArgList is true.
The list needs to be consists of identifiers.
If any identifier is specified, this limits the effect of TableGenBreakInsideDAGArgList only on DAGArgs beginning with the specified identifiers.

For example the configuration,

```
TableGenBreakingDAGArgOperators: ['ins', 'outs']
```

makes the line break only occurs inside DAGArgs beginning with the specified identifiers 'ins' and 'outs'.

```
let DAGArgIns = (ins
    i32:$src1,
    i32:$src2
);

let DAGArgOthers = (others i32:$other1, i32:$other2);

let DAGArgBang = (!cast<SomeType>("Some") i32:$src1, i32:$src2)
```

### AlignConsecutiveTableGenCondOperatorColons

Supports AlignConsecutiveStyle configuration.
Align the colons inside !cond operators.

```
let CondOpe1 = !cond(!eq(size, 1) : 1,
                     !eq(size, 16): 1,
                     true         : 0);
```

### AlignConsecutiveTableGenBreakingDAGArgColons

Supports AlignConsecutiveStyle configuration.
This works only when TableGenBreakInsideDAGArgList is true and the DAGArg is not excepted by TableGenBreakingDAGArgOperators's effect.
Align the colon inside DAGArg which have line break inside.

```
let dagarg = (ins
    a  :$src1```
    aa :$src2
    aaa:$src3
)
```

### AlignConsecutiveTableGenDefinitions

Supports AlignConsecutiveStyle configuration.
This aligns the inherits colons of consecutive definitions.

```
def Def       : Parent {}
def DefDef    : Parent {}
def DefDefDef : Parent {}
```


---

Patch is 68.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76059.diff


12 Files Affected:

- (modified) clang/include/clang/Format/Format.h (+47) 
- (modified) clang/lib/Format/ContinuationIndenter.cpp (+16-2) 
- (modified) clang/lib/Format/Format.cpp (+29) 
- (modified) clang/lib/Format/FormatToken.h (+98) 
- (modified) clang/lib/Format/FormatTokenLexer.cpp (+141) 
- (modified) clang/lib/Format/FormatTokenLexer.h (+6) 
- (modified) clang/lib/Format/TokenAnnotator.cpp (+432-4) 
- (modified) clang/lib/Format/UnwrappedLineParser.cpp (+40-16) 
- (modified) clang/lib/Format/WhitespaceManager.cpp (+27-4) 
- (modified) clang/lib/Format/WhitespaceManager.h (+14) 
- (modified) clang/unittests/Format/FormatTestTableGen.cpp (+307) 
- (modified) clang/unittests/Format/TokenAnnotatorTest.cpp (+50) 


``````````diff
diff --git a/clang/include/clang/Format/Format.h b/clang/include/clang/Format/Format.h
index 8604dea689f937..30a38aed99866e 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -396,6 +396,36 @@ struct FormatStyle {
   /// \version 17
   ShortCaseStatementsAlignmentStyle AlignConsecutiveShortCaseStatements;
 
+  /// Style of aligning consecutive TableGen cond operator colons.
+  /// \code
+  ///   !cond(!eq(size, 1) : 1,
+  ///         !eq(size, 16): 1,
+  ///         true         : 0)
+  /// \endcode
+  /// \version 18
+  AlignConsecutiveStyle AlignConsecutiveTableGenCondOperatorColons;
+
+  /// Style of aligning consecutive TableGen DAGArg operator colons.
+  /// Intended to be used with TableGenBreakInsideDAGArgList
+  /// \code
+  ///   let dagarg = (ins
+  ///       a  :$src1,
+  ///       aa :$src2,
+  ///       aaa:$src3
+  ///   )
+  /// \endcode
+  /// \version 18
+  AlignConsecutiveStyle AlignConsecutiveTableGenBreakingDAGArgColons;
+
+  /// Style of aligning consecutive TableGen def colons.
+  /// \code
+  ///   def Def       : Parent {}
+  ///   def DefDef    : Parent {}
+  ///   def DefDefDef : Parent {}
+  /// \endcode
+  /// \version 18
+  AlignConsecutiveStyle AlignConsecutiveTableGenDefinitions;
+
   /// Different styles for aligning escaped newlines.
   enum EscapedNewlineAlignmentStyle : int8_t {
     /// Don't align escaped newlines.
@@ -3037,6 +3067,7 @@ struct FormatStyle {
   bool isProto() const {
     return Language == LK_Proto || Language == LK_TextProto;
   }
+  bool isTableGen() const { return Language == LK_TableGen; }
 
   /// Language, this format style is targeted at.
   /// \version 3.5
@@ -4656,6 +4687,15 @@ struct FormatStyle {
   /// \version 8
   std::vector<std::string> StatementMacros;
 
+  /// Tablegen
+  bool TableGenAllowBreakBeforeInheritColon;
+  bool TableGenAllowBreakAfterInheritColon;
+  bool TableGenBreakInsideCondOperator;
+  bool TableGenBreakInsideDAGArgList;
+  bool TableGenPreferBreakInsideSquareBracket;
+  bool TableGenSpaceAroundDAGArgColon;
+  std::vector<std::string> TableGenBreakingDAGArgOperators;
+
   /// The number of columns used for tab stops.
   /// \version 3.7
   unsigned TabWidth;
@@ -4753,6 +4793,13 @@ struct FormatStyle {
            AlignConsecutiveMacros == R.AlignConsecutiveMacros &&
            AlignConsecutiveShortCaseStatements ==
                R.AlignConsecutiveShortCaseStatements &&
+           AlignConsecutiveTableGenCondOperatorColons ==
+               R.AlignConsecutiveTableGenCondOperatorColons &&
+           AlignConsecutiveTableGenBreakingDAGArgColons ==
+               R.AlignConsecutiveTableGenBreakingDAGArgColons &&
+           AlignConsecutiveTableGenDefinitions ==
+               R.AlignConsecutiveTableGenDefinitions &&
+           AlignConsecutiveMacros == R.AlignConsecutiveMacros &&
            AlignEscapedNewlines == R.AlignEscapedNewlines &&
            AlignOperands == R.AlignOperands &&
            AlignTrailingComments == R.AlignTrailingComments &&
diff --git a/clang/lib/Format/ContinuationIndenter.cpp b/clang/lib/Format/ContinuationIndenter.cpp
index bd319f21b05f86..176dc7744a5576 100644
--- a/clang/lib/Format/ContinuationIndenter.cpp
+++ b/clang/lib/Format/ContinuationIndenter.cpp
@@ -800,6 +800,7 @@ void ContinuationIndenter::addTokenOnCurrentLine(LineState &State, bool DryRun,
   if (Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign &&
       !CurrentState.IsCSharpGenericTypeConstraint && Previous.opensScope() &&
       Previous.isNot(TT_ObjCMethodExpr) && Previous.isNot(TT_RequiresClause) &&
+      Previous.isNot(TT_TableGenDAGArgOpener) &&
       !(Current.MacroParent && Previous.MacroParent) &&
       (Current.isNot(TT_LineComment) ||
        Previous.isOneOf(BK_BracedInit, TT_VerilogMultiLineListLParen))) {
@@ -1229,7 +1230,7 @@ unsigned ContinuationIndenter::getNewLineColumn(const LineState &State) {
     return CurrentState.Indent;
   }
   if ((Current.isOneOf(tok::r_brace, tok::r_square) ||
-       (Current.is(tok::greater) && Style.isProto())) &&
+       (Current.is(tok::greater) && (Style.isProto() || Style.isTableGen()))) &&
       State.Stack.size() > 1) {
     if (Current.closesBlockOrBlockTypeList(Style))
       return State.Stack[State.Stack.size() - 2].NestedBlockIndent;
@@ -1257,6 +1258,12 @@ unsigned ContinuationIndenter::getNewLineColumn(const LineState &State) {
        Current.Next->isOneOf(tok::semi, tok::kw_const, tok::l_brace))) {
     return State.Stack[State.Stack.size() - 2].LastSpace;
   }
+  // When DAGArg closer exists top of line, it must be aligned in the similar
+  // way as function call above.
+  if (Current.is(tok::r_paren) && State.Stack.size() > 1 &&
+      (Current.isOneOf(TT_TableGenDAGArgCloser, TT_TableGenParamAngleCloser))) {
+    return State.Stack[State.Stack.size() - 2].LastSpace;
+  }
   if (Style.AlignAfterOpenBracket == FormatStyle::BAS_BlockIndent &&
       (Current.is(tok::r_paren) ||
        (Current.is(tok::r_brace) && Current.MatchingParen &&
@@ -1593,6 +1600,9 @@ unsigned ContinuationIndenter::moveStateToNextToken(LineState &State,
     State.StartOfStringLiteral = State.Column + 1;
   if (Current.is(TT_CSharpStringLiteral) && State.StartOfStringLiteral == 0) {
     State.StartOfStringLiteral = State.Column + 1;
+  } else if (Current.is(TT_TableGenMultiLineString) &&
+             State.StartOfStringLiteral == 0) {
+    State.StartOfStringLiteral = State.Column + 1;
   } else if (Current.isStringLiteral() && State.StartOfStringLiteral == 0) {
     State.StartOfStringLiteral = State.Column;
   } else if (!Current.isOneOf(tok::comment, tok::identifier, tok::hash) &&
@@ -1672,7 +1682,9 @@ void ContinuationIndenter::moveStatePastFakeLParens(LineState &State,
         (!Previous || Previous->isNot(tok::kw_return) ||
          (Style.Language != FormatStyle::LK_Java && PrecedenceLevel > 0)) &&
         (Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign ||
-         PrecedenceLevel != prec::Comma || Current.NestingLevel == 0)) {
+         PrecedenceLevel != prec::Comma || Current.NestingLevel == 0) &&
+        (!Style.isTableGen() || !Previous ||
+         !Previous->isNot(TT_TableGenDAGArgListComma))) {
       NewParenState.Indent = std::max(
           std::max(State.Column, NewParenState.Indent), CurrentState.LastSpace);
     }
@@ -1915,6 +1927,8 @@ void ContinuationIndenter::moveStatePastScopeCloser(LineState &State) {
       (Current.isOneOf(tok::r_paren, tok::r_square, TT_TemplateString) ||
        (Current.is(tok::r_brace) && State.NextToken != State.Line->First) ||
        State.NextToken->is(TT_TemplateCloser) ||
+       State.NextToken->is(TT_TableGenParamAngleCloser) ||
+       State.NextToken->is(TT_TableGenListCloser) ||
        (Current.is(tok::greater) && Current.is(TT_DictLiteral)))) {
     State.Stack.pop_back();
   }
diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 28271181e07d0c..9ba3d5dbd530e7 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -926,6 +926,12 @@ template <> struct MappingTraits<FormatStyle> {
     IO.mapOptional("AlignConsecutiveMacros", Style.AlignConsecutiveMacros);
     IO.mapOptional("AlignConsecutiveShortCaseStatements",
                    Style.AlignConsecutiveShortCaseStatements);
+    IO.mapOptional("AlignConsecutiveTableGenCondOperatorColons",
+                   Style.AlignConsecutiveTableGenCondOperatorColons);
+    IO.mapOptional("AlignConsecutiveTableGenBreakingDAGArgColons",
+                   Style.AlignConsecutiveTableGenBreakingDAGArgColons);
+    IO.mapOptional("AlignConsecutiveTableGenDefinitions",
+                   Style.AlignConsecutiveTableGenDefinitions);
     IO.mapOptional("AlignEscapedNewlines", Style.AlignEscapedNewlines);
     IO.mapOptional("AlignOperands", Style.AlignOperands);
     IO.mapOptional("AlignTrailingComments", Style.AlignTrailingComments);
@@ -1124,6 +1130,20 @@ template <> struct MappingTraits<FormatStyle> {
     IO.mapOptional("StatementAttributeLikeMacros",
                    Style.StatementAttributeLikeMacros);
     IO.mapOptional("StatementMacros", Style.StatementMacros);
+    IO.mapOptional("TableGenAllowBreakAfterInheritColon",
+                   Style.TableGenAllowBreakAfterInheritColon);
+    IO.mapOptional("TableGenAllowBreakBeforeInheritColon",
+                   Style.TableGenAllowBreakBeforeInheritColon);
+    IO.mapOptional("TableGenBreakInsideCondOperator",
+                   Style.TableGenBreakInsideCondOperator);
+    IO.mapOptional("TableGenBreakInsideDAGArgList",
+                   Style.TableGenBreakInsideDAGArgList);
+    IO.mapOptional("TableGenPreferBreakInsideSquareBracket",
+                   Style.TableGenPreferBreakInsideSquareBracket);
+    IO.mapOptional("TableGenSpaceAroundDAGArgColon",
+                   Style.TableGenSpaceAroundDAGArgColon);
+    IO.mapOptional("TableGenBreakingDAGArgOperators",
+                   Style.TableGenBreakingDAGArgOperators);
     IO.mapOptional("TabWidth", Style.TabWidth);
     IO.mapOptional("TypeNames", Style.TypeNames);
     IO.mapOptional("TypenameMacros", Style.TypenameMacros);
@@ -1437,6 +1457,9 @@ FormatStyle getLLVMStyle(FormatStyle::LanguageKind Language) {
   LLVMStyle.AlignConsecutiveDeclarations = {};
   LLVMStyle.AlignConsecutiveMacros = {};
   LLVMStyle.AlignConsecutiveShortCaseStatements = {};
+  LLVMStyle.AlignConsecutiveTableGenBreakingDAGArgColons = {};
+  LLVMStyle.AlignConsecutiveTableGenCondOperatorColons = {};
+  LLVMStyle.AlignConsecutiveTableGenDefinitions = {};
   LLVMStyle.AlignTrailingComments = {};
   LLVMStyle.AlignTrailingComments.Kind = FormatStyle::TCAS_Always;
   LLVMStyle.AlignTrailingComments.OverEmptyLines = 0;
@@ -1585,6 +1608,12 @@ FormatStyle getLLVMStyle(FormatStyle::LanguageKind Language) {
   LLVMStyle.StatementAttributeLikeMacros.push_back("Q_EMIT");
   LLVMStyle.StatementMacros.push_back("Q_UNUSED");
   LLVMStyle.StatementMacros.push_back("QT_REQUIRE_VERSION");
+  LLVMStyle.TableGenAllowBreakAfterInheritColon = true;
+  LLVMStyle.TableGenAllowBreakBeforeInheritColon = false;
+  LLVMStyle.TableGenBreakInsideCondOperator = true;
+  LLVMStyle.TableGenBreakInsideDAGArgList = false;
+  LLVMStyle.TableGenPreferBreakInsideSquareBracket = false;
+  LLVMStyle.TableGenSpaceAroundDAGArgColon = false;
   LLVMStyle.TabWidth = 8;
   LLVMStyle.UseTab = FormatStyle::UT_Never;
   LLVMStyle.VerilogBreakBetweenInstancePorts = true;
diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index 3f9664f8f78a3e..bd16567233bb22 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -148,6 +148,23 @@ namespace format {
   TYPE(StructLBrace)                                                           \
   TYPE(StructRBrace)                                                           \
   TYPE(StructuredBindingLSquare)                                               \
+  TYPE(TableGenBangOperator)                                                   \
+  TYPE(TableGenCondOperator)                                                   \
+  TYPE(TableGenCondOperatorColon)                                              \
+  TYPE(TableGenCondOperatorComma)                                              \
+  TYPE(TableGenDAGArgCloser)                                                   \
+  TYPE(TableGenDAGArgListColon)                                                \
+  TYPE(TableGenDAGArgListColonToAlign)                                         \
+  TYPE(TableGenDAGArgListComma)                                                \
+  TYPE(TableGenDAGArgOpener)                                                   \
+  TYPE(TableGenDAGArgOperatorID)                                               \
+  TYPE(TableGenListCloser)                                                     \
+  TYPE(TableGenListOpener)                                                     \
+  TYPE(TableGenMultiLineString)                                                \
+  TYPE(TableGenParamAngleOpener)                                               \
+  TYPE(TableGenParamAngleCloser)                                               \
+  TYPE(TableGenTrailingPasteOperator)                                          \
+  TYPE(TableGenValueSuffix)                                                    \
   TYPE(TemplateCloser)                                                         \
   TYPE(TemplateOpener)                                                         \
   TYPE(TemplateString)                                                         \
@@ -671,6 +688,8 @@ struct FormatToken {
       return true;
     if (is(TT_DictLiteral) && is(tok::less))
       return true;
+    if (is(TT_TableGenParamAngleOpener))
+      return true;
     return isOneOf(tok::l_paren, tok::l_brace, tok::l_square,
                    TT_TemplateOpener);
   }
@@ -681,6 +700,10 @@ struct FormatToken {
       return true;
     if (is(TT_DictLiteral) && is(tok::greater))
       return true;
+    if (is(TT_TableGenParamAngleCloser))
+      return true;
+    if (is(TT_TableGenListCloser))
+      return true;
     return isOneOf(tok::r_paren, tok::r_brace, tok::r_square,
                    TT_TemplateCloser);
   }
@@ -1202,6 +1225,21 @@ struct AdditionalKeywords {
     kw_verilogHashHash = &IdentTable.get("##");
     kw_apostrophe = &IdentTable.get("\'");
 
+    // TableGen keywords
+    kw_bit = &IdentTable.get("bit");
+    kw_bits = &IdentTable.get("bits");
+    kw_code = &IdentTable.get("code");
+    kw_dag = &IdentTable.get("dag");
+    kw_def = &IdentTable.get("def");
+    kw_defm = &IdentTable.get("defm");
+    kw_defset = &IdentTable.get("defset");
+    kw_defvar = &IdentTable.get("defvar");
+    kw_dump = &IdentTable.get("dump");
+    kw_include = &IdentTable.get("include");
+    kw_list = &IdentTable.get("list");
+    kw_multiclass = &IdentTable.get("multiclass");
+    kw_then = &IdentTable.get("then");
+
     // Keep this at the end of the constructor to make sure everything here
     // is
     // already initialized.
@@ -1294,6 +1332,27 @@ struct AdditionalKeywords {
          kw_wildcard,     kw_wire,
          kw_with,         kw_wor,
          kw_verilogHash,  kw_verilogHashHash});
+
+    TableGenExtraKeywords = std::unordered_set<IdentifierInfo *>({
+        kw_assert,
+        kw_bit,
+        kw_bits,
+        kw_code,
+        kw_dag,
+        kw_def,
+        kw_defm,
+        kw_defset,
+        kw_defvar,
+        kw_dump,
+        kw_foreach,
+        kw_in,
+        kw_include,
+        kw_let,
+        kw_list,
+        kw_multiclass,
+        kw_string,
+        kw_then,
+    });
   }
 
   // Context sensitive keywords.
@@ -1539,6 +1598,21 @@ struct AdditionalKeywords {
   // Symbols in Verilog that don't exist in C++.
   IdentifierInfo *kw_apostrophe;
 
+  // TableGen keywords
+  IdentifierInfo *kw_bit;
+  IdentifierInfo *kw_bits;
+  IdentifierInfo *kw_code;
+  IdentifierInfo *kw_dag;
+  IdentifierInfo *kw_def;
+  IdentifierInfo *kw_defm;
+  IdentifierInfo *kw_defset;
+  IdentifierInfo *kw_defvar;
+  IdentifierInfo *kw_dump;
+  IdentifierInfo *kw_include;
+  IdentifierInfo *kw_list;
+  IdentifierInfo *kw_multiclass;
+  IdentifierInfo *kw_then;
+
   /// Returns \c true if \p Tok is a keyword or an identifier.
   bool isWordLike(const FormatToken &Tok) const {
     // getIdentifierinfo returns non-null for keywords as well as identifiers.
@@ -1811,6 +1885,27 @@ struct AdditionalKeywords {
     }
   }
 
+  bool isTableGenDefinition(const FormatToken &Tok) const {
+    return Tok.isOneOf(kw_def, kw_defm, kw_defset, kw_defvar, kw_multiclass,
+                       kw_let, tok::kw_class);
+  }
+
+  bool isTableGenKeyword(const FormatToken &Tok) const {
+    switch (Tok.Tok.getKind()) {
+    case tok::kw_class:
+    case tok::kw_else:
+    case tok::kw_false:
+    case tok::kw_if:
+    case tok::kw_int:
+    case tok::kw_true:
+      return true;
+    default:
+      return Tok.is(tok::identifier) &&
+             TableGenExtraKeywords.find(Tok.Tok.getIdentifierInfo()) !=
+                 TableGenExtraKeywords.end();
+    }
+  }
+
 private:
   /// The JavaScript keywords beyond the C++ keyword set.
   std::unordered_set<IdentifierInfo *> JsExtraKeywords;
@@ -1820,6 +1915,9 @@ struct AdditionalKeywords {
 
   /// The Verilog keywords beyond the C++ keyword set.
   std::unordered_set<IdentifierInfo *> VerilogExtraKeywords;
+
+  /// The TableGen keywords beyond the C++ keyword set.
+  std::unordered_set<IdentifierInfo *> TableGenExtraKeywords;
 };
 
 inline bool isLineComment(const FormatToken &FormatTok) {
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index 61430282c6f88c..8c2961ff1cbefa 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -93,6 +93,10 @@ ArrayRef<FormatToken *> FormatTokenLexer::lex() {
       // string literals are correctly identified.
       handleCSharpVerbatimAndInterpolatedStrings();
     }
+    if (Style.isTableGen()) {
+      handleTableGenMultilineString();
+      handleTableGenNumericLikeIdentifier();
+    }
     if (Tokens.back()->NewlinesBefore > 0 || Tokens.back()->IsMultiline)
       FirstInLineIndex = Tokens.size() - 1;
   } while (Tokens.back()->isNot(tok::eof));
@@ -272,6 +276,38 @@ void FormatTokenLexer::tryMergePreviousTokens() {
       return;
     }
   }
+  if (Style.isTableGen()) {
+    if (tryMergeTokens({tok::l_square, tok::l_brace},
+                       TT_TableGenMultiLineString)) {
+      Tokens.back()->Tok.setKind(tok::string_literal);
+      return;
+    }
+    if (Tokens.size() > 1 && Tokens.end()[-2]->is(tok::exclaim)) {
+      if (Tokens.back()->is(tok::identifier) ||
+          (Tokens.back()->is(tok::kw_if) && Tokens.back())) {
+      }
+    }
+    if (tryMergeTokens({tok::exclaim, tok::identifier},
+                       TT_TableGenBangOperator)) {
+      Tokens.back()->Tok.setKind(tok::identifier);
+      if (Tokens.back()->TokenText == "!cond")
+        Tokens.back()->setType(TT_TableGenCondOperator);
+      return;
+    }
+    if (tryMergeTokens({tok::exclaim, tok::kw_if}, TT_TableGenBangOperator)) {
+      // Here, "! if" becomes "!if". ! captures if even if the space exists.
+      // That is surely is a only one possibility in TableGen's syntax.
+      return;
+    }
+    if (tryMergeTokens({tok::plus, tok::numeric_constant}, TT_Unknown)) {
+      Tokens.back()->Tok.setKind(tok::numeric_constant);
+      return;
+    }
+    if (tryMergeTokens({tok::minus, tok::numeric_constant}, TT_Unknown)) {
+      Tokens.back()->Tok.setKind(tok::numeric_constant);
+      return;
+    }
+  }
 }
 
 bool FormatTokenLexer::tryMergeNSStringLiteral() {
@@ -763,6 +799,108 @@ void FormatTokenLexer::handleCSharpVerbatimAndInterpolatedStrings() {
   resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Offset + 1)));
 }
 
+void FormatTokenLexer::handleTableGenMultilineString() {
+  FormatToken *MultiLineString = Tokens.back();
+  if (MultiLineString->isNot(TT_TableGenMultiLineString)) {
+    // Multi line string starts with [{
+    return;
+  }
+  bool PrevIsRBrace = false;
+  const char *FirstBreak = nullptr;
+  const char *LastBreak = nullptr;
+  const char *Begin = MultiLineString->TokenText.begin();
+  for (const char *Current = Begin, *End = Lex->getBuffer().end();
+       Current != End; ++Current) {
+    if (*Current == ']' && PrevIsRBrace) {
+      // }] is the end of multi line string.
+      if (!FirstBreak)
+        FirstBreak = Current;
+      MultiLineString->TokenText = StringRef(Begin, Current - Begin + 1);
+      MultiLineString->ColumnWidth = encoding::columnWidthWithTabs(
+          StringRef(Begin, FirstBreak - Begin + 1),
+          MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+      if (LastBreak) {
+        MultiLineString->LastLineColumnWidth = encoding::columnWidthWithTabs(
+            StringRef(LastBreak + 1, Current - LastBreak),
+            MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+      }
+      resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Current + 1)));
+      return;
+    }
+    PrevIsRBrace = false;
+    if (*Current == '\n') {
+      MultiLineString->IsMultiline = true;
+      // Assure LastBreak is not equal to FirstBreak.
+      if (!FirstBreak)
+        FirstBreak = Current;
+      LastBreak = Current;
+      continue;
+    }
+    if (*Current == '}') {
+      // Memorize '}'. If ne...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/76059


More information about the cfe-commits mailing list