[PATCH] D121758: [clang-format] Add support for formatting Verilog code

MyDeveloperDay via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Mar 16 02:14:35 PDT 2022


MyDeveloperDay added inline comments.


================
Comment at: clang/lib/Format/ContinuationIndenter.cpp:1373
+  else if (!Current.isOneOf(tok::comment, tok::identifier) &&
+           !Keywords.isPPHash(Current, Style) && !Current.isStringLiteral())
     State.StartOfStringLiteral = 0;
----------------
I'm not a fan of these isPPHash changes as I think it makes the impact on the other code much greater

I think it would be better to handle the backtick case completely separately, in my experience as soon as you try to represent 2 things as one you are in trouble, my developer gut says there is something wrong here even if it works 99% of the time the last 1% will kill you.




================
Comment at: clang/lib/Format/FormatToken.h:723
+    if (ForcedPrecedence != prec::Unknown)
+      return ForcedPrecedence;
     return getBinOpPrecedence(Tok.getKind(), /*GreaterThanIsOperator=*/true,
----------------
Could you show the unit test that covers this, does it impact C++ at all?


================
Comment at: clang/lib/Format/FormatToken.h:737
   /// Returns the next token ignoring comments.
-  LLVM_NODISCARD const FormatToken *getNextNonComment() const {
-    const FormatToken *Tok = Next;
+  LLVM_NODISCARD FormatToken *getNextNonComment() const {
+    FormatToken *Tok = Next;
----------------
unrelated change


================
Comment at: clang/lib/Format/FormatToken.h:942
+  KEYWORD(function, ATTR_JS_KEYWORD | ATTR_CSHARP_KEYWORD |                    \
+                        ATTR_VERILOG_KEYWORD | ATTR_VERILOG_HIER)              \
   KEYWORD(get, ATTR_JS_KEYWORD | ATTR_CSHARP_KEYWORD)                          \
----------------
I feel like you are overloading a different level of classification here as to the "type of function" this just makes this structure now massive and hard to read.


================
Comment at: clang/lib/Format/FormatToken.h:1153
+  KEYWORD(with, ATTR_VERILOG_KEYWORD)                                          \
+  KEYWORD(wor, ATTR_VERILOG_KEYWORD | ATTR_VERILOG_QUALIFIER)
 
----------------
What standard are we following here? for example I notice `noshowcancelled` is not covered..

despite me actually doing some VHDL during my degress, I no nothing about this, just want to understand what we should be covering.

I guess alot of these words are not unit tested right?




================
Comment at: clang/lib/Format/FormatToken.h:1177
+    ATTR_VERILOG_KEYWORD = 0x4,
+    ATTR_VERILOG_PP_DIRECTIVE = 0x8,
+    ATTR_VERILOG_END = 0x10,
----------------
this feels like mixed metaphors as I said above


================
Comment at: clang/lib/Format/FormatToken.h:1192
   IdentifierInfo *kw_internal_ident_after_define;
+  IdentifierInfo *backtick;
+  IdentifierInfo *backtickbacktick;
----------------
I'm kind of wondering why this is needed (can you point me to the unit tests) because doesn't javascript already support this kind of interpolated string?


================
Comment at: clang/lib/Format/FormatToken.h:1324
+  /// Returns whether \p Tok is a Verilog preprocessor directive.
+  bool isVerilogPPDirective(const FormatToken &Tok) const {
+    auto Info = Tok.Tok.getIdentifierInfo();
----------------
Does this need to be VerilogSpecific?


================
Comment at: clang/lib/Format/FormatToken.h:1468
+    return Tok.is(TT_GotoLabelColon) ||
+           (Style.Language == FormatStyle::LK_Verilog &&
+            Tok.is(tok::kw_default) &&
----------------



================
Comment at: clang/lib/Format/FormatTokenLexer.cpp:84
       handleTemplateStrings();
-    }
-    if (Style.Language == FormatStyle::LK_TextProto)
+    } else if (Style.Language == FormatStyle::LK_TextProto) {
       tryParsePythonComment();
----------------
unrelated change commit separately to reduce patch size


================
Comment at: clang/lib/Format/FormatTokenLexer.cpp:471
+  if (Next->isOneOf(tok::l_brace, tok::colon, tok::comment) ||
+      Keywords.isPPHash(*Next, Style))
     return false;
----------------
year really not nice.. its like we can never use tok::hash again!


================
Comment at: clang/lib/Format/FormatTokenLexer.cpp:513
 
+bool FormatTokenLexer::tryMergeTokensAny(
+    ArrayRef<ArrayRef<tok::TokenKind>> Kinds, TokenType NewType) {
----------------
This change could be pull out separately to reduce the size of this patch


================
Comment at: clang/lib/Format/FormatTokenLexer.cpp:929
+/// Count the length of leading whitespace in a token.
+static size_t countLeadingWhitespace(StringRef Text) {
+  // Basically counting the length matched by this regex.
----------------
This is an unrelated change surely... and you point me to some unit tests for this


================
Comment at: clang/lib/Format/FormatTokenLexer.cpp:1175
 
+bool FormatTokenLexer::readRawTokenLanguageSpecific(Token &Tok) {
+  if (Style.Language != FormatStyle::LK_Verilog)
----------------
This is only for Verilog right? I think we should make that clear `readRawTokenVerilogSpecific`


================
Comment at: clang/lib/Format/TokenAnnotator.cpp:736
           CurrentToken->setType(TT_AttributeColon);
-        } else if (Left->isOneOf(TT_ArraySubscriptLSquare,
+        } else if (Style.isCpp() &&
+                   Left->isOneOf(TT_ArraySubscriptLSquare,
----------------
Could this impact any of the non C++ languages like C#


================
Comment at: clang/lib/Format/TokenAnnotator.cpp:1195
     case tok::comma:
-      if (Contexts.back().InCtorInitializer)
+      switch (Contexts.back().ContextType) {
+      case Context::CtorInitializer:
----------------
To ease the pain of such a massive patch you could have rolled this change in first, switching from the if's to a switch is a good change but you could massively reduce the impact of the patch by doing this in phases


================
Comment at: clang/lib/Format/TokenAnnotator.cpp:1416
          CurrentToken->is(Keywords.kw_package)) ||
-        (Info && Info->getPPKeywordID() == tok::pp_import &&
-         CurrentToken->Next &&
+        ((Style.isCpp() || Style.Language == FormatStyle::LK_Proto ||
+          Style.Language == FormatStyle::LK_Java) &&
----------------
Did other languages creep in?


================
Comment at: clang/lib/Format/TokenAnnotator.cpp:1869
+    } else if (Current.isOneOf(tok::exclaim, tok::tilde) &&
+               !((Current.TokenText == "not" || Current.TokenText == "compl") &&
+                 (!Current.getNextNonComment() ||
----------------
is this something verilog specific? what impact does this have on https://en.cppreference.com/w/cpp/keyword/not


================
Comment at: clang/lib/Format/TokenAnnotator.cpp:2001
+  /// Test whether \p Tok is a qualifier like "const".
+  const FormatToken *qualifierToSkipBack(const FormatToken *Tok) {
+    // FIXME: Before I added this function, every language except
----------------
probably could be a separate patch


================
Comment at: clang/lib/Format/TokenAnnotator.h:46
         LeadingEmptyLinesAffected(false), ChildrenAffected(false),
+        IsContinuation(Line.IsContinuation),
         FirstStartColumn(Line.FirstStartColumn) {
----------------
I feel this might be an unrelated change, is this something that needed for something specific, could you point to the unit tests


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121758/new/

https://reviews.llvm.org/D121758



More information about the cfe-commits mailing list