[PATCH] D83296: [clang-format] Add a MacroExpander.

Manuel Klimek via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat Sep 19 09:23:25 PDT 2020


klimek added inline comments.


================
Comment at: clang/lib/Format/FormatToken.h:177
+/// EndOfExpansion:   0  0 0       2    1 0 1
+struct MacroContext {
+  MacroContext(MacroRole Role) : Role(Role) {}
----------------
sammccall wrote:
> "context" is often pretty vague - "MacroSource" isn't a brilliant name but at least seems to hint at the direction (that the FormatToken is the expanded token and the MacroSource gives information about what it was expanded from)
> 
> I don't feel strongly about this though, up to you.
MacroSource sounds like it is about the macro source (i.e. the tokens written for the macro).
I'd be happy to rename to MacroExpansion or MacroExpansionInfo or somesuch if you think that helps?


================
Comment at: clang/lib/Format/MacroExpander.cpp:81
+      nextToken();
+    }
+    if (Current->isNot(tok::r_paren))
----------------
sammccall wrote:
> this accepts `FOO(A,B,)=...` as equivalent to `FOO(A,B)=...`. Not sure if worth fixing.
We're generally accepting too much; I'd either want to restrict it fully, or basically be somewhat minimum/forgiving. Given that we can't get errors back to the user, I was aiming for the latter.


================
Comment at: clang/lib/Format/MacroExpander.cpp:88
+
+  bool parseExpansion() {
+    if (!Current->isOneOf(tok::equal, tok::eof))
----------------
sammccall wrote:
> (nit: I'd probably find this easier to follow as `if (equal) else if (eof) else` with parseTail inlined, but up to you)
I basically like having the implementation match the BNR.
That said, not feeling strongly about it. You're saying you'd duplicate the Def.Body.push_back in the if (eof)?

  if (Current->is(tok::equal) {
    nextToken();
    // inline parseTail
  } else if (Current->is(tok::eof) {
    Def.Body.push_back(Current);
  } else {
    return false;
  }

Generally, I personally find it easier to read the early exit.    


================
Comment at: clang/lib/Format/MacroExpander.cpp:142
+      auto Definition = Parser.parse();
+      Definitions[Definition.Name] = Definition;
+    }
----------------
sammccall wrote:
> nit: this is a copy for what seems like no reason - move `Parser.parse()` inline to this line?
Reason is that we need the name.


================
Comment at: clang/lib/Format/MacroExpander.cpp:186
+        Tok->MacroCtx = MacroContext(MR_ExpandedArg);
+      pushToken(Tok);
+    }
----------------
klimek wrote:
> sammccall wrote:
> > klimek wrote:
> > > sammccall wrote:
> > > > you're pushing here without copying. This means the original tokens from the ArgsList are mutated. Maybe we own them, but this seems at least wrong for multiple expansion of the same arg.
> > > > 
> > > > e.g.
> > > > ```
> > > > #define M(X,Y) X Y X
> > > > M(1,2)
> > > > ```
> > > > 
> > > > Will expand to:
> > > > - 1, ExpandedArg, ExpandedFrom = [M, M] // should just be one M
> > > > - 2, ExpandedArg, ExpandedFrom = [M]
> > > > - 1, ExpandedArg, ExpandedFrom = [M, M] // this is the same token pointer as the first one
> > > > 
> > > > Maybe it would be better if pushToken performed the copy, and returned a mutable pointer to the copy. (If you can make the input const, that would prevent this type of bug)
> > > Ugh. I'll need to take a deeper look, but generally, the problem is we don't want to copy - we're mutating the data of the token while formatting the expanded token stream, and then re-use that info when formatting the original stream.
> > > We could copy, add a reference to the original token, and then have a step that adapts in the end, and perhaps that's cleaner overall anyway, but will be quite a change.
> > > The alternative is that I'll look into how to specifically handle double-expansion (or ... forbid it).
> > > (or ... forbid it).
> > 
> > I'm starting to think this is the best option.
> > 
> > The downsides seem pretty acceptable to me:
> >  - it's another wart to document: on the other hand it simplifies the conceptual model, I think it helps users understand the deeper behavior
> >  - some macros require simplification rather than supplying the actual definition: already crossed this bridge by not supporting macros in macro bodies, variadics, pasting...
> >  - loses information: one expansion is enough to establish which part of the grammar the arguments form in realistic cases. (Even in pathological cases, preserving the conflicting info only helps you if you have a plan to resolve the conflicts)
> >  - it's another wart to document: 
> > 
> > Are there any others?
> > 
> My main concern is that it's probably the most surprising feature to not support.
Forbade multi-expansion.


================
Comment at: clang/lib/Format/Macros.h:82
+///
+/// Furthermore, expansion and definition are independent - a macro defined
+/// as "A()=a" and "A=a" can both be expanded  as "A()" and "A".
----------------
sammccall wrote:
> Is this saying that the functionlike vs objectlike distiction is not preserved?
> 
> This doesn't seem safe (unless the *caller* is required to retain this information).
> e.g. 
> ```
> #define NUMBER int
> using Calculator = NUMBER(); // does expansion consume () or not?
> ```
Fixed. I thought we'd get away without it, but it's simple enough to fix and we have enough suprises as is.


================
Comment at: clang/lib/Format/Macros.h:86
+public:
+  using ArgsList = llvm::ArrayRef<llvm::SmallVector<FormatToken *, 8>>;
+
----------------
sammccall wrote:
> (Seems a little odd that these pointers to external FormatTokens aren't const... I can believe there's a reason though)
We modify the tokens by adding the macro context.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296



More information about the cfe-commits mailing list