[PATCH] D88299: [clang-format] Add MacroUnexpander.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 3 07:56:18 PST 2021


sammccall added a comment.

Thanks a lot for all the time explaining, I get this at least at a high level now.
I have trouble following the ins and outs of the algorithm, but I think I understand most of it and tests cover the rest.

Regarding tests: I have trouble reasoning about whether all cases are tested. I wonder if we can easily throw this against (internal) coverage + mutation testing infrastructure? I'm not a massive believer in that stuff usually, but this seems like a good candidate.

Lots of comments around making this easier for confused people like me to understand, partly because I don't understand it well enough to suggest more substantive changes.
Throughout, I think it'd be nice to be explicit about:

- which structure tokens/lines are part of: spelled/expanded/unexpanded. (Including getting rid of input/output/incoming/original, which AFAICT are synonyms for these)
- which lines are complete, and which are partial (being added to as we go)

It took me a while to understand the control flow and to tell "who's driving" - I think it was hard for me to see that processNextUnexpanded was basically a leaf, and that unexpand/startUnexpansion/endUnexpansion/continueUnexpansionUntil were the layer above, and add() was the top. Maybe there's naming that would clarify this better, but I don't have great suggestions, and I'm not sure if this is a problem other people would have.
Maybe an example trace showing the internal method calls when parsing a small example might help... but again, not sure.



================
Comment at: clang/lib/Format/MacroUnexpander.cpp:52
+// the token, its parent and whether it is the first token in the line.
+void MacroUnexpander::forEachToken(
+    const UnwrappedLine &Line,
----------------
this doesn't use any state, right?

it could be static or even private to the impl file.

replacing std::function with a template param allow this loop to be specialized for the one callsite - up to you, maybe it doesn't matter much, but it's not very invasive


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:66
+
+// Unexand \p Token, given its parent \p OriginalParent in the incoming
+// unwrapped line. \p First specifies whether it is the first token in a given
----------------
I find using all of spelled/expanded/unexpanded, input/incoming/output/outgoing, and original, unclear.
(Particularly after OriginalParent is propagated around by that name without comment, and because "original" refers to the expanded structure, which is *not* the first one created)
Can we call this "ExpandedParent in the expanded unwrapped line"?


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:66
+
+// Unexand \p Token, given its parent \p OriginalParent in the incoming
+// unwrapped line. \p First specifies whether it is the first token in a given
----------------
sammccall wrote:
> I find using all of spelled/expanded/unexpanded, input/incoming/output/outgoing, and original, unclear.
> (Particularly after OriginalParent is propagated around by that name without comment, and because "original" refers to the expanded structure, which is *not* the first one created)
> Can we call this "ExpandedParent in the expanded unwrapped line"?
Unexand -> unexpand


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:77
+  // stream, we need to continue the unexpansion until we find the right token
+  // if it is part of the unexpanded token stream.
+  // Note that hidden tokens can be part of the unexpanded stream in nested
----------------
I can't work out what "it" refers to in this sentence.

(and "spelled" token stream?)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:80
+  // macro calls.
+  // For example, given: BRACED(a) {a}
+  // And the call: BRACED(BRACED(x))
----------------
(in these examples throughout, #define BRACED(a) {a} would make it clearer that this is a macro definition and not code)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:82
+  // And the call: BRACED(BRACED(x))
+  // The outer macro call will be BRACED({a}), and the hidden tokens '{' and
+  // '}' can be found in the unexpanded macro stream of that level.
----------------
It would be helpful to complete the example by spelling out which token you're adding, which the correct parent is, and which tokens you need to "expand over" to make it available.

I *think* the answer to the first two is - when you're adding the `a` then its proper parent is the inner `(` in `BRACED(BRACED(`... but I don't know the last part.


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:84
+  // '}' can be found in the unexpanded macro stream of that level.
+  if (!Unexpanded.empty() && Token->MacroCtx &&
+      (Token->MacroCtx->Role != MR_Hidden ||
----------------
is it possible that you may need to unexpand more than the innermost macro?

e.g. BRACED(ID(ID(BRACED(ID(ID(a)))))) expands to {{a}} and the parents of `a` and inner `{` each come from a couple of macro-levels up.

(I don't totally understand the logic here, so the answer's probably "no"...)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:103
+
+// Ensures that:
+// Lines.back() is the line that has \p OriginalParent or its unexpanded
----------------
I had trouble understanding the what this function does at a high level: i.e. *why* you'd want this.

Maybe:
```
Adjusts the stack of active (unexpanded) lines so we're ready to push tokens.
The tokens to be pushed are children of ExpandedParent (in the expanded code).
This may entail:
 - creating a new line, if the parent is on the active line
 - popping active lines, if the parent is further up the stack
```

and s/First/ForceNewLine/ to avoid documenting it?

(impl looks good though once I understood what it does!)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:156
+FormatToken *MacroUnexpander::getParentInOutput(FormatToken *Parent) {
+  auto I = TokenToParentInOutput.find(Parent);
+  if (I == TokenToParentInOutput.end())
----------------
nit: s/find/lookup/, then you only have to deal with pointers and this becomes a bit easier to read IMO


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:173
+  assert(Token->MacroCtx);
+  // A single token can be the only result of a macro call:
+  // Given: ID(x, y) ;
----------------
This raises another point: a macro can have an empty body.
AFAICT this fundamentally isn't supported here, as we're driven by the expanded token stream.
I guess it makes sense to handle this elsewhere in clang-format (or even not handle it) but it should be documented somewhere.


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:237
+  // Start unexpansion for all calls for which this token is the first token
+  // generated by the call.
+  for (size_t I = Unexpanded.size(); I < Token->MacroCtx->ExpandedFrom.size();
----------------
assert that this number is equal to StartOfExpansion?


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:238
+  // generated by the call.
+  for (size_t I = Unexpanded.size(); I < Token->MacroCtx->ExpandedFrom.size();
+       ++I) {
----------------
nit: index arithmetic obscures what's going on a bit.
You could write

```
ArrayRef<FormatToken *> StartedMacros = makeArrayRef(Token->MacroCtx->ExpandedFrom).drop_front(Unexpanded.size());
for (FormatToken *ID : llvm::reverse(StartedMacros)) {
...
}
```
but up to you

It's not obvious to me *why* we're iterating in reverse order BTW: i.e. why the order of the `Unexpanded` stack is opposite the order of the `ExpandedFrom` stack.
So maybe just a comment to reinforce that, like (// ExpandedFrom is outside-in order, we want inside-out)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:263
+// given \p Token.
+bool MacroUnexpander::continueUnexpansionUntil(FormatToken *Token) {
+  assert(!Unexpanded.empty());
----------------
maybe unexpandActiveMacroUntil or so?

Something to hint that this stops at the end of the top-of-stack macro.

(and to avoid the term "unexpansion stream" which isn't used/defined anywhere else)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:269
+  // skip it.
+  for (; Unexpanded.back().I != Unexpanded.back().End &&
+         Unexpanded.back().I->Tok != Token;) {
----------------
nit: while


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:273
+  }
+  assert(Unexpanded.back().I == Unexpanded.back().End ||
+         Unexpanded.back().I->Tok == Token);
----------------
nit: this assert is just the opposite of the while condition, drop it?


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:282
+         (Unexpanded.size() >= Token->MacroCtx->EndOfExpansion));
+  if (!Token->MacroCtx)
+    return;
----------------
nit: Token->MacroCtx is checked at the callsite, and startUnexpansion asserts it as a precondition - do that here too for symmetry?


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:303
+      // the comma and the remaining tokens in the macro call will potentially
+      // end up in the line when we finishe the expansion.
+      // FIXME: Add the information which arguments are unused, and assert
----------------
finishe -> finish


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:310
+    // Expand the closing parenthesis, if it exists, including an optional
+    // trailing comment.
+    for (auto T = Unexpanded.back().I; T != Unexpanded.back().End; ++T) {
----------------
I can't work out if this is supposed to say comma or comment :-)

If comma - is that a thing?
If comment - why would a comment be considered part of the unexpansion of a macro invocation, rather than a sibling to the macro? How do we know what trails is a comment - should we have an assertion?


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:319
+// If visible, add the next token of the unexpanded token sequence to the
+// output.
+bool MacroUnexpander::processNextUnexpanded() {
----------------
describe the return value, it's not obvious


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:322
+  FormatToken *Token = Unexpanded.back().I->Tok;
+  // llvm::dbgs() << "ProcessNext: " << Token->TokenText << "\n";
+  ++Unexpanded.back().I;
----------------
want to keep this?


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:345
+  // unexpanded call.
+  if (!Token->MacroCtx && Token->isOneOf(tok::comma, tok::r_paren) &&
+      !MacroCallStructure.empty()) {
----------------
nit: easier to see the three cases being handled independently (comma, rparen, lparen) if they're all siblings instead of grouping two together.


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:348
+    if (Token->is(tok::comma)) {
+      TokenToParentInOutput
+          [MacroCallStructure.back().Line->Tokens.back()->Tok] = Token;
----------------
This line is both subtle and cryptic :-).
```
// New unwrapped-lines from unexpansion are normally "chained" - treated as
// children of the last token of the previous line. 
// Redirect them to be treated as children of the comma instead.
// (only affects lines added before we push more tokens into MacroStructure.Line)
```


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:366
+  }
+  if (!Token->MacroCtx && Token->is(tok::l_paren)) {
+    MacroCallStructure.push_back(
----------------
This feels like a stupid question, but how do we know that this non-macro-parenthesis has anything to do with macros?
(Fairly sure the answer is "processNextUnexpanded() is only called when the token is macro-related", it'd be nice to have this precondition spelled out somewhere)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:367
+  if (!Token->MacroCtx && Token->is(tok::l_paren)) {
+    MacroCallStructure.push_back(
+        {currentLine(), parentLine().Tokens.back()->Tok, Token});
----------------
please assign to the struct fields or use /*Foo=*/ comments

All the data flow around this struct is local but the order of fields isn't :-)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:369
+        {currentLine(), parentLine().Tokens.back()->Tok, Token});
+    TokenToParentInOutput[MacroCallStructure.back().RedirectParentFrom] = Token;
+    pushToken(Token);
----------------
nit: = MacroCallStructure.back().RedirectParentTo
This line + line 361 are the keys to understanding what the RedirectParentFrom/To do, so it'd be helpful for them to explicitly use those variables.

(or use the source expressions for both... but in that case the fields may need docs)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:391
+    return;
+  assert(State == InProgress && Unexpanded.empty());
+  State = Finalized;
----------------
nit: finished()


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:396
+  // the the toplevel null token.
+  assert(Output.Tokens.size() == 1 && !Output.Tokens.front()->Children.empty());
+
----------------
nit: pull out a named reference for Output.Tokens.front() after the size assertion.

assert(NullToken.is(tok::null) && !NullToken.children.empty()) may even obviate the need for a comment :-)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:398
+
+  // The first line becomes the top level line in the resuling unwrapped line.
+  auto *I = Output.Tokens.front()->Children.begin();
----------------
resulting


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:406
+  for (auto *E = Output.Tokens.front()->Children.end(); I != E; ++I) {
+    if ((*I)->Tokens.empty())
+      continue;
----------------
hmm, when is this possible? are these literal blank lines? where do they come from?


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:409
+
+    // Every subsequent line will become a child of the last token in the
+    // previous line, which is the token prior to the first token in the line.
----------------
Isn't this just the end of line from the previous iteration of this loop? Why the global map populated by pushToken?
(I feel like i'm missing something obvious here)


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:411
+    // previous line, which is the token prior to the first token in the line.
+    auto L = TokenToPrevious.find((*I)->Tokens.front()->Tok);
+    assert(L != TokenToPrevious.end());
----------------
nit: again .lookup()


================
Comment at: clang/lib/Format/MacroUnexpander.cpp:423
+
+void MacroUnexpander::pushToken(FormatToken *Token, Line *L) {
+  L = L ? L : currentLine();
----------------
nit: i'd consider calling this appendToken, since lots of the stuff in this class deals with stacks but this doesn't)

(I say a lot of nasty things about java but ArrayList.add is 1000x better name than vector::push_back)


================
Comment at: clang/lib/Format/Macros.h:28
 /// the spelled token stream into unwrapped lines that best resemble the
 /// structure of the expanded unwrapped lines.
 ///
----------------
This would be a good place to explicitly introduce the name "unexpanded" for what comes out of the unexpander.

This para gives names to the token streams, but not as clearly to the UnwrappedLines parsed out of them.

I think the fact that the tokens *alias* between the streams/lines, and so the final formatting of the expanded lines "writes through" tokentype etc to the unexpanded lines, is an important design point worth emphasizing.

(This part is mostly structure around "what happens", with the data sets secondary - I think I'd personally find the reverse easier to follow but YMMV)


================
Comment at: clang/lib/Format/Macros.h:30
 ///
 /// When formatting, clang-format formats the expanded unwrapped lines first,
 /// determining the token types. Next, it formats the spelled unwrapped lines,
----------------
would s/formats/annotates/ be inaccurate?

This is just my poor understanding of the code, but it wasn't obvious to me that annotation is closely associated with formatting and not with parsing UnwrappedLines.


================
Comment at: clang/lib/Format/Macros.h:135
+/// Converts a stream of unwrapped lines of expanded tokens into unwrapped lines
+/// containing the tokens of the macro call in a way that the resulting
+/// unwrapped lines of the macro call best resemble the split of unwrapped lines
----------------
I know it's the common case, but I think saying "the macro call" here is misleading, because it quickly becomes apparent reading the code that the scope *isn't* one macro call, and (at least for me) it's easy to get hung up on not understanding what the scope is. (AIUI the scope is actually one UL of *output*... so the use of plural there is also confusing).

I think a  escription could be something like:

Converts a sequence of UnwrappedLines containing expanded macros into a single UnwrappedLine containing the macro calls.
This UnwrappedLine may be broken into child lines, in a way that best conveys the structure of the expanded code.
...
In the simplest case, a spelled UnwrappedLine contains one macro, and after expanding it we have one expanded UnwrappedLine.
In general, macro expansions can span UnwrappedLines, and multiple macros can contribute tokens to the same line.
We keep consuming expanded lines until:
 - all expansions that started have finished (we're not chopping any macros in half)
 - *and* we've reached the end of a *spelled* unwrapped line
A single UnwrappedLine represents this chunk of code.

After this point, the state of the spelled/expanded stream is "in sync" (both at the start of an UnwrappedLine, with no macros open), so the Unexpander can be thrown away and parsing can continue.

(and then launch into an example)



================
Comment at: clang/lib/Format/Macros.h:163
+  MacroUnexpander(unsigned Level,
+                  const std::map<FormatToken *, std::unique_ptr<UnwrappedLine>>
+                      &Unexpanded);
----------------
any reason for std::map rather than DenseMap, here and elsewhere? (Only good reasons i'm aware of are sorting and pointer stability)


================
Comment at: clang/lib/Format/Macros.h:174
+  /// that needs to be processed to finish an macro call.
+  bool finished() { return State == InProgress && Unexpanded.empty(); }
+
----------------
const? or do we not care


================
Comment at: clang/lib/Format/Macros.h:174
+  /// that needs to be processed to finish an macro call.
+  bool finished() { return State == InProgress && Unexpanded.empty(); }
+
----------------
sammccall wrote:
> const? or do we not care
Maybe comment that when finished() is true, it's possible to call getResult() and stop processing... but also valid to continue calling addLine(), if this isn't a good place to stop.


================
Comment at: clang/lib/Format/Macros.h:182
+  /// that for multiple top-level lines, each subsequent line will be the
+  /// child of the last token in its predecessor.
+  /// If a token in a macro argument is a child of a token in the expansion,
----------------
Maybe a note like "this representation is chosen because it can be opaque to the UnwrappedLineParser, but the Formatter treats it appropriately" or something.
I think it should be clear that this representation isn't really "natural" at this layer (or if it is, why).
Maybe an example would help.


================
Comment at: clang/lib/Format/Macros.h:189
+  /// formatted line "int x;" and "int y;" would both be new separate lines (at
+  /// different indentation levels). In the result, "int x;" will be a child of
+  /// the opening parenthesis in "C(" and "int y;" will be a child of the ","
----------------
ASCII art of some sort would help :-)


================
Comment at: clang/lib/Format/Macros.h:192
+  /// token.
+  UnwrappedLine getResult();
+
----------------
nit: giving `getResult()` a side-effect but also making it idempotent is a bit clever/confusing.

Either exposing `void finalize();` + `UnwrappedLine get() const`, or `UnwrappedLine takeResult() &&`, seems a little more obvious.


================
Comment at: clang/lib/Format/Macros.h:218
+  enum UnexpanderState {
+    Start,      // No macro expansion was found in the input yet.
+    InProgress, // During a macro unexpansion.
----------------
you have this "no expansions are open, but we already didn't find any" state.
The effect of this is that finished() returns false so the caller keeps looping.

But a correct caller will never rely on this:
 - the first line a caller feeds must have macro tokens in it, or our output will be garbage AFAICT
 - calling getResult() without feeding in any lines is definitely not correct

It seems we could rather assert on these two conditions, and eliminate the Start/InProgress distinction.
That way incorrect usage is an assertion crash, rather than turning into an infinite loop.


================
Comment at: clang/lib/Format/Macros.h:220
+    InProgress, // During a macro unexpansion.
+    Finalized,  // Past macro unexpansion, the result is finalized.
+  };
----------------
similarly, the InProgress/Finalized distinction would be eliminated by making `takeResult()` destructive, and requiring (through the type system or an assert) that it be called only once.
It doesn't seem that it needs to be part of the NDEBUG runtime control flow.


================
Comment at: clang/lib/Format/Macros.h:249
+  // indent depending on the semantic structure, which is not desired.
+  Line Output;
+
----------------
This is very closely related to what you return from "getResult" - not quite the same, but Output vs Result doesn't seem to hint at the difference. Could we use the same name for both?


================
Comment at: clang/lib/Format/Macros.h:253
+  // token is the parent token for that line.
+  llvm::SmallVector<Line *, 4> Lines;
+
----------------
ActiveUnexpandedLines?

(Line is very overloaded here)


================
Comment at: clang/lib/Format/Macros.h:278
+  // once we're past the comma in the unexpansion.
+  llvm::DenseMap<FormatToken *, FormatToken *> TokenToParentInOutput;
+
----------------
ParentInExpandedToParentInUnexpanded?

(current name implies that it maps a token to *its* parent. It also uses the input/output names, rather than expanded/unexpanded - it would be nice to be consistent)


================
Comment at: clang/lib/Format/Macros.h:286
+    // Our current position in the unexpansion.
+    std::list<UnwrappedLineNode>::iterator I;
+    // The end of the unexpanded token sequence.
----------------
consider calling these NextSpelled and EndSpelled to be really explicit?
Since the type doesn't really clarify which sequence is being referred to.


================
Comment at: clang/lib/Format/Macros.h:291
+
+  // Stack of unexpanded macro calls for which we're in the middle
+  // of an expansion.
----------------
I think this is a confusing use of "unexpanded".

These macros that we're in the process of unexpanding. So the past tense doesn't seem right, they don't seem more "unexpanded" than they do "expanded", at least to me.

Maybe ActiveExpansions or so?


================
Comment at: clang/lib/Format/Macros.h:314
+  // \- )
+  llvm::SmallVector<MacroCallState, 4> MacroCallStructure;
+
----------------
nit: if you're going to specify SmallVector sizes, I don't understand why you'd size Unexpanded vs MacroCallStructure differently - they're going to be the same size, right?

These days you can leave off the size entirely though, and have it pick a value


================
Comment at: clang/lib/Format/Macros.h:142
+/// When getting the formatted lines of the expansion via the \c addLine method:
+/// -> class A {
+/// -> public:
----------------
klimek wrote:
> klimek wrote:
> > sammccall wrote:
> > > sammccall wrote:
> > > > I'm a bit confused by these arrows. It doesn't seem that they each point to an unwrappedline passed to addLine?
> > > This example didn't really help me understand the interface of this class, it seems to be a special case:
> > >  - the input is a single block construct (rather than e.g. a whole program), but it's not clear why that's the case.
> > >  - the output (unexpanded form) consists of exactly a macro call with no leading/trailing tokens, which isn't true in general
> > > 
> > > If the idea is to provide as input the minimal range of lines that span the macro, we should say so somewhere. But I would like to understand why we're not simply processing the whole file.
> > Why not? That is the intention? Note that high-level we do not pass class definitions in one unwrapped line; if they ever go into a single line it's done in the end by a special merging pass (yes, that's a bit confusing).
> Re: input is a single construct - that is not the case (see other comment)
> Re: leading / trailing tokens: I didn't want to make it too complex in the example.
> Re: input is a single construct - that is not the case (see other comment)

A class is definitely a single construct :-) It sounds like that's not significant to the MacroUnexpander though, so it feels like a red herring to me.

> Re: leading / trailing tokens: I didn't want to make it too complex in the example.
That seems fine, I think the complexities of the general case need to be mentioned somewhere because the API reflects them. But you're right, the primary example should be simple.
I think a tricky example (maybe the `#define M ; x++` one?) could be given on one of addLine/finish/getResult maybe.



================
Comment at: clang/lib/Format/Macros.h:154
+/// -> })
+class MacroUnexpander {
+public:
----------------
klimek wrote:
> sammccall wrote:
> > I get the symmetry between the expander/unexpander classes, but the name is making it harder for me to follow the code.
> >  - the extra compound+negation in the name makes it hard/slow to understand, as I'm always thinking first about expansion
> >  - the fact that expansion/unexpansion are not actually opposites completely breaks my intuition. It also creates two different meanings of "unexpanded" that led me down the garden path a bit (e.g. in the test).
> > 
> > Would you consider `MacroCollapser`? It's artificial, but expand/collapse are well-known antonyms without being the same word.
> > 
> > (Incidentally, I just realized this is why I find "UnwrappedLine" so slippery - the ambiguity between "this isn't wrapped" and "this was unwrapped")
> Happy to rename, but first wanted to check in - I use "unexpanded token stream" quite often to refer to the macro call. Perhaps we should also find different wording for that then?
> 
> Perhaps we should call this MacroLineMatcher btw? This is not creating anything new - the resulting token sequence is the "unexpanded token sequence" with the exact same tokens, the special thing is that they're matched into unwrapped lines.
I think unexpanded/unexpander are reasonable names, having understood this better, but with caveats.

It's important to distinguish between the pre-expansion state ("spelled"?) and the post-unexpansion state ("unexpanded?").
The UnwrappedLines are vitally different, but the token stream is the same as you point out.

When referring to the token stream, I think "spelled" is probably less confusing (that's where the token stream fundamentally comes from), and explicitly mentioning somewhere that the token sequence encoded by the unexpanded lines is the same original spelled stream.


================
Comment at: clang/unittests/Format/MacroUnexpanderTest.cpp:202
+
+TEST_F(MacroUnexpanderTest, Identifier) {
+  auto Macros = createExpander({"X=x"});
----------------
There's no test for a macro that expands to nothing - is this supported?


================
Comment at: clang/unittests/Format/MacroUnexpanderTest.cpp:541
+  Matcher U(Call, Lex);
+  EXPECT_THAT(Unexp.getResult(), matchesLine(line(U.consume("X(a, b, c)"))));
+}
----------------
nit: Result


================
Comment at: clang/unittests/Format/MacroUnexpanderTest.cpp:207
+  Unexp.addLine(line(E.consume(lex("};"))));
+  EXPECT_TRUE(Unexp.finished());
+  Matcher U(Call);
----------------
klimek wrote:
> sammccall wrote:
> > you have lots of assertions that this is true, but none that it is false
> Yeah, that's a white-box assertion - finished() is false the vast majority of the time, so testing the true cases is the important part - happy to add tests if you think it's worth it.
Yes - a basic test that it's not always true would be useful I think (maybe the `#define M ;x++` case would be useful for showing the expected loop and finished() values)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88299/new/

https://reviews.llvm.org/D88299



More information about the cfe-commits mailing list