[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Feb 23 03:33:59 PST 2023


sammccall added a comment.

Thanks, this all looks good, at least as far as I understand it! The two-passes-over-the-whole-file approach is easier for me to follow, too.

Mostly just comment nits, but I have one annoying question left about using macros with the wrong arity - I thought this was just an implementation limitation, but it sounds like you want it to be a feature, so I'd like to nail it down a little...



================
Comment at: clang/include/clang/Format/Format.h:2750
+  ///
+  /// Code will be parsed with macros expanded, and formatting will try to best
+  /// match the structure of the expanded call.
----------------
I find "try to match the structure of the expanded call" a bit hard to follow from a user perspective.

Maybe "Code will be parsed with macros expanded, in order to determine how to interpret and format the macro arguments"?


================
Comment at: clang/include/clang/Format/Format.h:2753
+  ///
+  /// For example, with the macro "A(x)=x", the code
+  /// \code
----------------
This is a good example. I think you should spell out both sides, and probably start with the simpler case (no macro definition) to motivate the problem.

```
For example, the code:
  A(a*b);
will usually be interpreted as a call to a function A, and the multiplication expression will be formatted as `a * b`.

If we specify the macro definition:
  Macros:
    - A(x)=x

the code will now be parsed as a declaration of the variable b of type a*,
and formatted as `a* b` (depending on pointer-binding rules).
```


================
Comment at: clang/include/clang/Format/Format.h:2757
+  /// \endcode
+  /// will be formatted as a declaration of the variable \c b of type \c A*
+  /// (depending on pointer-binding rules)
----------------
type a*, not A*


================
Comment at: clang/include/clang/Format/Format.h:2762
+  /// \endcode
+  /// instead of as multiplication.
+  std::vector<std::string> Macros;
----------------
I think the restrictions should be spelled out here: object-like and function-like macros are both supported, macro args should be used exactly once, macros should not reference other macros.


================
Comment at: clang/lib/Format/UnwrappedLineParser.cpp:225
+      }
+      Callback.finishRun();
+    }
----------------
klimek wrote:
> sammccall wrote:
> > I'm sure you got this right, but it's hard for me to evaluate this code because I don't know what `UnwrappedLineConsumer::finishRun()` is for - it's not documented and the implementation is mysterious.
> > 
> > You might consider adding a comment to that interface/method, it's not really related to this change though.
> Done. Not sure I've put too much info about how it's going to be used into the interface, where it doesn't really belong.
Thanks, comment seems just right to me.


================
Comment at: clang/lib/Format/UnwrappedLineParser.h:88
+/// - for different combinations of #if blocks
+/// - for code where macros are expanded and the code with the original
+///   macro calls.
----------------
this is a bit of a garden path sentence: `for code where ((macros are expanded) and (the code with (the original macro calls) ... ERR_MISSING_PREDICATE))`

maybe `when macros are involved, for the expanded code and the as-written code`?


================
Comment at: clang/lib/Format/UnwrappedLineParser.h:260
+  // was expanded from a macro call.
+  bool containsExpansion(const UnwrappedLine &Line);
+
----------------
nit: should these added functions be const?

(maybe this class doesn't bother with const - computePPHash is const though)


================
Comment at: clang/unittests/Format/FormatTest.cpp:22584
+  Style.Macros.push_back("CALL(x)=f([] { x })");
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a = (b) || (c)");
+
----------------
klimek wrote:
> sammccall wrote:
> > If this is assign_or_return, the treatment of "c" is fairly strange. Also you are mostly calling this with only two args. Maybe drop C and use a different macro for the 3-arg case?
> ASSIGN_OR_RETURN allows 3 args, though; this is basically the thing I'd propose to configure for ASSIGN_OR_RETURN in general; is there anything specific you don't like about this?
OK, this is a bit sticky.
 - `ASSIGN_OR_RETURN` is almost always called with two args
 - some versions of `ASSIGN_OR_RETURN` support an optional third arg (there's no canonical public version)
 - these emulate overloading using variadic macro tricks
 - this patch doesn't claim to support either variadic macros or overloading

So the principled options seem to be:
 - macros have fixed arity: clang-format can support the two-arg version of `ASSIGN_OR_RETURN` but not the three-arg version. (This is what I was assuming)
 - macros have variable arity: the public docs need to describe how passing too many/too few args is treated, because this isn't obvious and it sounds like we want people to rely on it. This feels like stretching "principled" to me - we're relying on the expansion still parsing correctly when args are missing, which basically seems like coincidence.
 - macros support overloading: instead of looking up macros by name, the key is `(string Name, optional<int> Arity)`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170



More information about the cfe-commits mailing list