[PATCH] D65043: [Format] Add C++20 standard to style options

Brian Gesiak via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jul 24 10:05:33 PDT 2019


modocache marked 3 inline comments as done.
modocache added a comment.

> It sounds like currently they're very different, and you're proposing to make them basically the same. I think that's a good thing.

I 100% agree with this statement, and thank you @sammccall for the suggestion to move in this direction.

> The obvious problem here is that this may well break existing projects that use Cpp11 and C++14/17 features. I think we can punt on this by making Cpp11/14 still enable 14/17 in LangOpts for now (so we don't change behavior, just give people a better way to specify their intent) and change it later if needed.

I like this idea. `LS_Cpp14` and `LS_Cpp17` are new options that we'll introduce and so clang-format should interpret them very precisely, whereas `LS_Cpp11` (for now) will remain a little ambiguous. Hopefully some thorough documentation will prevent some user confusion here.

The codebase I'm applying formatting to uses `-std=c++17` and `-fcoroutines-ts` to compile, but specifies `LS_Cpp11` in its .clang-format file -- as you mentioned, the intent was to simply use "new" C++ formatting, for whatever clang-format decided "new" to mean. I'd like to reason "out loud" about what changes we can make to clang-format, and what my codebase's migration path would look like should we make those changes. Based on the discussion above I'm thinking we make these commits/changes to clang-format itself:

1. Add the `enum LanguageStandard` @sammccall suggested above verbatim, thereby allowing users to specify an arbitrary standard such as `LS_Cpp17`. To split up the work for easier code review, we might want to consider just updating the .clang-format parsing to recognize `"Cpp17"`, but internally still have clang-format treat it as `LS_Cpp11`. This is because step (2) is a bit of a doozy.
2. Audit the existing uses of `FormatStyle::LS_Cpp{03,11,Auto}` in the clang-format codebase and modify them to instead check for the appropriate standard. We might want this work to land as several small commits.
  1. For example, formatting that should only be applied to C++17 should be modified, and instead of checking for `LS_Cpp11`, it should check for `LS_Cpp17 || LS_Cpp11` (`LS_Cpp11` is still included in the check because we're using that to mean "new"). Commits with changes like this can land without any impact to existing users.
  2. Formatting that should only be applied to C++20 should be modified to check for `LS_Cpp20` //instead of// `LS_Cpp11`. These commits //will// change end behavior for users. In my codebase, for example, we would probably want to update our .clang-format, from `"Cpp11"` to `"Cpp20"`, before we update to a clang-format that includes these commits, because otherwise our `co_yield` keywords will begin to get formatted incorrectly.
3. Change .clang-format parsing to recognize the new enum cases like `"Cpp17"` as `LS_Cpp17`, not `LS_Cpp11` (see item 1).
4. Commit the `CoroutinesTS` option from D65044 <https://reviews.llvm.org/D65044>. My codebase would switch from `LS_Cpp20` to `LS_Cpp17` and `CoroutinesTS` and almost no formatting should change.

If the above series of patches sounds reasonable, I wouldn't mind working on some or all of them, so let me know!



================
Comment at: clang/docs/ClangFormatStyleOptions.rst:2227
+  * ``LS_Cpp20`` (in configuration: ``Cpp20``)
+    Use features of C++20 and C++2a (e.g.: treating ``co_yield`` as a keyword,
+    not an identifier, so ``co_yield++ i`` is formatted as ``co_yield ++i``).
----------------
Quuxplusone wrote:
> C++2a //will// be C++20, barring any radically unforeseen events. So saying "C++20 and C++2a" is redundant. Personally I would follow GCC/Clang's lead and say "C++2a" until the standard is actually out.
Thank you, I'll do that. I wasn't sure about the naming conventions. While we're talking about this, mind if I ask: why did "C++1z" use "z" whereas "C++2a" use "a"? Once C++20 is actually out, is the next version "C++2b", or does "C++2a" start referring to C++23? Sorry for the rookie questions.


================
Comment at: clang/include/clang/Format/Format.h:1878
     LS_Cpp11,
+    /// Use features of C++20 and C++2a (e.g.: treating ``co_yield`` as a
+    /// keyword, not an identifier, so ``co_yield++ i`` is formatted as
----------------
Quuxplusone wrote:
> Again, C++2a is likely a synonym for C++20.
> Three lines earlier, you might want to change "C++1z" to "C++17" (and grep the codebase for other instances of "1z").
Cool, will do, thank you!


================
Comment at: clang/unittests/Format/FormatTest.cpp:3721
+      "        aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) <= >\n"
+      "    5) {\n"
       "}");
----------------
Quuxplusone wrote:
> This doesn't seem to test what you set out to test, does it? `(x) <= > 5` isn't a valid C++ expression anyway. Maybe what you want to test here is that clang-format is willing to reformat pre-C++2a code
> 
>     LongName<&LongerName::operator<=> x;
> 
> into
> 
>     LongName<
>         &LongerName::operator<=
>     > x;
> 
> (that is, that it's willing to insert a line break between `<=` and `>`). However, if clang-format //refused// to insert a line break in that one position even in C++11 mode, would anything of value really be lost?
Absolutely right, I thought of this test not as a signal of what I want to happen, but rather to demonstrate what does happen. I'll either update this test to make that clear, or add a test like the one you suggested instead.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65043/new/

https://reviews.llvm.org/D65043





More information about the cfe-commits mailing list