[libcxx-commits] [PATCH] D125606: [libc++][format] Improve string formatters

Mark de Wever via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Jun 16 11:58:34 PDT 2022

Mordante marked 3 inline comments as done.
Mordante added a comment.

@ldionne I've the suggested changes, but before putting it up for review again I want to see whether these changes impact other work-in-progress code. (Not merge conflicts, but incompatibilities.)

Comment at: libcxx/include/__format/parser_std_format_spec.h:1390-1391
+/// They default to false so when a new field is added it needs to be opted in
+/// explicitly. That way the behaviour of user-defined formatters won't change.
+struct __fields {
ldionne wrote:
> I am not sure I understand this comment.
As discussed, this is a left-over of an experiment. Removed the unneeded part.

Comment at: libcxx/include/__format/parser_std_format_spec.h:1403-1407
+// By not placing this constant in the formatter class it's not duplicated for
+// char and wchar_t.
+inline constexpr __fields __fields_string{
+    .__precision_ = true,
ldionne wrote:
> It seems to me like it might be cleaner to define it inside `formatter_string.h` (closer to its point of use), but this is just a suggestion, if you've been through these changes and feel like this is the right way to group things, go with it.
I prefer to keep it here. That way it's easy to see which options are used for a formatter. The other formatters will also get their specific flags here. I expect to look more often at the flags when working on the format specs then when working on the formatters.

Comment at: libcxx/include/__format/parser_std_format_spec.h:1459-1463
+  __alignment __alignment_ : 3;
+  __sign __sign_ : 2;
+  bool __alternate_form_ : 1;
+  bool __locale_specific_form_ : 1;
+  __type __type_ : 5;
ldionne wrote:
> I think it's nice that this is really tailor-made to fit into as little space as possible, however I do have a concern that it might not be robust against future changes. For example, this code would break if we added a (admittedly large) number of elements to the `__type` enumeration. I'm not sure what would make me feel more easy about this -- perhaps simply adding some sort of `static_assert` to make sure the code stays in sync would be enough?
We currently have 18 members in the enum. I'm only aware of two papers intending to add a field
- http://wg21.link/P2510 "Formatting pointers" adds an upper case for pointer
- http://wg21.link/P2286 "Formatting Ranges" adds a debug output for chars and strings

This would leave use with 12 unused values. I can add one bit, which would give us 44 values to use.

I've added a `static_assert` for the entire struct.

Comment at: libcxx/include/__format/parser_std_format_spec.h:1465-1466
+  bool __weekday_name_ : 1;
+  bool __month_name_ : 1;
ldionne wrote:
> I find it a bit confusing that we're jamming these chrono specific fields here. I think we should either have a separate `__chrono_parsed_specifications` with just the fields it needs, or make this some sort of `union { <things-needed-by-std-format-spec>; <things-needed-by-chrono-format-spec> }`.
I added a union. These two unions share `__alignment __alignment_ : 3;` since that field is shared between the two formatters. This means we can access a member of a not active member. However this is valid per http://eel.is/c++draft/class.union#def:active,union_member

    In a union, a non-static data member is active if its name refers to an object
    whose lifetime has begun and has not ended ([basic.life]).  At most one of the
    non-static data members of an object of union type can be active at any time,
    that is, the value of at most one of the non-static data members can be stored
    in a union at any time.

    [Note 1: One special guarantee is made in order to simplify the use of unions:
    If a standard-layout union contains several standard-layout structs that share
    a common initial sequence ([class.mem]), and if a non-static data member of an
    object of this standard-layout union type is active and is one of the
    standard-layout structs, it is permitted to inspect the common initial sequence
    of any of the standard-layout struct members; see [class.mem].  — end note]
Clearly this is intended to put the tag of a tagged union at the start, but we can use that for our purposes.

Comment at: libcxx/include/__format/parser_std_format_spec.h:1468
+  alignas(4) _CharT __fill_;
ldionne wrote:
> Do you think that makes sense? Otherwise, perhaps a comment is enough. Basically, it would be nice not to leave a raw hard-coded number unexplained, for newbies like me :-)
I moved this field after the __precision_. The next field __width_ is an int32_t and has the wanted alignment and avoids using magic values.

Comment at: libcxx/include/__format/parser_std_format_spec.h:1553-1556
+  // LWG 3576 will probably change this to always accept a Unicode code point
+  // To avoid changing the size with that change align the field so when it
+  // becomes 32-bit its alignment will remain the same. That also means the
+  // size will remain the same. (D2572 addresses the solution for LWG 3576.)
ldionne wrote:
> Perhaps we could centralize this `4` value with this comment, and then reuse it here and above where I left a similar review comment?
I moved this field after the `__precision_`. The next field `__width_` is an `int32_t` and has the wanted alignment and avoids using magic values.

  rG LLVM Github Monorepo



More information about the libcxx-commits mailing list