[libcxx-commits] [PATCH] D125606: [libc++][format] Improve string formatters

Sun Jun 5 08:38:00 PDT 2022

vitaut added a comment.

Looks great! My main suggestion is to rename `__parser` into something more appropriate (see an inline comment) and possibly separate the actual parsing logic from the parser output.

================
Comment at: libcxx/include/__format/formatter_output.h:79
+/// \param __out_it    The output iterator to write to.
+/// \param __parser    The parsed formatting output settings.
+/// \param __size      The (estimated) output column width. When the elements
----------------
Normally parser is something that does parsing. "The parsed formatting output settings" sounds more like format specs so I'd suggest renaming accordingly and making parser return these specs.

================
Comment at: libcxx/include/__format/formatter_output.h:118-131
+  auto __last = __format_spec::__detail::__estimate_column_width_fast(__str.begin(), __str.end());
+  ptrdiff_t __size = __last - __str.begin();
+  if (__size >= __parser.__width_)
+    return _VSTD::copy(__str.begin(), __str.end(), _VSTD::move(__out_it));
+
+  // Is there a non Unicode part?
+  if (__last != __str.end()) {
----------------
Not specific to this diff but I think it would be cleaner to fold Unicode/non-Unicode handling into `__estimate_column_width` and have a single check and copy here.

================
Comment at: libcxx/include/__format/formatter_string.h:35
+template <class _CharT>
+_LIBCPP_HIDE_FROM_ABI auto __format_string_view(basic_string_view<_CharT> __str,
+                                                output_iterator<const _CharT&> auto __out_it,
----------------
What is this function for? Why not use `__write_unicode` directly?

================
Comment at: libcxx/include/__format/formatter_string.h:81
     // TODO FMT Implement these improvements.
-    if (this->__has_width_field() || this->__has_precision_field())
-      return _Base::format(__str, __ctx);
+    if (_Base::__parser_.__width_as_arg_ || _Base::__parser_.__precision_as_arg_ || _Base::__parser_.__has_width() ||
+        _Base::__parser_.__has_precision())
----------------
Premature optimization: you could use negative values to indicate "no width/precision" and avoid extra checks here.

================
Comment at: libcxx/include/__format/parser_std_format_spec.h:1528
+  bool __alternate_form_ : 1 {false};
+  bool __zero_padding_ : 1 {false};
+  bool __locale_specific_form_ : 1 {false};
----------------
Another option is to fold this into fill/alignment (this would require an additional alignment enumerator making it larger). Not suggesting to change, just something to consider.

================
Comment at: libcxx/include/__format/parser_std_format_spec.h:1689
+    case 'A':
+      __type_ = __type::__float_hexadecimal_upper_case;
+      break;
----------------
nit: maybe replace `float_hexadecimal` with `hexfloat` for consistency with the naming in the standard (e.g. https://en.cppreference.com/w/cpp/io/manip/fixed)? The other names seem to be already consistent.

================
Comment at: libcxx/include/__format/parser_std_format_spec.h:1750-1751
+    __width_ = __format_spec::__substitute_arg_id(__arg);
+    if (__width_ == 0)
+      __throw_format_error("A format-spec width field replacement should have a positive value");
+  }
----------------
This is a weird check. I think we should open an LWG issue to allow zero dynamic width and remove this unnecessary check.

================
Comment at: libcxx/include/__format/parser_std_format_spec.h:1757
+
+    __precision_as_arg_ = false;
+    __precision_ = __format_spec::__substitute_arg_id(__arg);
----------------
nit: why assign `false` here but 0 to `__width_as_arg_` in `__substitute_width_arg_id` above? Make this consistent?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125606/new/

https://reviews.llvm.org/D125606