[libcxx-commits] [PATCH] D103413: [libc++][format] Implement Unicode support.

Louis Dionne via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Wed Sep 22 12:55:11 PDT 2021


ldionne requested changes to this revision.
ldionne added inline comments.
This revision now requires changes to proceed.


================
Comment at: libcxx/benchmarks/CMakeLists.txt:148-149
           LINK_FLAGS "${BENCHMARK_TEST_LIBCXX_LINK_FLAGS}"
-          CXX_STANDARD 17
-          CXX_STANDARD_REQUIRED YES
+          CXX_STANDARD 20
+          CXX_STANDARD_REQUIRED NO
           CXX_EXTENSIONS NO)
----------------
Can we make those changes as part of another review? Also, let's set `CXX_STANDARD_REQUIRED YES` instead -- all supported compilers should support C++20 now.


================
Comment at: libcxx/benchmarks/std_format_spec_string_unicode.bench.cpp:7
+
+// TODO FMT Remove once libc++ requires C++20 support.
+#if __cplusplus > 201703L && !defined(_LIBCPP_HAS_NO_UNICODE)
----------------
I don't understand this `TODO`. We do require C++20 support.


================
Comment at: libcxx/benchmarks/std_format_spec_string_unicode.bench.cpp:17-19
+#ifdef _LIBCPP_HAS_NO_UNICODE
+#error The benchmark requires Unicode support enabled.
+#endif
----------------
I'm not sure this is useful -- if the library doesn't support unicode, something in the benchmark is going to fail anyway. It would be nice to remove as many libc++ specific things from the benchmark as possible.


================
Comment at: libcxx/benchmarks/std_format_spec_string_unicode.bench.cpp:21
+
+// Always enable asserts since they are used compile-time not run-time.
+#ifdef NDEBUG
----------------
The benchmarks should be compiled without assertions anyway -- it seems pretty hacky to undefine this here, no?


================
Comment at: libcxx/benchmarks/std_format_spec_string_unicode.bench.cpp:27
+
+using namespace std::__format_spec;
+
----------------
If we're going to do something like this, can we instead do `namespace format_spec = std::__format_spec` or something similar?


================
Comment at: libcxx/benchmarks/std_format_spec_string_unicode.bench.cpp:84
+  static_assert(
+      sizeof(
+          "Victor jagt zwölf Boxkämpfer quer über den großen Sylter Deich") ==
----------------
This line break is really odd.


================
Comment at: libcxx/include/__format/parser_std_format_spec.h:758
+
+/**
+ * Unicode column width estimates.
----------------
Just a general comment -- we generally try to use C++ style comments (`//`) elsewhere in the library. I don't have the heart to ask you to change them cause it's such a nitpick, but please consider using that style going forward for consistency.


================
Comment at: libcxx/include/__format/parser_std_format_spec.h:809
+template <class _CharT>
+concept __utf8 = same_as<_CharT, char> || same_as<_CharT, char8_t>;
+
----------------
Would it make sense to name those `__utf8_char` or `__utf8_character` instead? Just `__utf8` seems kind of obtuse to me. (applies below too)


================
Comment at: libcxx/include/__format/parser_std_format_spec.h:876
+ */
+_LIBCPP_HIDE_FROM_ABI constexpr int __column_width(uint32_t __c) noexcept {
+  if (__c < 0x1'0000)
----------------
This needs to be `inline` since it's not a template and it's in a header. This certainly applies elsewhere too, please take a look. Otherwise, those are ODR violations waiting to happen (arguably those ODR violations are benign, but still).


================
Comment at: libcxx/test/libcxx/utilities/format/format.string/format.string.std/std_format_spec_string_unicode.pass.cpp:1
+//===----------------------------------------------------------------------===//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
----------------
I would suggest splitting this into two files and using the `libcpp-has-no-unicode` Lit feature to enable/disable them. That sounds a bit cleaner since essentially the whole test is different depending on whether unicode is supported. Your call, this is not a blocking comment.


================
Comment at: libcxx/test/libcxx/utilities/format/format.string/format.string.std/std_format_spec_string_unicode.pass.cpp:50
+
+#ifndef _LIBCPP_HAS_NO_UNICODE
+template <class CharT>
----------------
It just occurred to me that I am confused by the existence of both `_LIBCPP_HAS_NO_UNICODE` and `_LIBCPP_HAS_NO_UNICODE_CHARS` -- can you shed some light on this if it's fresh in your mind?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103413/new/

https://reviews.llvm.org/D103413



More information about the libcxx-commits mailing list