[libcxx-commits] [PATCH] D134036: [libc++][format] Implements string escaping.
Mark de Wever via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Mon Oct 17 11:04:28 PDT 2022
Mordante marked 6 inline comments as done.
Mordante added a comment.
Thanks for the review @tahonermann. Seems I didn't submit my replies before.
================
Comment at: libcxx/include/__format/formatter_output.h:446
+ switch (__value) {
+ case _CharT('\t'):
+ __str += _LIBCPP_STATICALLY_WIDEN(_CharT, "\\t");
----------------
tahonermann wrote:
> tahonermann wrote:
> > I think these case statements should use `_LIBCPP_STATICALLY_WIDEN` or similar as there is no guarantee that the `wchar_t` value for `\t` is the same as for `char`.
> Any further thoughts on this? Or is there an existing libc++ policy that requires that characters that can be represented in a single code unit in both the ordinary and wide literal encodings must have the same value? If there is, I expect that policy will have to be revisited if `-fexec-charset` is expanded to include additional encodings like IBM-1047 as was proposed in D93031.
Thanks for the additional link. The assumption `_CharT('\t')` that works correctly is hard-coded in several places in libc++ (not limited to the format library). I think it's good to address this, but I prefer to do it in a separate review. I really want to look at how we're going to do that properly and properly test it.
(I expect we need a CI job where we use different encodings for `char` and `wchar_t`.)
================
Comment at: libcxx/include/__format/formatter_output.h:517-519
+ _LIBCPP_ASSERT(__result.__ill_formed_size == 1, "for UTF-16 at most two code units per code point.");
+ for (int __i = 0; __i < __result.__ill_formed_size; __result.__value >>= 16, (void)++__i)
+ __formatter::__write_escape_ill_formed_code_unit(__str, __result.__value & 0xffff);
----------------
tahonermann wrote:
> tahonermann wrote:
> > `__consume_p2286()` has a condition where a `__consume_p2286_result` value is returned with `__ill_formed_size` equal to 2 that would appear to violate the assertion here. The assertion of `1` is also at odds with the comment and the need for the following loop. I suspect this is intended to assert a value of either `1` or `2`. If so, this implies a missing test case.
> The changes made address most of the concern I raised, but if `__ill_formed_size` can only be `1`, then there is no need for the `for` loop.
Thanks!
================
Comment at: libcxx/test/std/utilities/format/format.functions/escaped_output.ascii.pass.cpp:34
+
+auto test_format = []<class CharT, class... Args>(
+ std::basic_string_view<CharT> expected, test_format_string<CharT, Args...> fmt, Args&&... args) {
----------------
Mordante wrote:
> As mentioned in D134031 this should be a function and not a lambda.
Edit that won't work in these kind of tests.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D134036/new/
https://reviews.llvm.org/D134036
More information about the libcxx-commits
mailing list