[libcxx-commits] [PATCH] D64818: [libc++] Implement missing filesystem::path constructors with locale
Tom Honermann via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Wed Nov 1 13:17:34 PDT 2023
tahonermann requested changes to this revision.
tahonermann added inline comments.
This revision now requires changes to proceed.
================
Comment at: libcxx/include/__locale:1566-1579
+ while (__nb < __ne && __r != codecvt_base::error) {
+ static const int __sz = 32;
+ _InternT __buf[__sz];
+ _InternT* __bn;
+ const char* __nn = __nb;
+ __r = __cvt.in(__mb, __nb, __ne - __nb > __sz ? __nb + __sz : __ne, __nn, __buf, __buf + __sz, __bn);
+ if (__r == codecvt_base::error || __nn == __nb)
----------------
This can end up in an infinite loop when the narrow input string ends in a partial code unit sequence. In that case, `partial` will be returned, but `from_next` (aka `__nn`) will be left pointing to the beginning of the partial sequence such that the next iteration will encounter the same partial result.
See https://godbolt.org/z/3q5xaG5dW for an example. The test there reliably exhibits an infinite loop for `std::codecvt<char32_t, char, std::mbstate_t>`. For the `std::codecvt<wchar_t, char, std::mbstate_t>` facet, the infinite loop is avoided because the converter packs partial state into `std::mbstate_t` and then (erroneously) returns `ok` instead of `partial` (erroneously because no character is ever translated, but no error is ever issued either).
These appear to be existing bug given that this code was just factored out from other locations.
================
Comment at: libcxx/test/std/input.output/filesystems/class.path/path.member/path.construct/source_and_locale.pass.cpp:129
+ }
+
+ for (auto const& MS : PathList) {
----------------
I think additional tests should be added here to exercise various edge cases like the truncated partial code unit sequence in the godbolt link I added in another comment. We should also exercise some MxN sequence conversions (e.g., UTF-8 to UTF-16). That can be done by calling `.u16string()` on the resulting `path` object and comparing what it returns to expectations.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D64818/new/
https://reviews.llvm.org/D64818
More information about the libcxx-commits
mailing list