[libcxx-commits] [PATCH] D146398: [libcxx] Fix using std::wcout/wcin on Windows with streams configured in wide mode
Martin Storsjö via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Thu Apr 27 05:16:29 PDT 2023
mstorsjo added a comment.
In D146398#4285271 <https://reviews.llvm.org/D146398#4285271>, @tahonermann wrote:
> In D146398#4282999 <https://reviews.llvm.org/D146398#4282999>, @mstorsjo wrote:
>
>> In D146398#4281670 <https://reviews.llvm.org/D146398#4281670>, @tahonermann wrote:
>>
>>> Hmm, is it really standards conforming to not use the `std::codecvt<wchar_t, char, std::mbstate_t` locale facet for `std::wcout` and friends? We can make an argument that a program that contains a call to `_setmode()` doesn't require conformance, but in the absence of such a call, I would think that we are required to use the imbued facet. As is, with these changes, if a user were to imbue their own facet, it would get ignored. Do you know how the Microsoft or libstdcxx implementations handle that?
>>
>> So with that in mind, I guess we should try to keep the current codepaths that do conversions but either avoid hitting them or keeping the original wchar form around if `noconv` is returned... I guess I should sit down with the MS STL implementation and see if I can trace what codepaths it ends up using for e.g. wcout.
>
> I would expect that `noconv` will never be true when the internal and external character types differ since, at a minimum, a (potentially narrowing) copy has to be performed.
I dug into this; the main difference seems to be this:
- In STL, there's a `basic_filebuf` initialized here: https://github.com/microsoft/STL/blob/6d2f8b0ed88ea6cba26cc2151f47f678442c1663/stl/inc/fstream#L174-L176 Which calls `_Init`: https://github.com/microsoft/STL/blob/6d2f8b0ed88ea6cba26cc2151f47f678442c1663/stl/inc/fstream#L703-L728 Which sets `_Pcvt` to null. So by default there's no conversion set up. If the user calls `imbue`, https://github.com/microsoft/STL/blob/6d2f8b0ed88ea6cba26cc2151f47f678442c1663/stl/inc/fstream#L698-L701, it does set up `_Pcvt`: https://github.com/microsoft/STL/blob/6d2f8b0ed88ea6cba26cc2151f47f678442c1663/stl/inc/fstream#L766-L773
- In libc++, we always set up `__cv_` based on `this->getloc()`: https://github.com/llvm/llvm-project/blob/fe3ed6550ef277f5c47a9ccdbdd466fa225f4a99/libcxx/src/std_stream.h#L251-L258 https://github.com/llvm/llvm-project/blob/fe3ed6550ef277f5c47a9ccdbdd466fa225f4a99/libcxx/src/std_stream.h#L76
So, would a conformant way forward be this?
- Keep doing whatever we're doing now, on Unix platforms
- On Windows, don't imbue `this->getloc()` by default, but default to `always_noconv`, but keep the conversion codepaths alive. If the user calls `imbue()`, we'd activate the `__cv_` converter
- Make sure that we have a valid fastpath for `always_noconv` that preserves data in `char_type` without munging it via a `char` buffer, for both input and output? (We do have some such fastpaths, but they need to be tweaked to run e.g. `fputwc` repeatedly instead of doing one large `fwrite`.)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D146398/new/
https://reviews.llvm.org/D146398
More information about the libcxx-commits
mailing list