[libcxx-commits] [PATCH] D146398: [libcxx] Fix using std::wcout/wcin on Windows with streams configured in wide mode

Tom Honermann via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Fri May 12 17:58:53 PDT 2023


tahonermann added inline comments.


================
Comment at: libcxx/docs/UsingLibcxx.rst:534-538
+If the mode of the standard streams ``stdout``, ``stderr`` or ``stdin`` is
+changed to Unicode mode with ``_setmode()``, all interaction with those streams
+must happen in Unicode mode. This means that ``std::cout``, ``std::cerr`` or
+``std::cin`` respectively can't be used at that point, only ``std::wcout``,
+``std::wcerr`` or ``std::wcin``.
----------------
Microsoft's [[ https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=msvc-170 | documentation for `_setmode()` ]] states that using a narrow print function on a Unicode-mode stream results in an assertion failure (when linked with a debug C run-time library), and that is the case for `printf()` and `putc()`. However, it isn't the case for Microsoft's implementation of `std::cout` and friends: they misbehave (the narrow `char` buffer gets interpreted as a sequence of `wchar_t`, which, of course, does nothing useful), but they don't assert. They presumably bypass the C functions that assert, probably by calling `fwrite()` (which doesn't assert).

I'm wondering if we should say more about the ramifications of using the narrow streams in Unicode mode; saying "can't be used" doesn't communicate much. The suggested edit attempts to better explain what happens.
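
For illustration, here is a minimal sketch of the usage pattern the documentation text describes: once `stdout` has been switched to Unicode mode, only the wide stream is used. The helper names `enable_unicode_stdout` and `print_greeting` are hypothetical and not part of the patch; the choice of `_O_U16TEXT` as the Unicode mode is an assumption, and on non-Windows platforms the sketch simply falls back to the narrow stream.

```cpp
#include <cassert>
#include <cstdio>
#include <iostream>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

// Hypothetical helper: switch stdout to UTF-16 Unicode mode on Windows.
// Returns true if the mode was changed, false on failure or on other platforms.
bool enable_unicode_stdout() {
#ifdef _WIN32
    std::fflush(stdout);  // flush pending narrow output before changing the mode
    return _setmode(_fileno(stdout), _O_U16TEXT) != -1;
#else
    return false;
#endif
}

// Hypothetical helper: print a greeting through the stream that matches the
// current mode of stdout, and report whether that stream is still good.
bool print_greeting() {
    if (enable_unicode_stdout()) {
        // stdout is in Unicode mode: only std::wcout may be used from here on.
        std::wcout << L"hello" << std::endl;
        return std::wcout.good();
    }
    // stdout is in the default mode: the narrow stream is fine.
    std::cout << "hello" << std::endl;
    return std::cout.good();
}
```

Calling `print_greeting()` from `main` exercises whichever path applies on the current platform. The sketch deliberately never mixes `std::cout` and `std::wcout` on the same stream after the mode change.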


================
Comment at: libcxx/docs/UsingLibcxx.rst:540-542
+The same goes if a custom locale is imbued on the wide char streams; at that
+point, all actual IO towards the stream is made with regular narrow chars,
+which doesn't work if the respective stream has been set to Unicode mode.
----------------
I wonder if we should say something about `std::ios::sync_with_stdio()` here. Microsoft's implementation requires `sync_with_stdio(false)` for an imbued locale to have any effect. Is libc++ consistent with that requirement? If not, perhaps it should be?
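
As a sketch of the call order in question, assuming the Microsoft behaviour described above (whether libc++ needs the same order is exactly the open question): `sync_with_stdio(false)` is called before any I/O and before imbuing the locale. The helper name `imbue_wide_stdout` is hypothetical.

```cpp
#include <cassert>
#include <iostream>
#include <locale>

// Hypothetical helper: detach the C++ streams from C stdio, then imbue a
// locale on std::wcout. Returns the locale that was previously imbued.
std::locale imbue_wide_stdout(const std::locale& loc) {
    // Should run before any I/O on the standard streams; calling it later
    // has implementation-defined behaviour.
    std::ios::sync_with_stdio(false);
    return std::wcout.imbue(loc);
}
```

Note that, per the documentation text under review, a custom locale on the wide streams makes the actual IO go through narrow chars, so this should not be combined with a stream that has been set to Unicode mode.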


================
Comment at: libcxx/src/std_stream.h:139
+#ifndef _LIBCPP_HAS_NO_WIDE_CHARACTERS
+static bool __do_ungetc(std::wint_t __c, FILE *__fp) {
+    if (ungetwc(__c, __fp) == WEOF)
----------------
mstorsjo wrote:
> I'm a bit unsure about whether this is ok to do; can we generally assume that `std::wint_t` and `int` are distinct different data types? If we'd just pass the raw `char`/`wchar_t` here, we can't access `traits_type::to_int_type` here. Or should we do that and directly use `char_traits<type>::to_int_type` explicitly instead of going via the `traits_type` typedef within the class?
Good catch; I don't think we can depend on them being distinct types. It looks like there are cases where an "int_type" value is passed, so I think we should pass the right type. Here are a few other suggestions to choose from. Since this is pretty isolated code, I don't have strong opinions. Just make sure the intent is obvious in the code, or add a comment otherwise.
- Pass a dummy argument solely for overload resolution.
- Make these a template and pass the character type as a template parameter.
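
A minimal sketch of the second option, assuming explicit specializations per character type; the name `do_ungetc` is a hypothetical stand-in for the patch's `__do_ungetc`. Because the character type is named explicitly, the code no longer depends on `int` and `std::wint_t` being distinct types.

```cpp
#include <cassert>
#include <cstdio>
#include <cwchar>
#include <string>

// Hypothetical stand-in for __do_ungetc: the character type is selected by an
// explicit template argument instead of by overloading on int vs. wint_t.
template <class CharT>
bool do_ungetc(typename std::char_traits<CharT>::int_type c, std::FILE* fp);

// Narrow variant: pushes the character back with ungetc().
template <>
bool do_ungetc<char>(std::char_traits<char>::int_type c, std::FILE* fp) {
    return std::ungetc(c, fp) != EOF;
}

// Wide variant: pushes the character back with ungetwc().
template <>
bool do_ungetc<wchar_t>(std::char_traits<wchar_t>::int_type c, std::FILE* fp) {
    return std::ungetwc(c, fp) != WEOF;
}
```

A call site would then spell out the intent, e.g. `do_ungetc<char_type>(c, fp)`. The dummy-argument option achieves the same selection through overload resolution rather than an explicit template argument.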


================
Comment at: libcxx/src/std_stream.h:327
+static bool __do_fputc(char __c, FILE* __fp) {
+    if (fwrite(&__c, sizeof(__c), 1, __fp) != 1)
+        return false;
----------------



================
Comment at: libcxx/src/std_stream.h:396-398
+    // For wchar_t on Windows, don't do fwrite(), but write characters one
+    // at a time with fputwc; that works both when stdout is in the default
+    // mode and when it is set to unicode mode.
----------------
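For reference, the approach described in the quoted code comment can be sketched as follows. This is an illustration under assumptions, not the patch's code; the helper name `write_wide` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cwchar>

// Hypothetical helper: write n wide characters one at a time with fputwc()
// instead of passing the whole buffer to fwrite().
// Returns false as soon as one character fails to be written.
bool write_wide(const wchar_t* s, std::size_t n, std::FILE* fp) {
    for (std::size_t i = 0; i < n; ++i) {
        if (std::fputwc(s[i], fp) == WEOF)
            return false;
    }
    return true;
}
```

Going through `fputwc()` leaves the conversion to the C runtime, which is what lets the same code path work in both the default mode and Unicode mode according to the quoted comment.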



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146398/new/

https://reviews.llvm.org/D146398


