[libcxx-commits] [libcxx] [libcxx] Handle changing of fr_FR locale thousand sep in Windows libcxx tests (PR #83967)

via libcxx-commits libcxx-commits at lists.llvm.org
Mon Mar 4 23:49:32 PST 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-libcxx

Author: Rodrigo Salazar (4-rodrigo-salazar)

<details>
<summary>Changes</summary>

libcxx tests which assume the thousands separator for fr_FR locale is x00A0 on windows currently fail when run on newer versions of windows (It seems to have been updated to the new correct value from 0x00A0 to 0x202F around windows 11. The exact windows version where it changed doesn't seem to be documented anywhere.). 

This PR updates the tests to not hard-code the separator but instead find the host's fr_FR separator to set the right expectation in the tests that assume this. It uses 2 new functions in test/support/locale_helpers.h to do this: `get_locale_thousands_sep` and `get_locale_mon_thousands_sep`.

This fixes 5 tests under windows.

-----

Testing notes:

Tested with toolsets:
- windows llvm-mingw-20231128-ucrt-x86_64  (note, msvcrt has dozens of pre-existing locale failures). 
- msvc 2022

For both toolsets above I see the 5 tests passing now, whereas they were failing previously (and no new failures in overall check-cxx).

- Also ran check-cxx tests with C++03 on the above windows toolsets.


Individual test commands:
```
python ./bin/llvm-lit.py --filter std/localization/locale.categories/category.monetary/locale.moneypunct.byname/thousands_sep.pass.cpp libcxx/test -v -a

python ./bin/llvm-lit.py --filter std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp libcxx/test -v -a

python ./bin/llvm-lit.py --filter std/time/time.duration/time.duration.nonmember/ostream.pass.cpp libcxx/test -v -a

python ./bin/llvm-lit.py --filter std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp libcxx/test -v -a

python ./bin/llvm-lit.py --filter std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp libcxx/test -v -a
```

---
Full diff: https://github.com/llvm/llvm-project/pull/83967.diff


6 Files Affected:

- (modified) libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp (+1-1) 
- (modified) libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp (+1-1) 
- (modified) libcxx/test/std/localization/locale.categories/category.monetary/locale.moneypunct.byname/thousands_sep.pass.cpp (+9-1) 
- (modified) libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp (+9-1) 
- (modified) libcxx/test/std/time/time.duration/time.duration.nonmember/ostream.pass.cpp (+10-4) 
- (modified) libcxx/test/support/locale_helpers.h (+86-3) 


``````````diff
diff --git a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp
index 3effa80e7d6f79..1b594d8c5b488a 100644
--- a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp
@@ -54,7 +54,7 @@ class my_facetw
 };
 
 static std::wstring convert_thousands_sep(std::wstring const& in) {
-  return LocaleHelpers::convert_thousands_sep_fr_FR(in);
+  return LocaleHelpers::convert_mon_thousands_sep_fr_FR(in);
 }
 #endif // TEST_HAS_NO_WIDE_CHARACTERS
 
diff --git a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp
index 05b4ee474944af..02112e06c4a122 100644
--- a/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp
@@ -54,7 +54,7 @@ class my_facetw
 };
 
 static std::wstring convert_thousands_sep(std::wstring const& in) {
-  return LocaleHelpers::convert_thousands_sep_fr_FR(in);
+  return LocaleHelpers::convert_mon_thousands_sep_fr_FR(in);
 }
 #endif // TEST_HAS_NO_WIDE_CHARACTERS
 
diff --git a/libcxx/test/std/localization/locale.categories/category.monetary/locale.moneypunct.byname/thousands_sep.pass.cpp b/libcxx/test/std/localization/locale.categories/category.monetary/locale.moneypunct.byname/thousands_sep.pass.cpp
index 2a70741d2a0fa6..856598a5c45b66 100644
--- a/libcxx/test/std/localization/locale.categories/category.monetary/locale.moneypunct.byname/thousands_sep.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/category.monetary/locale.moneypunct.byname/thousands_sep.pass.cpp
@@ -29,6 +29,10 @@
 #include "test_macros.h"
 #include "platform_support.h" // locale name macros
 
+#if defined(_WIN32) && !defined(TEST_HAS_NO_WIDE_CHARACTERS)
+#include "locale_helpers.h"
+#endif
+
 class Fnf
     : public std::moneypunct_byname<char, false>
 {
@@ -115,7 +119,11 @@ int main(int, char**)
 #if defined(_CS_GNU_LIBC_VERSION)
     const wchar_t fr_sep = glibc_version_less_than("2.27") ? L' ' : L'\u202F';
 #elif defined(_WIN32)
-    const wchar_t fr_sep = L'\u00A0';
+    // Windows has changed it's fr thousands sep between releases.
+    // Fetch the host's separator in order to know what to expect from the test results.
+    const std::wstring fr_sep_s = LocaleHelpers::get_locale_mon_thousands_sep(LOCALE_fr_FR_UTF_8);
+    assert(fr_sep_s.size() == 1);
+    const wchar_t fr_sep = fr_sep_s[0];
 #elif defined(_AIX)
     const wchar_t fr_sep = L'\u202F';
 #else
diff --git a/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp b/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
index d7e1178c92e041..0c0b04ab710067 100644
--- a/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
+++ b/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
@@ -28,6 +28,10 @@
 #include "test_macros.h"
 #include "platform_support.h" // locale name macros
 
+#if defined(_WIN32) && !defined(TEST_HAS_NO_WIDE_CHARACTERS)
+#include "locale_helpers.h"
+#endif
+
 int main(int, char**)
 {
     {
@@ -78,7 +82,11 @@ int main(int, char**)
 #if defined(_CS_GNU_LIBC_VERSION)
             const wchar_t wsep = glibc_version_less_than("2.27") ? L' ' : L'\u202f';
 #elif defined(_WIN32)
-            const wchar_t wsep = L'\u00A0';
+            // Windows has changed it's fr thousands sep between releases.
+            // Fetch the host's separator in order to know what to expect from the test results.
+            const std::wstring wsep_s = LocaleHelpers::get_locale_mon_thousands_sep(LOCALE_fr_FR_UTF_8);
+            assert(wsep_s.size() == 1);
+            const wchar_t wsep = wsep_s[0];
 #else
             const wchar_t wsep = L',';
 #endif
diff --git a/libcxx/test/std/time/time.duration/time.duration.nonmember/ostream.pass.cpp b/libcxx/test/std/time/time.duration/time.duration.nonmember/ostream.pass.cpp
index e5d11ab4672bdf..db318e3db4e84f 100644
--- a/libcxx/test/std/time/time.duration/time.duration.nonmember/ostream.pass.cpp
+++ b/libcxx/test/std/time/time.duration/time.duration.nonmember/ostream.pass.cpp
@@ -36,6 +36,10 @@
 #include "platform_support.h" // locale name macros
 #include "test_macros.h"
 
+#if defined(_WIN32) && !defined(TEST_HAS_NO_WIDE_CHARACTERS)
+#include "locale_helpers.h"
+#endif
+
 #define SV(S) MAKE_STRING_VIEW(CharT, S)
 
 template <class CharT, class Rep, class Period>
@@ -89,10 +93,12 @@ static void test_values() {
 #endif
   } else {
 #ifdef _WIN32
-    assert(stream_fr_FR_locale<CharT>(-1'000'000s) == SV("-1\u00A0000\u00A0000s"));
-    assert(stream_fr_FR_locale<CharT>(1'000'000s) == SV("1\u00A0000\u00A0000s"));
-    assert(stream_fr_FR_locale<CharT>(-1'000.123456s) == SV("-1\u00A0000,1235s"));
-    assert(stream_fr_FR_locale<CharT>(1'000.123456s) == SV("1\u00A0000,1235s"));
+    std::wstring expected_sep = LocaleHelpers::get_locale_thousands_sep(LOCALE_fr_FR_UTF_8);
+    assert(expected_sep.size() == 1);
+    assert(stream_fr_FR_locale<CharT>(-1'000'000s) == LocaleHelpers::convert_thousands_sep(L"-1 000 000s", expected_sep[0]));
+    assert(stream_fr_FR_locale<CharT>(1'000'000s) == LocaleHelpers::convert_thousands_sep(L"1 000 000s", expected_sep[0]));
+    assert(stream_fr_FR_locale<CharT>(-1'000.123456s) == LocaleHelpers::convert_thousands_sep(L"-1 000,1235s", expected_sep[0]));
+    assert(stream_fr_FR_locale<CharT>(1'000.123456s) == LocaleHelpers::convert_thousands_sep(L"1 000,1235s", expected_sep[0]));
 #elif defined(__APPLE__)
     assert(stream_fr_FR_locale<CharT>(-1'000'000s) == SV("-1000000s"));
     assert(stream_fr_FR_locale<CharT>(1'000'000s) == SV("1000000s"));
diff --git a/libcxx/test/support/locale_helpers.h b/libcxx/test/support/locale_helpers.h
index 3eb24ebf28f524..9b18acc13cbbd8 100644
--- a/libcxx/test/support/locale_helpers.h
+++ b/libcxx/test/support/locale_helpers.h
@@ -41,11 +41,90 @@ std::wstring convert_thousands_sep(std::wstring const& in, wchar_t sep) {
   return out;
 }
 
+#if defined(_WIN32)
+// This implementation is similar to the locale_guard in the private libcxx implementation headers
+// but exists here for usability from the libcxx/test/std conformance test suites.
+class LocaleGuard {
+public:
+  explicit LocaleGuard(const char* locale_in) : status_(_configthreadlocale(_ENABLE_PER_THREAD_LOCALE)) {
+    assert(status_ != -1);
+    // Setting the locale can be expensive even when the locale given is
+    // already the current locale, so do an explicit check to see if the
+    // current locale is already the one we want.
+    const char* curr_locale = set_locale_asserts(nullptr);
+    // If every category is the same, the locale string will simply be the
+    // locale name, otherwise it will be a semicolon-separated string listing
+    // each category.  In the second case, we know at least one category won't
+    // be what we want, so we only have to check the first case.
+    if (std::strcmp(locale_in, curr_locale) != 0) {
+      locale_all_ = _strdup(curr_locale);
+      assert(locale_all_ != nullptr);
+      set_locale_asserts(locale_in);
+    }
+  }
+
+  ~LocaleGuard() {
+    // The CRT documentation doesn't explicitly say, but setlocale() does the
+    // right thing when given a semicolon-separated list of locale settings
+    // for the different categories in the same format as returned by
+    // setlocale(LC_ALL, nullptr).
+    if (locale_all_ != nullptr) {
+      set_locale_asserts(locale_all_);
+      free(locale_all_);
+    }
+    _configthreadlocale(status_);
+  }
+
+private:
+  static const char* set_locale_asserts(const char* locale_in) {
+    const char* new_locale = setlocale(LC_ALL, locale_in);
+    assert(new_locale != nullptr);
+    return new_locale;
+  }
+
+  int status_;
+  char* locale_all_ = nullptr;
+};
+
+template <typename T>
+std::wstring get_locale_lconv_cstr_member(const char* locale, T lconv::*cstr_member) {
+  // Store and later restore current locale
+  LocaleGuard g(locale);
+
+  char* locale_set = setlocale(LC_ALL, locale);
+  assert(locale_set != nullptr);
+  lconv* lc            = localeconv();
+  const char* selected = lc->*cstr_member;
+  if (selected == nullptr) {
+    // member is empty string on the locale
+    return std::wstring();
+  }
+
+  std::size_t len = std::mbsrtowcs(nullptr, &selected, 0, nullptr);
+  assert(len != static_cast<std::size_t>(-1));
+
+  std::wstring ws_out(len, L'\0');
+  std::mbstate_t mb = {};
+  std::size_t ret   = std::mbsrtowcs(&ws_out[0], &selected, len, &mb);
+  assert(ret != static_cast<std::size_t>(-1));
+
+  return ws_out;
+}
+
+std::wstring get_locale_mon_thousands_sep(const char* locale) {
+  return get_locale_lconv_cstr_member(locale, &lconv::mon_thousands_sep);
+}
+
+std::wstring get_locale_thousands_sep(const char* locale) {
+  return get_locale_lconv_cstr_member(locale, &lconv::thousands_sep);
+}
+#endif // _WIN32
+
 // GLIBC 2.27 and newer use U+202F NARROW NO-BREAK SPACE as a thousands separator.
 // This function converts the spaces in string inputs to U+202F if need
 // be. FreeBSD's locale data also uses U+202F, since 2018.
-// Windows uses U+00A0 NO-BREAK SPACE.
-std::wstring convert_thousands_sep_fr_FR(std::wstring const& in) {
+// Windows may use U+00A0 NO-BREAK SPACE or U+0202F NARROW NO-BREAK SPACE.
+std::wstring convert_mon_thousands_sep_fr_FR(std::wstring const& in) {
 #if defined(_CS_GNU_LIBC_VERSION)
   if (glibc_version_less_than("2.27"))
     return in;
@@ -54,7 +133,11 @@ std::wstring convert_thousands_sep_fr_FR(std::wstring const& in) {
 #elif defined(__FreeBSD__)
   return convert_thousands_sep(in, L'\u202F');
 #elif defined(_WIN32)
-  return convert_thousands_sep(in, L'\u00A0');
+  // Windows has changed it's fr thousands sep between releases,
+  // so we find the host's separator instead of hard-coding it.
+  std::wstring fr_sep_s = get_locale_mon_thousands_sep(LOCALE_fr_FR_UTF_8);
+  assert(fr_sep_s.size() == 1);
+  return convert_thousands_sep(in, fr_sep_s[0]);
 #else
   return in;
 #endif

``````````

</details>


https://github.com/llvm/llvm-project/pull/83967


More information about the libcxx-commits mailing list