[libcxx-commits] [libcxx] [libc++] Speed up classic locale (PR #70631)

Mon Nov 13 00:53:54 PST 2023

================
@@ -14,17 +15,72 @@ double istream_numbers() {
   double f1 = 0.0, f2 = 0.0, q = 0.0;
   for (int i = 0; i < 3; i++) {
     std::istringstream s(a[i]);
+    if (l)
+      s.imbue(*l);
     s >> a1 >> a2 >> a3 >> f1 >> a4 >> a5 >> f2 >> a6 >> a7;
     q += (a1 + a2 + a3 + a4 + a5 + a6 + a7 + f1 + f2) / 1000000;
   }
   return q;
 }
 
+struct LocaleSelector {
+  std::locale* imbue;
+  std::locale old;
+
+  LocaleSelector(benchmark::State& state) {
+    static std::mutex mu;
+    std::lock_guard l(mu);
+    switch (state.range(0)) {
+    case 0: {
----------------
dvyukov wrote:

These are all different benchmarks:
```
Ostream_number/0/real_time/threads:1                   184.0n ± 1%   135.0n ± 1%  -26.63% (p=0.000 n=24)
Ostream_number/0/real_time/threads:72                17279.0n ± 4%   308.0n ± 1%  -98.22% (p=0.000 n=24)
Ostream_number/1/real_time/threads:1                   253.5n ± 1%   199.0n ± 1%  -21.50% (p=0.000 n=24)
Ostream_number/1/real_time/threads:72                13957.5n ± 2%   382.0n ± 0%  -97.26% (p=0.000 n=24)
Ostream_number/2/real_time/threads:1                   253.0n ± 1%   202.0n ± 1%  -20.16% (p=0.000 n=24)
Ostream_number/2/real_time/threads:72                  28.04µ ± 4%   18.80µ ± 7%  -32.96% (p=0.000 n=24)
Ostream_number/3/real_time/threads:1                   187.0n ± 1%   190.5n ± 0%   +1.87% (p=0.000 n=24)
Ostream_number/3/real_time/threads:72                  20.25µ ± 7%   20.37µ ± 5%        ~ (p=0.736 n=24)
```

Or do you mean copy-pasting and specializing the top benchmark function for each case?
Do you think that would be better? I would expect it to increase code size and add some duplication.
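
For illustration only, here is a minimal sketch of the two shapes being weighed. It is not the PR's code: `LocaleCase`, `format_number`, `format_number_specialized`, and `check_equivalent` are hypothetical names, and the sketch leaves out Google Benchmark entirely. The first function selects the locale through a runtime argument, so there is a single body; the template produces one instantiation per case, which is roughly where the extra code size would come from.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical selector; the meaning of cases 0..3 in the PR is not
// shown in the quoted hunk, so only two illustrative cases appear here.
enum class LocaleCase { None, Classic };

// Runtime-parameterized shape: one function body, the locale (if any)
// is passed in by the caller.
std::string format_number(int value, const std::locale* loc) {
  std::ostringstream s;
  if (loc)
    s.imbue(*loc);
  s << value;
  return s.str();
}

// Specialized shape: the case is a template parameter, so the compiler
// emits a separate copy of the body for each LocaleCase that is used.
template <LocaleCase C>
std::string format_number_specialized(int value) {
  std::ostringstream s;
  if (C == LocaleCase::Classic)  // constant per instantiation
    s.imbue(std::locale::classic());
  s << value;
  return s.str();
}

// Both shapes should produce the same text for the same locale choice.
void check_equivalent(int value) {
  assert(format_number(value, &std::locale::classic()) ==
         format_number_specialized<LocaleCase::Classic>(value));
}
```

The runtime version keeps a single body at the cost of a branch per call; the template version removes the branch but repeats the body once per case.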

https://github.com/llvm/llvm-project/pull/70631