[libcxx-commits] [PATCH] D97705: [RFC][libc++] Improve std::to_chars for base != 10.

Mon Mar 1 10:13:03 PST 2021

Mordante created this revision.
Mordante added reviewers: EricWF, ldionne, mclow.lists, miscco, curdeius.
Mordante requested review of this revision.
Herald added a project: libc++.
Herald added a subscriber: libcxx-commits.
Herald added a reviewer: libc++.

While working on D70631 <https://reviews.llvm.org/D70631> Microsoft's unit tests discovered an issue. This
is an RFC proposal to look at the direction we want to take to solve the
issue. The code is just a proof-of-concept and not intended to be the
final version.

I noticed our `to_chars` implementation for bases != 10 uses the range
[first, last) as temporary buffer. This violates the contract for
to_chars:
[charconv.to.chars]/1 http://eel.is/c++draft/charconv#to.chars-1
`to_chars_result to_chars(char* first, char* last, see below value, int base = 10);`
"If the member ec of the return value is such that the value is equal to
the value of a value-initialized errc, the conversion was successful and
the member ptr is the one-past-the-end pointer of the characters
written."

Our implementation modifies the range `[member ptr, last)`, which causes
Microsoft's test to fail. Their test verifies the buffer
`[member ptr, last)` is unchanged. (The test is only done when the
conversion is successful.)

I've some solutions to fix this issue, in basics the solutions all do
the same:

- Determine the required output length.
- If the output is too small return an error.
- Else write the data and return success.

I've made 4 benchmarks:

- `to_chars` the current implementation,
- `to_chars_inplace` a fix using the base as function argument,
- `to_chars_inplace_multiplexed` a fix using the base as template argument,
- `to_chars_inplace_multiplexed_for_format` a combination of the two, bases 2, 8, 16 using template arguments the other function arguments.

The implementation of base 10 hasn't been modified.

The sizes of the 4 benchmarks on an amd64 are:

  text	   data	    bss	    dec	    hex	filename

226129	   2620	   4936	 233685	  390d5	to_chars.libcxx.out
 226257	   2620	   4936	 233813	  39155	to_chars_inplace.libcxx.out
 226641	   2620	   4936	 234197	  392d5	to_chars_inplace_multiplexed_for_format.libcxx.out
 232289	   2620	   4936	 239845	  3a8e5	to_chars_inplace_multiplexed.libcxx.out

The results of the benchmarks are available in this patch. Based on
these results it's clear the multiplexed version has a better
performance at the cost of about 6KB of additional code size. At 0.5KB
we can at least make 2, 8, 16 faster, these bases are common and will be
available in `std::format`.

The inplace version is slower, but faster if the buffer is too small. I
expect that to be a rare case so I don't think these gains are too
important. I didn't test our current algorithm with a stack buffer, but
would expect similar performance as our current algorithm.

Proposed solutions:

1. Use multiplexed for all cases.
2. Use multiplexed for 2, 8, 16 and the others should use the original algorithm, but using a temporary stack buffer.
3. Investigate the effects of inlining the multiplexed versions for 2, 8, and 16 and the others as external template in charconv.cpp. (Leaving base10 unchanged.)

I would propose to use option 3.

When proceeding some additional improvements should be investigated:

- Determining the required number of digits for the values with a power of 2 can probably done faster with bit operations.
- The base 10 version is slower, I expect this to be fixed by using an if statement instead of a case statement.
- There's no code size optimization to share 8, 16, and 32 bit code like it has been done for base 10.
- The base 2 and base 8 might benefit by processing 16 bits per iteration.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D97705

Files:
  libcxx/benchmarks/to_chars.bench.cpp
  libcxx/benchmarks/to_chars_inplace.bench.cpp
  libcxx/benchmarks/to_chars_inplace_multiplexed.bench.cpp
  libcxx/benchmarks/to_chars_inplace_multiplexed_for_format.bench.cpp
  libcxx/include/charconv
  libcxx/to_chars.txt
  libcxx/to_chars_inplace.txt
  libcxx/to_chars_inplace_multiplexed.txt
  libcxx/to_chars_inplace_multiplexed_for_format.txt

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D97705.327170.patch
Type: text/x-patch
Size: 41839 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/libcxx-commits/attachments/20210301/0d1add72/attachment-0001.bin>