[libc-commits] [libc] [libc] Optimize BigInt→decimal in IntegerToString (PR #123580)
via libc-commits
libc-commits at lists.llvm.org
Tue Jan 21 06:52:57 PST 2025
================
@@ -164,6 +164,168 @@ template <size_t radix> using Custom = details::Fmt<radix>;
} // namespace radix
+// Extract the low-order decimal digit from a value of integer type T. The
+// returned value is the digit itself, from 0 to 9. The input value is passed
+// by reference, and modified by dividing by 10, so that iterating this
+// function extracts all the digits of the original number one at a time from
+// low to high.
+template <typename T, cpp::enable_if_t<cpp::is_integral_v<T>, int> = 0>
+LIBC_INLINE uint8_t extract_decimal_digit(T &value) {
+ const uint8_t digit(static_cast<uint8_t>(value % 10));
+ // For built-in integer types, we assume that an adequately fast division is
+ // available. If hardware division isn't implemented, then with a divisor
+ // known at compile time the compiler might be able to generate an optimized
+ // sequence instead.
+ value /= 10;
+ return digit;
+}
+
+// A specialization of extract_decimal_digit for the BigInt type in big_int.h,
+// avoiding the use of general-purpose BigInt division which is very slow.
+template <typename T, cpp::enable_if_t<is_big_int_v<T>, int> = 0>
+LIBC_INLINE uint8_t extract_decimal_digit(T &value) {
+ // There are two essential ways you can turn n into (n/10,n%10). One is
+ // ordinary integer division. The other is a modular-arithmetic approach in
+ // which you first compute n%10 by bit twiddling, then subtract it off to get
+ // a value that is definitely a multiple of 10. Then you divide that by 10 in
+ // two steps: shift right to divide off a factor of 2, and then divide off a
+ // factor of 5 by multiplying by the modular inverse of 5 mod 2^BITS. (That
+ // last step only works if you know there's no remainder, which is why you
+ // had to subtract off the output digit first.)
+ //
+ // Either approach can be made to work in linear time. This code uses the
+ // modular-arithmetic technique, because the other approach either does a lot
+ // of integer divisions (requiring a fast hardware divider), or else uses a
+ // "multiply by an approximation to the reciprocal" technique which depends
+ // on careful error analysis which might go wrong in an untested edge case.
+
+ using Word = typename T::word_type;
+
+ // Find the remainder (value % 10). We do this by breaking up the input
+ // integer into chunks of size WORD_SIZE/2, so that the sum of them doesn't
+ // overflow a Word. Then we sum all the half-words times 6, except the bottom
+ // one, which is added to that sum without scaling.
+ //
+ // Why 6? Because you can imagine that the original number had the form
+ //
+ // halfwords[0] + K*halfwords[1] + K^2*halfwords[2] + ...
+ //
+ // where K = 2^(WORD_SIZE/2). Since WORD_SIZE is expected to be a multiple of
+ // 8, that makes WORD_SIZE/2 a multiple of 4, so that K is a power of 16. And
+ // all powers of 16 (larger than 1) are congruent to 6 mod 10, by induction:
+ // 16 itself is, and 6^2=36 is also congruent to 6.
+ Word acc_remainder = 0;
+ const Word HALFWORD_BITS = T::WORD_SIZE / 2;
+ const Word HALFWORD_MASK = ((Word(1) << HALFWORD_BITS) - 1);
----------------
lntue wrote:
Should this be `constexpr`?
https://github.com/llvm/llvm-project/pull/123580
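
For readers following the comments in the quoted hunk, here is a minimal sketch of the modular-arithmetic technique, applied to a plain `uint64_t` instead of `BigInt`. It is not the code from the PR: the names `mod10_by_halfwords`, `div10_by_inverse` and `INV5_MOD_2_64` are hypothetical, the half-word size is fixed at 16 bits, and the final reduction of the small accumulator uses an ordinary `% 10`, which is an assumption because the hunk is cut off before that step. The constants are declared `constexpr`, which is one way the question above could be addressed.

```cpp
#include <cstdint>

// Hypothetical names for illustration; none of these appear in the patch.

// Modular inverse of 5 modulo 2^64: 5 * INV5_MOD_2_64 wraps around to 1.
constexpr uint64_t INV5_MOD_2_64 = 0xCCCCCCCCCCCCCCCDull;
static_assert(5 * INV5_MOD_2_64 == 1, "not the inverse of 5 mod 2^64");

// Compute n % 10 by summing 16-bit half-words. Every positive power of
// 2^16 is congruent to 6 mod 10, so each half-word above the lowest one
// is scaled by 6 and the lowest one is added unscaled.
constexpr unsigned mod10_by_halfwords(uint64_t n) {
  constexpr unsigned HALFWORD_BITS = 16;
  constexpr uint64_t HALFWORD_MASK = (uint64_t(1) << HALFWORD_BITS) - 1;
  uint64_t acc = n & HALFWORD_MASK;
  for (n >>= HALFWORD_BITS; n != 0; n >>= HALFWORD_BITS)
    acc += 6 * (n & HALFWORD_MASK);
  // The accumulator is small (below 2^20 here), so this reduction is cheap.
  // Assumption: the patch may reduce it differently.
  return static_cast<unsigned>(acc % 10);
}

// Compute n / 10 without a wide division: subtract the low digit to get an
// exact multiple of 10, shift right to remove the factor of 2, then remove
// the factor of 5 by multiplying by its inverse modulo 2^64.
constexpr uint64_t div10_by_inverse(uint64_t n) {
  const uint64_t multiple_of_10 = n - mod10_by_halfwords(n);
  return (multiple_of_10 >> 1) * INV5_MOD_2_64;
}
```

Calling `mod10_by_halfwords(n)` and then `div10_by_inverse(n)` in a loop yields the decimal digits of `n` from low to high. The multiplication step is only exact because the remainder was subtracted first, which matches the parenthetical note in the quoted comment.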