[libc-commits] [llvm] [libc] [libc] Move printf long double to simple calc (PR #75414)

via libc-commits libc-commits at lists.llvm.org
Tue Jan 2 11:36:02 PST 2024


================
@@ -87,14 +87,25 @@ are not recommended to be adjusted except by persons familiar with the Printf
 Ryu Algorithm. Additionally they have no effect when float conversions are
 disabled.
 
+LIBC_COPT_FLOAT_TO_STR_NO_SPECIALIZE_LD
+---------------------------------------
+This flag disables the separate long double conversion implementation. It is
+not based on the Ryu algorithm, instead generating the digits by
+multiplying/dividing the written-out number by 10^9 to get blocks. It's
+significantly faster than INT_CALC, only about 10x slower than MEGA_TABLE,
----------------
michaelrj-google wrote:

The table is accessed sequentially. It's a flattened two-dimensional table, where the `POW10_OFFSET` array gives you the starting index of each value's entries in the larger array `POW10_SPLIT`. The negative indices have a similar but slightly different layout. Each number will map to one of the `POW10_OFFSET` values, then have an offset from that into the larger array which will be incremented or decremented (depending on whether this is the positive or negative exponent table).
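For illustration, here's a minimal sketch of that two-level lookup. `POW10_OFFSET` and `POW10_SPLIT` are the real table names, but the entry width, the accessor, and its exact index arithmetic are simplified assumptions, not the actual layout:

```cpp
#include <cstddef>
#include <cstdint>

// Real table names; the entry width shown here is illustrative only.
extern const uint16_t POW10_OFFSET[];   // start index for each exponent group
extern const uint64_t POW10_SPLIT[][3]; // flattened array of table entries

// Hypothetical accessor: POW10_OFFSET[exp_group] is the first entry belonging
// to this exponent group inside POW10_SPLIT; block_offset then walks the
// entries sequentially (it would be subtracted instead for the
// negative-exponent table).
inline const uint64_t *table_entry(size_t exp_group, size_t block_offset) {
  return POW10_SPLIT[POW10_OFFSET[exp_group] + block_offset];
}
```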

The 5MB table is already compressed. If you look at the top of `ryu_long_double_constants.h`, you'll see `TABLE_SHIFT_CONST`, `IDX_SIZE`, and `MID_INT_SIZE`. These are the constants we can adjust to try to shrink the table. There's a more comprehensive explanation of what these mean in `utils/mathtools/ryu_tablegen.py`, but here's the short version:

`MID_INT_SIZE` is the size of each entry in the `POW10_SPLIT` array. This needs to be at least `TABLE_SHIFT_CONST + IDX_SIZE + sizeof(BLOCK_INT)` so that it can actually fit the values.

`TABLE_SHIFT_CONST` (called `CONSTANT` in `ryu_tablegen.py`) adjusts the precision of each entry in the array, with higher values being more precise. It's 120 here because anything lower tends to introduce errors into the result.

`IDX_SIZE` is the compression factor: the number of exponents that can be mapped onto one entry in `POW10_OFFSET`. From my testing, it only works when it's a power of 2. Pushing it higher would require increasing `MID_INT_SIZE` a lot, which would significantly reduce the actual size savings and would also make the calculations slower.
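To make the size relationship concrete, here's a rough sanity-check sketch. Only the 120 for `TABLE_SHIFT_CONST` comes from this thread; the values for `IDX_SIZE` and `MID_INT_SIZE` are illustrative assumptions, and I'm assuming the constraint is expressed in bits:

```cpp
#include <cstddef>
#include <cstdint>

using BlockInt = uint32_t; // one 9-digit block fits in 32 bits

constexpr size_t TABLE_SHIFT_CONST = 120; // precision of each entry, in bits
constexpr size_t IDX_SIZE = 16;  // exponents folded onto one POW10_OFFSET entry (assumed)
constexpr size_t MID_INT_SIZE = 192; // width of each POW10_SPLIT entry, in bits (assumed)

// Each entry must be wide enough for the shifted value plus the extra bits
// from folding IDX_SIZE exponents together and from the block itself.
static_assert(MID_INT_SIZE >= TABLE_SHIFT_CONST + IDX_SIZE + sizeof(BlockInt) * 8,
              "MID_INT_SIZE is too small to hold the table values");
```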

I talked a bit with Ulf Adams (the original designer of the Ryu algorithm) and he suggested that it might be possible to further compress the table by approximating each next entry from the previous one by multiplying by `5**9`. In my testing this worked, but it required a higher `TABLE_SHIFT_CONST`, since you lose some precision with each approximation. In the end this only shrank the table a bit, so I decided it wasn't worthwhile.
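A very rough sketch of that idea, assuming entries are stored as wide fixed-point integers and that the `2**9` factor of `10**9` gets absorbed into the shift handled elsewhere at lookup time; the wide-integer type and helper here are placeholders, not the real libc machinery:

```cpp
#include <cstdint>

constexpr uint64_t FIVE_POW_9 = 1953125; // 5^9; note 10^9 = 5^9 * 2^9

// Placeholder wide-integer type standing in for the multi-word table entries.
struct WideInt {
  uint64_t limbs[4];
};

// Multiply a wide value by a 64-bit constant, dropping overflow past the top
// limb (fine for a sketch, not for the real table).
WideInt wide_mul_u64(const WideInt &a, uint64_t b) {
  WideInt out{};
  unsigned __int128 carry = 0;
  for (int i = 0; i < 4; ++i) {
    unsigned __int128 prod = (unsigned __int128)a.limbs[i] * b + carry;
    out.limbs[i] = (uint64_t)prod;
    carry = prod >> 64;
  }
  return out;
}

// Approximate the next table entry from the current one by multiplying by
// 5^9. Each approximation step loses a little precision, which is why
// TABLE_SHIFT_CONST would need to be higher if the table stored only every
// Nth entry and reconstructed the rest this way.
WideInt approximate_next_entry(const WideInt &current) {
  return wide_mul_u64(current, FIVE_POW_9);
}
```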

In conclusion, I believe the table is already compressed to within an order of magnitude of its minimum size, and that still makes it too large to be practical. Long doubles are rarely used, so carrying a large table all the time to speed them up seems like a bad idea.

https://github.com/llvm/llvm-project/pull/75414

