[libcxx-commits] [libcxx] linear_congruential_engine: add using more precision to prevent overflow (PR #81583)

Tue Feb 13 01:27:04 PST 2024

LRFLEW wrote:

Ok, so here's the comment about the missing edge-case.

Performing 128-bit arithmetic on platforms without 128-bit integer types (which is mainly platforms with 32-bit registers) can get very complex. It's certainly possible, and The Art of Computer Programming, Volume 2 seems to be a popular source for the methods to use (IIRC, MSVC's STL references it in the source code it uses to handle this particular case), but it will still be rather involved.

From what I've seen, there are roughly three notable options for handling this:

 * With `_BitInt(N)` available in clang and included in the C23 standard (meaning other C compilers will have to implement it), it might just be a matter of waiting for the other supported compilers to implement it. There are a lot of possible pitfalls with this, because the other compilers may not make it available in C++, or may choose to make `BITINT_MAXWIDTH` the same as `ULLONG_WIDTH` (i.e. not providing 128-bit integers) on 32-bit platforms. However, if using `_BitInt(N)` works in all the supported compilers, then the entire implementation of `__lce_ta` could be dramatically simplified by merging all the cases of different precision.
 * There are probably other parts of the library where having access to 128-bit+ arithmetic might be beneficial. For example, `shuffle_order_engine` could replace its usage of floating point arithmetic with 128-bit integer arithmetic to improve accuracy (I think; I haven't tested if the current implementation results in problems), and a discussion of this on Discord mentioned that it could be beneficial for `generate_canonical`. In that case, the best option might be to implement a single 128-bit integer-like type that can be used across libcxx when 128-bit arithmetic is required.
 * Lastly, an additional case could simply be written that does all the long arithmetic manually. Doing it this way may provide some opportunity to optimize the calculation. For example, knowing that `(ax + c)` will always be less than `m^2` means we can skip dividing the high part of the intermediate value. However, this will be a lot of code, and it might be better to encapsulate the code than to optimize it.
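
To give a rough idea of the third option, here is a minimal sketch of computing `(a * x + c) % m` with 64-bit operands and no 128-bit type. It is not the code from the PR. The names `u128_parts`, `mul_64x64`, and `lce_step` are made up for illustration, and it assumes `m != 0` and `a`, `x`, `c` all less than `m`. Under that assumption `a * x + c <= (m - 1) * m < m^2`, so the high 64-bit word of the intermediate value is already less than `m` and can be used directly as the starting remainder. The reduction here is a plain shift-and-subtract loop, chosen for clarity rather than speed; a real implementation would probably use something faster, such as the division algorithm from Knuth.

```cpp
#include <cstdint>

// Hypothetical helper type: a 128-bit value stored as two 64-bit halves.
struct u128_parts {
  std::uint64_t hi;
  std::uint64_t lo;
};

// Schoolbook 64x64 -> 128-bit multiply using 32-bit limbs.
inline u128_parts mul_64x64(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t mask = 0xFFFFFFFFu;
  const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
  const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

  const std::uint64_t p00 = a_lo * b_lo;
  const std::uint64_t p01 = a_lo * b_hi;
  const std::uint64_t p10 = a_hi * b_lo;
  const std::uint64_t p11 = a_hi * b_hi;

  // Sum of three values below 2^32, so this cannot overflow 64 bits.
  const std::uint64_t mid = (p00 >> 32) + (p01 & mask) + (p10 & mask);

  u128_parts r;
  r.lo = (mid << 32) | (p00 & mask);
  r.hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
  return r;
}

// One LCG step: (a * x + c) mod m.
// Assumed preconditions: m != 0, and a, x, c are all less than m.
inline std::uint64_t lce_step(std::uint64_t a, std::uint64_t x,
                              std::uint64_t c, std::uint64_t m) {
  u128_parts t = mul_64x64(a, x);
  t.lo += c;
  if (t.lo < c) ++t.hi;  // propagate the carry from the low word

  // Because a*x + c < m^2 <= m * 2^64, t.hi < m already holds, so the
  // high word needs no division and serves as the initial remainder.
  std::uint64_t rem = t.hi;
  for (int i = 63; i >= 0; --i) {
    const std::uint64_t carry = rem >> 63;  // bit lost by the shift below
    rem = (rem << 1) | ((t.lo >> i) & 1u);
    // If the shift overflowed, the true value is 2^64 + rem, which is
    // still below 2*m, so one wrapping subtraction yields the remainder.
    if (carry || rem >= m) rem -= m;
  }
  return rem;
}
```

For example, `lce_step(2, 3, 1, 10)` yields `7`. The `m == 0` case (modulus `2^64`) is deliberately left out, since it reduces to ordinary wrapping arithmetic.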

I've experimented with writing some of this myself, so I *think* I could do it if necessary, but would appreciate some feedback on these options before I make any attempts, and also would appreciate if someone else was willing to take the lead on this.

https://github.com/llvm/llvm-project/pull/81583