[libc-commits] [PATCH] D115828: [libc] Implement correctly rounded log2f based on RLIBM library.

Santosh Nagarakatte via Phabricator via libc-commits libc-commits at lists.llvm.org
Thu Dec 23 05:54:53 PST 2021

santoshn added a comment.

In D115828#3207780 <https://reviews.llvm.org/D115828#3207780>, @zimmermann6 wrote:

> Dear Santosh,
>> Here is a new polynomial that is generated using the exact output compensation in the patch. Tue suggested to use the FMA based poly eval as he was observing performance regressions with the SIMD instruction in x86-64.
>> Polynomial: y=1.4426950408936214387267682468518614768981933593750000000000000000000000e+00 x^(1) + -7.2134752892795794831926059487159363925457000732421875000000000000000000e-01 x^(2) + 4.8090233829603024062748772848863154649734497070312500000000000000000000e-01 x^(3) + -3.6137987525825709944626851211069151759147644042968750000000000000000000e-01 x^(4) + 3.2929554893140711158139311010017991065979003906250000000000000000000000e-01 x^(5)
>> Polynomial evaluation used is as follows:
>> double t1 = fma(x, a5, a4);
>> double t2 = fma(x, t1, a3);
>> double t3 = fma(x, t2, a2);
>> double t4 = fma(x, t3, a1);
>> final result = fma(d, t4, extra_factor)
>> Can you check if it produces correctly rounded results for all inputs and all rounding modes?
> sure. If I converted the coefficients properly to hexadecimal values,
> there is still one incorrectly rounded result for rounding towards zero
> or down (same input x):
> libm wrong by up to 1.01e+00 ulp(s) [1] for x=0x1.03a16ap+0
> log2      gives 0x1.4cdc4ap-6
> mpfr_log2 gives 0x1.4cdc4cp-6
> Best regards,
> Paul

Dear Paul,

Thanks for testing it out.

I am seeing the exact oracle result in my local build for input 0x1.03a16ap+0: the result produced by the implementation is 0x1.4cdc4cp-6.

Have you commented out lines src/__support/FPUtil/PolyEval.h:38-42?

If you have not, your build is most likely using the x86-64 SIMD extensions that Tue suggested we not use. That could be one reason for the divergence we are seeing.

Thinking out loud: the round-to-nearest result for this input is exactly the same as the round-to-zero result, 0x1.4cdc4cp-6. Given that the implementation produces the correct round-to-nearest result and disagrees only for round-to-zero, is it possible that there is a bug in the test harness that checks round-to-zero results?
