[libc-commits] [PATCH] D115347: [libc] Optimize PolyEval for x86-64 with degree 3-6 polynomials.

Wed Dec 8 21:36:57 PST 2021

sivachandra added a comment.

The subject line and description are confusing. It seems to me like you have added specializations for 3rd and 5th degree polynomials. But, the description has "3-6 polynomials".

================
Comment at: libc/src/__support/FPUtil/PolyEval.h:36
+#ifdef LLVM_LIBC_ARCH_X86_64
+#include "immintrin.h"
+
----------------
Couple of things:

1. We want individual header files to be self-contained. So, this include should be moved to the header files where it is required.
2. Because of the same self-contained headers requirement, the entities defined in a header file should be in the `__llvm_libc` namespace explicitly  as opposed to ending up in a namespace when included like this here. So, these includes should be moved outside of the namespace here.

================
Comment at: libc/src/__support/FPUtil/x86_64/PolyEvalDouble.h:1
+//===-- Common header for PolyEval implementations for x86_64 double -- C++
+//-*-===//
----------------
Fix line length but not sure if the description is correct also.

================
Comment at: libc/src/__support/FPUtil/x86_64/PolyEvalFloat.h:1
+//===-- Common header for PolyEval implementations for x86_64 float --- C++
+//-*-===//
----------------
Fix line length but not sure if the description is correct also.

================
Comment at: libc/src/math/generic/CMakeLists.txt:503
+  COMPILE_OPTIONS
+    -mfma
 )
----------------
Can you add a comment explaining what in the implementation would be affected by this? AFAICT, we are either using the fma instructions directly, or are calling the fma related builtins. So, it is not clear to me as to why this should be required.

================
Comment at: libc/test/src/math/expm1f_test.cpp:111
       continue;
-    ASSERT_MPFR_MATCH(mpfr::Operation::Expm1, x, __llvm_libc::expm1f(x), 1.5);
+    ASSERT_MPFR_MATCH(mpfr::Operation::Expm1, x, __llvm_libc::expm1f(x), 2.2);
   }
----------------
If there is a possibility of trading off between performance and accuracy, we should provide build time switches for users to pick one or the other based on their needs. By default we want to "err" on the side of providing more accurate implementations over providing faster but less accurate implementations.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115347/new/

https://reviews.llvm.org/D115347