[all-commits] [llvm/llvm-project] 35553d: [mlir] Add polynomial approximation for vectorized...
Emilio Cota via All-commits
all-commits at lists.llvm.org
Sat Oct 23 04:56:26 PDT 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 35553d452b32e9356352df8536fa0485207a9274
https://github.com/llvm/llvm-project/commit/35553d452b32e9356352df8536fa0485207a9274
Author: Emilio Cota <ecg at google.com>
Date: 2021-10-23 (Sat, 23 Oct 2021)
Changed paths:
M mlir/include/mlir/Dialect/Math/Transforms/Passes.h
M mlir/lib/Dialect/Math/Transforms/CMakeLists.txt
M mlir/lib/Dialect/Math/Transforms/PolynomialApproximation.cpp
M mlir/test/Dialect/Math/polynomial-approximation.mlir
M mlir/test/lib/Dialect/Math/CMakeLists.txt
M mlir/test/lib/Dialect/Math/TestPolynomialApproximation.cpp
A mlir/test/mlir-cpu-runner/X86Vector/lit.local.cfg
A mlir/test/mlir-cpu-runner/X86Vector/math_polynomial_approx_avx2.mlir
M utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
M utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
Log Message:
-----------
[mlir] Add polynomial approximation for vectorized math::Rsqrt
This patch adds a polynomial approximation that matches the
approximation in Eigen.
Note that the approximation only applies to vectorized inputs;
the scalar rsqrt is left unmodified.
The approximation is guarded by a flag because it emits an AVX2
intrinsic (generated via the X86Vector dialect). This is the only
reasonably clean way I could find to generate the exact approximation
that I wanted (i.e. one identical to Eigen's).
I considered two alternatives:
1. Introduce a Rsqrt intrinsic in LLVM, which doesn't exist yet.
I believe this is because there is no definition of Rsqrt that
all backends could agree on, since hardware instructions that
implement it have widely varying degrees of precision.
The standard could mandate a definition, but Rsqrt is not part of
IEEE 754, so I don't think this option is feasible.
2. Emit fdiv(1.0, sqrt) with fast math flags to allow reciprocal
transformations. Although portable, this doesn't allow us
to generate exactly the code we want; it is the LLVM backend,
not MLIR, that controls what code is generated for the
target CPU.
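
For readers unfamiliar with this kind of approximation, here is a minimal scalar sketch of the general idea: start from a coarse reciprocal-square-root estimate and refine it with one Newton-Raphson step. It is an illustration only, not the code added by this patch. It assumes the Eigen-style scheme is "hardware estimate plus one refinement step". The names coarseRsqrtEstimate and refineRsqrt are hypothetical, and the coarse estimate is simulated by truncating mantissa bits rather than by calling an AVX2 intrinsic.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware rsqrt estimate: computes the exact
// value, then clears the low 12 mantissa bits so that only a coarse
// approximation remains. A real implementation would use a vector
// instruction instead; this is an assumption made for illustration.
inline float coarseRsqrtEstimate(float x) {
  float y = 1.0f / std::sqrt(x);
  std::uint32_t bits;
  std::memcpy(&bits, &y, sizeof(bits));
  bits &= 0xFFFFF000u;  // keep sign, exponent, and the top 11 mantissa bits
  std::memcpy(&y, &bits, sizeof(bits));
  return y;
}

// One Newton-Raphson step for f(y) = 1/y^2 - x:
//   y_next = y * (1.5 - 0.5 * x * y * y)
// This sketch does not handle zero, infinite, negative, or denormal inputs.
inline float refineRsqrt(float x, float y) {
  return y * (1.5f - 0.5f * x * y * y);
}

// Coarse estimate followed by a single refinement step.
inline float approxRsqrt(float x) {
  return refineRsqrt(x, coarseRsqrtEstimate(x));
}
```

Each Newton-Raphson step roughly squares the relative error of the estimate, which is why a single step is enough to bring a coarse estimate close to single precision. Because the result depends on the exact estimate that the step starts from, matching another library's output requires using the same estimate instruction, and alternative 2 above gives no control over that choice.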
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D112192