[Parallel_libs-commits] [PATCH] D24951: Import/adapt the SLEEF vector math-function library as an LLVM runtime
Hal Finkel via Parallel_libs-commits
parallel_libs-commits at lists.llvm.org
Mon Sep 26 18:32:09 PDT 2016
hfinkel created this revision.
hfinkel added a subscriber: parallel_libs-commits.
Herald added subscribers: mgorny, beanz, mcrosier, aemerson.
This represents the start of my work to import and adapt the SLEEF vector math-function library, authored by Naoki Shibata, to LLVM. See https://github.com/shibatch/sleef for the original source. For the RFC, see: http://lists.llvm.org/pipermail/llvm-dev/2016-July/102254.html
I've not changed any of the meat of the implementation in order to make this patch, but I've tried to make it more like a runtime library (and I've made the source files C++ files instead of C files). All of the external functions start with __. The largest issue is how to deal with vector ISA/ABI compatibility. I've tried to properly separate the concerns of:
1. For what processor is the runtime library itself being compiled
2. For what vector ABIs are vectorized functions being made available
Aside from the scalar versions, which are pure C/C++ and always compiled, vector versions are compiled when possible. For example, we have __xsin and __xsinf, the scalar versions, __xsin__sse2 and __xsinf__sse2 (which use __m128d and __m128 types), __xsin__avx and __xsinf__avx (which use __m256d and __m256), and __xsin__avx2 and __xsinf__avx2 (which also use __m256d and __m256, although some functions use a different integer type compared to the avx versions). As many of these variants as possible are compiled into the library simultaneously.
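To make the variant scheme concrete, here is a minimal sketch of a scalar function sitting next to an SSE2 variant that is only compiled when the compiler has SSE2 enabled. The demo_ names are hypothetical stand-ins for __xsin and __xsin__sse2, and the vector body just evaluates each lane with std::sin; it is not the SLEEF implementation, only an illustration of the layout.

```cpp
#include <cassert>
#include <cmath>

#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics and the __m128d type
#endif

// Scalar variant: plain C++, always compiled.
// (Hypothetical stand-in for __xsin.)
inline double demo_xsin(double x) { return std::sin(x); }

#if defined(__SSE2__)
// SSE2 variant: takes and returns two doubles packed in a __m128d.
// (Hypothetical stand-in for __xsin__sse2. The lanes are evaluated one at a
// time here; a real vector implementation would use intrinsics throughout.)
inline __m128d demo_xsin_sse2(__m128d v) {
  alignas(16) double lanes[2];
  _mm_store_pd(lanes, v);
  lanes[0] = demo_xsin(lanes[0]);
  lanes[1] = demo_xsin(lanes[1]);
  return _mm_load_pd(lanes);
}
#endif

// Returns true when every variant compiled into this translation unit
// agrees with the scalar variant on the inputs a and b.
inline bool demo_variants_agree(double a, double b) {
#if defined(__SSE2__)
  alignas(16) double out[2];
  // _mm_set_pd takes the high lane first, so a lands in lane 0.
  _mm_store_pd(out, demo_xsin_sse2(_mm_set_pd(b, a)));
  return std::fabs(out[0] - demo_xsin(a)) < 1e-12 &&
         std::fabs(out[1] - demo_xsin(b)) < 1e-12;
#else
  (void)a;
  (void)b;
  return true;  // only the scalar variant exists in this build
#endif
}

// Small self-check that exercises whichever variants were compiled.
inline void demo_check_variants() {
  assert(demo_variants_agree(0.5, 1.25));
}
```

Calling demo_check_variants() from a test program exercises whichever variants were built. On a target without SSE2 the same source still compiles and only the scalar path is checked.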
The library is implemented using intrinsics, not assembly, and so the associated target features must be enabled in the compiler to compile the relevant versions of these functions. By default, compilers on x86 often enable support only for SSE2 (i.e. not AVX or later ISAs). When the compiler supports flags to turn on AVX, AVX2, etc., the build will add them, but only for the files which require them. This is important because if you're building for an older core (or just trying to be portable), you don't want the compiler to start generating AVX instructions inside your SSE2 functions.
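As a rough illustration of why the flags matter per file: the compiler advertises the target features enabled for a translation unit through predefined macros, so the same source reports different ISAs depending on how that one file was compiled. The sketch below assumes the usual GCC/Clang macro names (__SSE2__, __AVX__, __AVX2__, __ARM_NEON); the demo_ helpers are hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <string>

// Lists the vector ISAs whose target features are enabled for this
// translation unit, based on the compiler's predefined macros.
inline std::string demo_enabled_isas() {
  std::string isas = "scalar";  // the scalar code needs no special flags
#if defined(__SSE2__)
  isas += " sse2";
#endif
#if defined(__AVX__)
  isas += " avx";
#endif
#if defined(__AVX2__)
  isas += " avx2";
#endif
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  isas += " neon";
#endif
  return isas;
}

// True when `isa` appears as a whole word in demo_enabled_isas().
inline bool demo_has_isa(const std::string &isa) {
  const std::string padded = " " + demo_enabled_isas() + " ";
  return padded.find(" " + isa + " ") != std::string::npos;
}

// Sanity check: scalar is always present, and enabling AVX2 implies AVX.
inline void demo_check_isas() {
  assert(demo_has_isa("scalar"));
  if (demo_has_isa("avx2"))
    assert(demo_has_isa("avx"));
}
```

Compiled with default flags on x86-64, demo_enabled_isas() would typically include sse2 but not avx. Adding -mavx2 for just this file changes the answer for this file only, which is the isolation the build is aiming for.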
For ARM, NEON is supported (although only single-precision currently).
I've not yet dealt with testing. The source on GitHub has testing programs, but they make use of MPFR (a dependency I doubt we want) and, in part, perform randomized testing (which, at least for regression tests, we probably don't want either). We need to figure out what we want to do here.
In any case, there's a lot to discuss here about code structure, naming conventions, testing, etc.
https://reviews.llvm.org/D24951
Files:
CMakeLists.txt
sleef/CMakeLists.txt
sleef/include/__sleef.def
sleef/include/sleef.h
sleef/lib/CMakeLists.txt
sleef/lib/avx.h
sleef/lib/avx2.h
sleef/lib/dd.h
sleef/lib/df.h
sleef/lib/dp-avx.cpp
sleef/lib/dp-avx2.cpp
sleef/lib/dp-scalar.cpp
sleef/lib/dp-sse2.cpp
sleef/lib/dp.cpp
sleef/lib/fma4.h
sleef/lib/isa.h
sleef/lib/neon.h
sleef/lib/nonnumber.h
sleef/lib/sp-avx.cpp
sleef/lib/sp-avx2.cpp
sleef/lib/sp-neon.cpp
sleef/lib/sp-scalar.cpp
sleef/lib/sp-sse2.cpp
sleef/lib/sp.cpp
sleef/lib/sse2.h
sleef/unittests/CMakeLists.txt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D24951.72593.patch
Type: text/x-patch
Size: 257986 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/parallel_libs-commits/attachments/20160927/dd3f8ecf/attachment-0001.bin>