[Parallel_libs-commits] [PATCH] D24951: Import/adapt the SLEEF vector math-function library as an LLVM runtime

Mon Sep 26 18:32:09 PDT 2016

hfinkel created this revision.
hfinkel added a subscriber: parallel_libs-commits.
Herald added subscribers: mgorny, beanz, mcrosier, aemerson.

This represents the start of my work to import and adapt the SLEEF vector math-function library, authored by Naoki Shibata, to LLVM. See https://github.com/shibatch/sleef for the original source. For the RFC, see: http://lists.llvm.org/pipermail/llvm-dev/2016-July/102254.html

I've not changed any of the meat of the implementation to make this patch, but I've tried to make it more like a runtime library (and I've made the source files C++ files instead of C files). All of the external functions start with __. The largest issue is how to deal with vector ISA/ABI compatibility. I've tried to properly separate the concerns of:

 1. For what processor is the runtime library itself being compiled
 2. For what vector ABIs are vectorized functions being made available

Aside from the scalar versions, which are pure C/C++ and always compiled, vector versions are compiled when possible. For example, we have __xsin and __xsinf, the scalar versions, __xsin__sse2 and __xsinf__sse2 (which use __m128d and __m128 types), __xsin__avx and __xsinf__avx (which use __m256d and __m256), and __xsin__avx2 and __xsinf__avx2 (which also use __m256d and __m256, although some functions use a different integer type compared to the avx versions). As many of these variants as possible are compiled into the library simultaneously.
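
To make the naming scheme concrete, here is a minimal sketch of one scalar entry point and one SSE2 entry point living side by side. It is not code from the patch: the demo_ names are hypothetical stand-ins (the real symbols use the implementation-reserved __ prefix and separator), std::sin stands in for the SLEEF kernels, and the array helper is only an assumed example of a caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics and the __m128d type
#endif

// Hypothetical scalar variant (the patch's counterpart is __xsin).
// std::sin is a placeholder for the real polynomial kernel.
double demo_xsin(double x) { return std::sin(x); }

#if defined(__SSE2__)
// Hypothetical SSE2 variant (the patch's counterpart is __xsin__sse2).
// It takes and returns __m128d, i.e. two doubles per call. The lanes are
// processed one at a time here only to keep the sketch short.
__m128d demo_xsin_sse2(__m128d v) {
  alignas(16) double lanes[2];
  _mm_store_pd(lanes, v);
  lanes[0] = demo_xsin(lanes[0]);
  lanes[1] = demo_xsin(lanes[1]);
  return _mm_load_pd(lanes);
}
#endif

// Assumed caller: uses the widest variant compiled in, then finishes the
// remaining elements with the scalar version.
void demo_sin_array(const double* in, double* out, std::size_t n) {
  assert(n == 0 || (in != nullptr && out != nullptr));
  std::size_t i = 0;
#if defined(__SSE2__)
  for (; i + 2 <= n; i += 2)
    _mm_storeu_pd(out + i, demo_xsin_sse2(_mm_loadu_pd(in + i)));
#endif
  for (; i < n; ++i)
    out[i] = demo_xsin(in[i]);
}
```

An AVX or AVX2 variant would follow the same pattern with __m256d and four doubles per call, which is why the vector width shows up in the symbol name.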

The library is implemented using intrinsics, not assembly, and so the associated target features must be enabled in the compiler to compile the relevant versions of these functions. By default, compilers on x86 often enable support only for SSE2 (i.e. not AVX or later ISAs). When the compiler supports flags to turn on AVX, AVX2, etc., the build adds them, but only for the files which require them. This is important because if you're building for an older core (or just trying to be portable), you don't want the compiler to start generating AVX instructions inside your SSE2 functions.
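
As a rough illustration of why the flags matter per file: GCC and Clang expose the enabled target features as predefined macros, so a translation unit can tell which ISA it is being compiled for. The sketch below is an assumption about how one might inspect this, not part of the patch; demo_enabled_isa is a hypothetical name, and other compilers may spell these macros differently.

```cpp
// Reports the widest vector ISA enabled for *this* translation unit.
// With GCC/Clang, -mavx2 defines __AVX2__, -mavx defines __AVX__, and
// x86-64 targets define __SSE2__ by default; __ARM_NEON is the ARM
// counterpart. A file built without -mavx never sees __AVX__, so the
// compiler will not emit AVX instructions for it.
constexpr const char* demo_enabled_isa() {
#if defined(__AVX2__)
  return "avx2";
#elif defined(__AVX__)
  return "avx";
#elif defined(__SSE2__)
  return "sse2";
#elif defined(__ARM_NEON)
  return "neon";
#else
  return "scalar";
#endif
}
```

Compiling the same file twice, once with default flags and once with -mavx, would give different answers, which mirrors the separate dp-sse2.cpp and dp-avx.cpp files each getting their own flags.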

For ARM, NEON is supported (although only single-precision currently).

I've not yet dealt with testing; the source on GitHub has testing programs. They make use of MPFR (a dependency I doubt we want), and, in part, perform randomized testing (and, at least for regression tests, we probably don't want that either). We need to figure out what we want to do here.

In any case, there's a lot to discuss here about code structure, naming conventions, testing, etc.

https://reviews.llvm.org/D24951

Files:
  CMakeLists.txt
  sleef/CMakeLists.txt
  sleef/include/__sleef.def
  sleef/include/sleef.h
  sleef/lib/CMakeLists.txt
  sleef/lib/avx.h
  sleef/lib/avx2.h
  sleef/lib/dd.h
  sleef/lib/df.h
  sleef/lib/dp-avx.cpp
  sleef/lib/dp-avx2.cpp
  sleef/lib/dp-scalar.cpp
  sleef/lib/dp-sse2.cpp
  sleef/lib/dp.cpp
  sleef/lib/fma4.h
  sleef/lib/isa.h
  sleef/lib/neon.h
  sleef/lib/nonnumber.h
  sleef/lib/sp-avx.cpp
  sleef/lib/sp-avx2.cpp
  sleef/lib/sp-neon.cpp
  sleef/lib/sp-scalar.cpp
  sleef/lib/sp-sse2.cpp
  sleef/lib/sp.cpp
  sleef/lib/sse2.h
  sleef/unittests/CMakeLists.txt

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D24951.72593.patch
Type: text/x-patch
Size: 257986 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/parallel_libs-commits/attachments/20160927/dd3f8ecf/attachment-0001.bin>