[llvm-dev] [RFC] Expose user provided vector function for auto-vectorization.

Tue May 28 13:55:46 PDT 2019

On Tue, May 28, 2019 at 2:45 PM Francesco Petrogalli via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> Dear all,
>
> This RFC is a proposal to provide auto-vectorization functionality for user provided vector functions.
>
> The proposal is a modification of an RFC that I have sent out a couple of months ago, with the title `[RFC] Re-implementing -fveclib with OpenMP` (see http://lists.llvm.org/pipermail/llvm-dev/2018-December/128426.html). The previous RFC is to be considered abandoned.
>
> The original RFC proposed re-implementing the `-fveclib` command line option. This proposal avoids that, and limits its scope to the mechanics of providing vector functions in user code that the compiler can pick up for auto-vectorization. This narrower scope limits the impact of the changes needed in both clang and LLVM.
>
> Please let me know what you think.
>
> Kind regards,
>
> Francesco
>
>
> =================================================================================
>
> Introduction
> ============
>
> This RFC encompasses the proposal of informing the vectorizer about the
> availability of vector functions provided by the user. The mechanism is
> based on the use of the directive `declare variant` introduced in OpenMP
> 5.0 [^1].
>
> The mechanism proposed has the following properties:
>
> 1.  Decouples the compiler front-end that knows about the availability
>     of vectorized routines, from the back-end that knows how to make use
>     of them.
> 2.  Enables support for a developer's own vector libraries without
>     requiring changes to the compiler.
> 3.  Enables other frontends (e.g. f18) to add scalar-to-vector function
>     mappings as relevant for their own runtime libraries, etc.

There is no way to tell the backend how conformant the SIMD versions
are. Although the initial spec said that floating-point status
registers would not be supported, supporting them is not difficult,
and the implementations that I wrote do so[1]. If a status flag is
then set after a calculation, the calculation can be rerun with the
scalar versions to determine exactly which operation(s) caused it.

[1]https://sourceware.org/ml/libc-alpha/2019-05/msg00595.html