[llvm-dev] Enable "#pragma omp declare simd" in the LoopVectorizer

Wed Nov 30 07:13:01 PST 2016

I have sent out an equivalent RFC email for this functionality, as
requested in the review https://reviews.llvm.org/D27250

Please use the new thread with “RFC” in it.

Thanks,

Francesco

On 30/11/2016 11:46, "llvm-dev on behalf of Francesco Petrogalli via
llvm-dev" <llvm-dev-bounces at lists.llvm.org on behalf of
llvm-dev at lists.llvm.org> wrote:

>Dear all,
>
>I have just created a couple of differential reviews to enable the
>vectorisation of loops that have function calls to routines marked with
>“#pragma omp declare simd”.
>
>They can be (re)viewed here:
>
>* https://reviews.llvm.org/D27249
>
>* https://reviews.llvm.org/D27250
>
>The current implementation allows the loop vectorizer to generate vector
>code for a source file such as:
>
>  #pragma omp declare simd
>  double f(double x);
>
>  void aaa(double *x, double *y, int N) {
>    for (int i = 0; i < N; ++i) {
>      x[i] = f(y[i]);
>    }
>  }
>
>
>by invoking clang with arguments:
>
>  $> clang -fopenmp -c -O3 file.c […]
>
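
As a rough illustration only (this is not the output of the patches), the
vectorized loop conceptually has the shape sketched below. The type vec2d,
the function f_vec2, the body given to f, and the fixed width of 2 lanes
are all hypothetical stand-ins; a real compiler would use SIMD registers
and call a vector symbol provided by the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2-lane vector type standing in for a SIMD register. */
typedef struct { double lane[2]; } vec2d;

/* Scalar function. In the email it is only declared, under
   "#pragma omp declare simd"; this body is an assumption so that the
   sketch can be compiled and run. */
double f(double x) { return 2.0 * x + 1.0; }

/* Hypothetical 2-lane vector variant that a vector library would provide. */
vec2d f_vec2(vec2d x) {
  vec2d r;
  for (int l = 0; l < 2; ++l)
    r.lane[l] = f(x.lane[l]);
  return r;
}

/* Conceptual shape of the loop after vectorization with a factor of 2. */
void aaa_vectorized(double *x, const double *y, int N) {
  assert(x != NULL && y != NULL && N >= 0);
  int i = 0;
  /* Main loop: two iterations of the scalar loop per vector call. */
  for (; i + 2 <= N; i += 2) {
    vec2d in = {{y[i], y[i + 1]}};
    vec2d out = f_vec2(in);
    x[i] = out.lane[0];
    x[i + 1] = out.lane[1];
  }
  /* Remainder loop: leftover iterations still call the scalar function. */
  for (; i < N; ++i)
    x[i] = f(y[i]);
}
```

Calling aaa_vectorized(x, y, N) should fill x with the same values as the
scalar loop in aaa, including when N is odd.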
>
>Such functionality should provide a nice interface for vector library
>developers, who can use it to inform the loop vectorizer that an external
>library provides vector implementations of the scalar functions called in
>the loops. All that is needed is to mark the function declaration in the
>header file of the library with “#pragma omp declare simd” and to generate
>the associated symbols in the object file of the library according to the
>name scheme of the vector ABI (see notes below).
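
To give an idea of what such a name scheme looks like, here is a small
sketch that composes a symbol name in the style of the x86 vector function
ABI: "_ZGV", then an ISA letter, a mask letter, the vector length, one
letter per parameter, an underscore, and the scalar name. The helper
vector_symbol_name is hypothetical and is not part of the patches; the
mangling that the TLII produces for the globals, and the AArch64 prefix,
may well differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build a vector-variant symbol name of the form
   _ZGV<isa><mask><vlen><params>_<scalar_name>.
   isa:    ISA class letter, e.g. 'b' for SSE in the x86 scheme
   mask:   'N' for unmasked, 'M' for masked
   vlen:   number of lanes
   params: one letter per parameter, e.g. 'v' vector, 'u' uniform
   Returns the value of snprintf (length of the full name). */
int vector_symbol_name(char *out, size_t out_size, char isa, char mask,
                       unsigned vlen, const char *params,
                       const char *scalar_name) {
  assert(out != NULL && out_size > 0 && vlen > 0);
  assert(params != NULL && strlen(params) > 0);
  assert(scalar_name != NULL && strlen(scalar_name) > 0);
  return snprintf(out, out_size, "_ZGV%c%c%u%s_%s",
                  isa, mask, vlen, params, scalar_name);
}
```

Under these assumptions, a 2-lane unmasked SSE variant of "double
f(double)" would be named _ZGVbN2v_f, and a variant of pow with a uniform
second argument would carry "vu" in the parameters section.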
>
>I am interested in any feedback/suggestion/review the community might have
>regarding this behaviour.
>
>Below you will find a description of the implementation and some notes.
>
>Thanks,
>
>Francesco 
>
>-----------
>
>The functionality is implemented as follows:
>
>1. Clang CodeGen generates a set of global external variables for each of
>the function declarations marked with the OpenMP pragma. Each of these
>globals is named according to a mangling scheme generated by
>llvm::TargetLibraryInfoImpl (TLII), and holds the vector signature of the
>associated vector function. (See the examples in the tests of the clang
>patch. Each scalar function can generate multiple vector functions,
>depending on the clauses of the declare simd directives.)
>2. When clang creates the TLII, it processes the llvm::Module and finds
>out which of the globals of the module have the correct mangling and type,
>so that they can be added to the TLII as a list of vector functions
>associated with the original scalar one.
>3. The LoopVectorizer looks up the available vector functions through the
>TLII not by scalar name and vectorisation factor but by scalar name and
>vector function signature. This lets the vectorizer distinguish a “vector
>vpow1(vector x, vector y)” from a “vector vpow2(vector x, scalar y)”. (The
>second one corresponds to a “declare simd uniform(y)” for a “scalar
>pow(scalar x, scalar y)” declaration.) (Notice that the changes in the
>loop vectorizer are minimal.)
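
The lookup described in step 3 can be sketched as a table keyed by scalar
name plus signature. This is only a simplified model: the struct
vector_variant, the function find_vector_variant, and the encoding of a
signature as a string of parameter kinds ('v' for vector, 'u' for
uniform/scalar) are hypothetical. In the patches the signature is an LLVM
function type held by the TLII, not a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one vector variant of a scalar function. */
typedef struct {
  const char *scalar_name;  /* name of the original scalar function      */
  const char *param_kinds;  /* one letter per parameter: 'v' or 'u'       */
  const char *vector_name;  /* name of the matching vector implementation */
} vector_variant;

/* The two variants of pow used as an example in the email. */
static const vector_variant variants[] = {
  {"pow", "vv", "vpow1"},  /* vector vpow1(vector x, vector y) */
  {"pow", "vu", "vpow2"},  /* vector vpow2(vector x, scalar y) */
};

/* Look a variant up by scalar name AND signature, rather than by scalar
   name and vectorisation factor. Returns NULL when nothing matches. */
const char *find_vector_variant(const char *scalar_name,
                                const char *param_kinds) {
  assert(scalar_name != NULL && param_kinds != NULL);
  for (size_t i = 0; i < sizeof variants / sizeof variants[0]; ++i) {
    if (strcmp(variants[i].scalar_name, scalar_name) == 0 &&
        strcmp(variants[i].param_kinds, param_kinds) == 0)
      return variants[i].vector_name;
  }
  return NULL;
}
```

With this table, find_vector_variant("pow", "vv") selects vpow1 and
find_vector_variant("pow", "vu") selects vpow2; a lookup keyed only on the
scalar name and a vectorisation factor could not tell the two apart.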
>
>
>Notes:
>
>1. To enable only the SIMD part of OpenMP, leaving all the
>multithreading/target behaviour aside, this should also be enabled by a
>new option: -fopenmp-simd
>2. The AArch64 vector ABI in the code is essentially the same as the
>Intel one (apart from the prefix and the masking argument), and it is
>based on the clauses associated with “declare simd” in OpenMP 4.0. For
>OpenMP 4.5, the parameters section of the mangled name should be updated.
>This update will not change the vectorizer behaviour, as all the
>vectorizer needs to detect a vectorizable function is the original scalar
>name and a compatible vector function signature. Of course, any
>changes/updates in the ABI will have to be reflected in the symbols of the
>binary file of the library.
>3. While this currently works only for function declarations, the same
>functionality can be used when (if) clang implements the declare simd
>OpenMP pragma for function definitions.
>4. I have enabled this for any loop that calls the scalar function,
>not just for those annotated with “#pragma omp for simd”. I don’t have any
>preference here, but at the same time I don’t see any reason why this
>shouldn’t be enabled by default for non-annotated loops. Let me know if
>you disagree; I’d happily change the functionality if there are sound
>reasons for doing so.
>