[llvm-dev] [RFC] Enable "#pragma omp declare simd" in the LoopVectorizer

Tian, Xinmin via llvm-dev llvm-dev at lists.llvm.org
Wed Nov 30 09:16:12 PST 2016


Hi Francesco,

Good to know that you are working on support for this feature. I assume you are aware of the RFC below. The VectorABI mangling we proposed was approved by David M from Google, the owner of C++ name mangling in the Clang FE, and the Clang FE support was committed to trunk by Alexey.

“Proposal for function vectorization and loop vectorization with function calls”, March 2, 2016. Intel Corp.  http://lists.llvm.org/pipermail/cfe-dev/2016-March/047732.html.

Matt submitted a patch to generate vector variants for function definitions, not just function declarations. You may want to take a look. Ayal's RFC will also be needed to support vectorization of function bodies in general.

I agree, we should have an option -fopenmp-simd to enable SIMD only; both GCC and ICC have similar options.

I suggest we sync up on this work so that we don't duplicate effort.

Thanks,
Xinmin

-----Original Message-----
From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Francesco Petrogalli via llvm-dev
Sent: Wednesday, November 30, 2016 7:11 AM
To: llvm-dev at lists.llvm.org
Cc: nd <nd at arm.com>
Subject: [llvm-dev] [RFC] Enable "#pragma omp declare simd" in the LoopVectorizer

Dear all,

I have just created a couple of differential reviews to enable the vectorisation of loops that have function calls to routines marked with “#pragma omp declare simd”.

They can be (re)viewed here:

* https://reviews.llvm.org/D27249

* https://reviews.llvm.org/D27250

The current implementation allows the loop vectorizer to generate vector code for a source file such as:

  #pragma omp declare simd
  double f(double x);

  void aaa(double *x, double *y, int N) {
    for (int i = 0; i < N; ++i) {
      x[i] = f(y[i]);
    }
  }


by invoking clang with the following arguments:

  $> clang -fopenmp -c -O3 file.c […]


This functionality should give developers of vector libraries a convenient way to inform the loop vectorizer that an external library provides vector implementations of the scalar functions called in loops. All that is needed is to mark the function declaration in the library's header file with “#pragma omp declare simd” and to generate the associated symbols in the library's object file according to the naming scheme of the vector ABI (see notes below).
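
As a rough illustration of the library side described above, the sketch below shows what a header and its user might look like. The names veclib.h, veclib_square and square_all are hypothetical and do not come from the patches; the scalar definition is included only so that the sketch compiles on its own. In a real library the vector symbols would be provided separately, under the names required by the vector ABI (in the x86 scheme used by GCC, for example, a 2-lane unmasked SSE variant of this function would be called _ZGVbN2v_veclib_square).

```c
/* --- veclib.h (hypothetical library header) --- */

/* Tells the compiler that vector variants of veclib_square exist. */
#pragma omp declare simd
double veclib_square(double x);

/* --- veclib.c (scalar fallback; the vector variants would live in the
 *     library's object file under their vector-ABI names) --- */
double veclib_square(double x) { return x * x; }

/* --- user code --- */
/* A loop of the kind targeted by the RFC: the callee is marked
 * "declare simd". When OpenMP support is off, the pragma is ignored
 * and the loop simply calls the scalar function. */
void square_all(double *out, const double *in, int n) {
  for (int i = 0; i < n; ++i) {
    out[i] = veclib_square(in[i]);
  }
}
```

The user code only sees the header; whether the loop ends up calling a vector variant depends on the compiler options and on the symbols the library actually exports.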

I am interested in any feedback/suggestion/review the community might have regarding this behaviour.

Below you will find a description of the implementation and some notes.

Thanks,

Francesco 

-----------

The functionality is implemented as follows:

1. Clang CodeGen generates a set of global external variables for each function declaration marked with the OpenMP pragma. Each of these globals is named according to a mangling generated by llvm::TargetLibraryInfoImpl (TLII) and holds the vector signature of the associated vector function. (See the examples in the tests of the clang patch. Each scalar function can generate multiple vector functions, depending on the clauses of the declare simd directives.)
2. When clang creates the TLII, it processes the llvm::Module and finds which of the module's globals have the correct mangling and type, so that they can be added to the TLII as a list of vector functions associated with the original scalar one.
3. The LoopVectorizer looks up the available vector functions through the TLII, not by scalar name and vectorisation factor but by scalar name and vector function signature. This lets the vectorizer distinguish a “vector vpow1(vector x, vector y)” from a “vector vpow2(vector x, scalar y)”. (The second one corresponds to a “declare simd uniform(y)” on a “scalar pow(scalar x, scalar y)” declaration.) (Notice that the changes in the loop vectorizer are minimal.)
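
To make the distinction between the two signatures concrete, here is a minimal sketch with two directives on one declaration. The names my_pow, pow_varying and pow_uniform are hypothetical, and the toy definition of my_pow (valid only for small non-negative integral exponents) is there only to keep the example self-contained. The first loop passes a different exponent on each iteration, which corresponds to the “vector, vector” signature; the second passes a loop-invariant exponent, which corresponds to the “vector, scalar” signature requested by uniform(y).

```c
/* Two directives: one variant with both arguments as vectors, and one
 * where y is uniform (the same scalar for every lane). */
#pragma omp declare simd
#pragma omp declare simd uniform(y)
double my_pow(double x, double y);

/* Toy scalar definition (assumption: y is a small non-negative integer). */
double my_pow(double x, double y) {
  double r = 1.0;
  for (int k = 0; k < (int)y; ++k) r *= x;
  return r;
}

/* Exponent varies per iteration: would match (vector x, vector y). */
void pow_varying(double *out, const double *x, const double *y, int n) {
  for (int i = 0; i < n; ++i) out[i] = my_pow(x[i], y[i]);
}

/* Exponent is loop-invariant: would match (vector x, scalar y). */
void pow_uniform(double *out, const double *x, double y, int n) {
  for (int i = 0; i < n; ++i) out[i] = my_pow(x[i], y);
}
```

Looking up by scalar name and vectorisation factor alone could not tell these two call sites apart, which is why the lookup in step 3 is keyed on the vector signature.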


Notes:

1. To enable only the SIMD part of OpenMP, leaving all the multithreading/target behaviour behind, we should also enable this with a new option:
-fopenmp-simd
2. The AArch64 vector ABI in the code is essentially the same as the Intel one (apart from the prefix and the masking argument), and it is based on the clauses associated with “declare simd” in OpenMP 4.0. For OpenMP 4.5, the parameters section of the mangled name should be updated.
This update will not change the vectorizer behaviour, as all the vectorizer needs in order to detect a vectorizable function is the original scalar name and a compatible vector function signature. Of course, any changes/updates to the ABI will have to be reflected in the symbols of the library's binary file.
3. Whilst this currently works only for function declarations, the same functionality can be used when (if) clang implements the declare simd OpenMP pragma for function definitions.
4. I have enabled this for any loop that calls the scalar function, not just for those annotated with “#pragma omp for simd”. I don’t have a strong preference here, but I don’t see any reason why this shouldn’t be enabled by default for non-annotated loops. Let me know if you disagree; I’d happily change the functionality if there are sound reasons for doing so.
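
For reference, the two kinds of loop discussed in note 4 look as follows. This is only a sketch: g, annotated and plain are hypothetical names, and g is defined here just so that the example compiles on its own. With the patches as posted, both loops would be candidates for using a vector variant of g.

```c
#pragma omp declare simd
double g(double x);

/* Scalar definition, included only to keep the sketch self-contained. */
double g(double x) { return 2.0 * x + 1.0; }

/* Loop explicitly annotated by the user. */
void annotated(double *x, const double *y, int n) {
#pragma omp for simd
  for (int i = 0; i < n; ++i) x[i] = g(y[i]);
}

/* Same loop without any annotation. */
void plain(double *x, const double *y, int n) {
  for (int i = 0; i < n; ++i) x[i] = g(y[i]);
}
```

Both functions compute the same result; the only difference is whether the user has asked for SIMD execution explicitly.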

