[llvm-dev] RFC: Interface user provided vector functions with the vectorizer.
Saito, Hideki via llvm-dev
llvm-dev at lists.llvm.org
Mon Jun 24 10:22:26 PDT 2019
That helps with complex types, but not with other structures that can be passed by value.
Hideki
-----Original Message-----
From: David Greene [mailto:dag at cray.com]
Sent: Monday, June 24, 2019 9:40 AM
To: Francesco Petrogalli via llvm-dev <llvm-dev at lists.llvm.org>
Cc: Doerfert, Johannes <jdoerfert at anl.gov>; Francesco Petrogalli <Francesco.Petrogalli at arm.com>; nd <nd at arm.com>; Andrea Bocci <andrea.bocci at cern.ch>; Clang Dev <cfe-dev at lists.llvm.org>; Saito, Hideki <hideki.saito at intel.com>; Alexey Bataev <a.bataev at hotmail.com>
Subject: Re: [llvm-dev] RFC: Interface user provided vector functions with the vectorizer.
I have an RFC for first-class complex types in LLVM IR pending internal review. I hope to post it soon. That should help address this problem. Then the vector function signature generation could stay in LLVM, if I'm understanding the issue correctly.
-David
Francesco Petrogalli via llvm-dev <llvm-dev at lists.llvm.org> writes:
> Hi all - I am working with a colleague to provide an initial implementation of this.
>
> We encountered a problem when dealing with generating the vector signatures of functions that use complex data.
>
> In this proposal, we expect the SVFS component in the backend to
> demangle the name of the function in the attribute to be able to
> reconstruct the signature of the vector function from the scalar
> function signature.
>
> In the case of complex data, this doesn’t seem to be possible, because the
> information of “being a vector of 2 lanes” that is supposed to be
> carried by the complex scalar is lost when the data type is
> transformed into a “coerced” type.
>
> Consider these three types and the function `foo`:
>
> // Type 1
> typedef _Complex int S;
>
> // Type 2
> typedef struct x{
> int a;
> int b;
> } S;
>
> // Type 3
> typedef uint64_t S;
>
> S foo(S a, S b) {
> return ...;
> }
>
> In all cases, the IR type of the parameters in `foo` is i64, therefore
> it is not possible to distinguish which C type generated the signature of
> `foo`.
>
> I don’t know if this is going to be a problem for other architectures,
> but this is definitely a problem on AArch64 where we need to be able
> to generate the correct vector function signature for a specific
> simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector
> type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>.
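
To make the ambiguity concrete, here is a minimal C++ sketch of the mapping described above. The enum `SourceType` and the function names are hypothetical, introduced only for illustration; the type strings are the ones quoted in this message for simdlen(2), and nothing here is actual clang or LLVM code.

```cpp
#include <cassert>
#include <string>

// Hypothetical tag for the C type behind a parameter of `foo`.
enum class SourceType { ComplexInt, TwoIntStruct, UInt64 };

// After coercion, every one of the three C types is seen as i64 in the IR.
std::string coercedScalarType(SourceType) { return "i64"; }

// Vector parameter type expected on AArch64 for simdlen(2), as listed in
// the message. Only the front-end still knows which case applies.
std::string vectorParamTypeForSimdlen2(SourceType T) {
  switch (T) {
  case SourceType::ComplexInt:
    return "<4 x i32>";
  case SourceType::TwoIntStruct:
    return "<2 x i64*>";
  case SourceType::UInt64:
    return "<2 x i64>";
  }
  return "";
}

// Usage: same IR type going in, different vector types coming out.
void checkAmbiguity() {
  assert(coercedScalarType(SourceType::ComplexInt) ==
         coercedScalarType(SourceType::TwoIntStruct));
  assert(vectorParamTypeForSimdlen2(SourceType::ComplexInt) !=
         vectorParamTypeForSimdlen2(SourceType::TwoIntStruct));
}
```

`coercedScalarType` drops exactly the tag that `vectorParamTypeForSimdlen2` needs, which is why the signature would have to be produced while the C type is still known.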
>
> Therefore, I would like to propose a change to the RFC, which would
> move the responsibility of generating the vector function signature
> from LLVM to clang.
>
> In particular (and this I believe has already been mentioned by
> Johannes), we could use the @llvm.compiler.used intrinsic to mark
> those declarations that need to stay in the IR and not be optimized
> away by opt before reaching the vectorizer.
>
> In summary, the change would consist of:
>
> 1. Generate symbol declarations/definitions of the vector function
> with the mangled name in the IR, and mark them with @llvm.compiler.used.
> This could be done in CGOpenMPRuntime.cpp.
> 2. Use the attribute `vector-function-abi-variant` defined in this RFC
> to map scalar names to Vector Function ABI mangled names, and use the
> same redirection mechanism for the user-provided vector name.
> 3. Move the “vector function signature generation” from the SVFS in
> LLVM to the OpenMP code generator of the clang frontend.
>
> The SVFS query system would still work as in the current proposal. The
> only difference is that the vector function signature would be given
> by the frontend and not need to be recomputed.
>
> Here is an example of how the IR would look with this change:
>
> ```
> @llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (<2 x i32> (<2 x i32>)* @_ZGVnN2v_foo to i8*)], section "llvm.metadata"
>
> declare dso_local <2 x i32> @_ZGVnN2v_foo(<2 x i32> returned)
>
> declare i32 @foo(i32) #0
>
> ; other function definitions, including the one provided by the user
> ; (`my_vector_foo`), if the user provided a definition and not just the
> ; declaration
>
> attribute #0 =
> {vector-function-abi-variant="_ZGVnN2v_foo(my_vector_foo)"}
>
> ```
>
> If @llvm.compiler.used is not suitable for this (I am
> not aware of all the implications of using it on a global symbol), maybe we
> could come up with an intrinsic that does what we need (avoid deleting
> declarations that are not used) and name it
> @llvm.vector.function.used?
>
> Please let me know what you think, I will submit an updated proposal next week.
>
> Kind regards,
>
> Francesco
>
>> On Jun 17, 2019, at 7:05 AM, Doerfert, Johannes <jdoerfert at anl.gov> wrote:
>>
>> I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.
>>
>> Sorry for the delay and thanks for working on this.
>>
>> From: Simon Moll <moll at cs.uni-saarland.de>
>> Sent: Monday, June 17, 2019 10:02:58 AM
>> To: Francesco Petrogalli; LLVM Development List; Clang Dev
>> Cc: Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei;
>> Alexey Bataev; Doerfert, Johannes; Saito, Hideki; Tian, Xinmin; nd;
>> Roman Lebedev; Philip Reames; Shawn Landden
>> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
>>
>> Hi Francesco,
>>
>> On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
>> > Dear all,
>> >
>> > I have re-written the proposal for interfacing user provided vector
>> > functions, originally posted in both llvm-dev and cfe-dev mailing
>> > list:
>> >
>> > "[RFC] Expose user provided vector function for auto-vectorization."
>> >
>> > The proposal looks quite different from the original submission,
>> > therefore I took the liberty to start a new thread.
>> >
>> > The original thread generated some good discussion. In particular,
>> > Simon Moll and Johannes Doerfert (CCed) have managed to provide
>> > good arguments for the following claims:
>> >
>> > 1. The Vector Function ABI name mangling scheme of a target is not
>> > enough to describe all use cases of function vectorization that
>> > the compiler might end up needing to support in the future.
>> I think the new name of the attribute makes this point clear.
>> > 2. `declare variant` needs to be handled properly at IR level, to be
>> > able to give the compiler the full OpenMP context of the directive.
>> >
>> > This proposal addresses those two concerns and other (I believe)
>> > minor concerns that have been raised in the previous thread.
>> >
>> > This proposal is provided with examples and a self assessment
>> > around extendibility.
>> >
>> > I have CCed all the people that have participated in the discussion
>> > so far, please let me know if you think I have missed anything of
>> > what has been raised.
>> >
>> > Kind regards,
>> >
>> > Francesco
>>
>> LGTM. Please add me as a reviewer for this when you post patches.
>>
>> Thanks!
>>
>> Simon
>>
>> >
>> > *** DRAFT OF THE PROPOSAL ***
>> >
>> > # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
>> >
>> > Because the users care about portability (across compilers,
>> > libraries and systems), I believe we have to base our solution on
>> > a standard that describes the mapping from the scalar function to
>> > the vector function.
>> >
>> > Because OpenMP is standard and widely used, we should base our
>> > solution on the mechanisms that the standard provides, via the
>> > directives `declare simd` and `declare variant`, the latter when
>> > used with the `simd` trait in the `construct` set.
>> >
>> > Please notice that:
>> >
>> > 1. The scope of the proposal is not implementing full support for
>> > `pragma omp declare variant`.
>> > 2. The scope of the proposal is not enabling the vectorizer to do new
>> > kinds of vectorization (e.g. RV-like vectorization described by
>> > Simon).
>> > 3. The proposal aims to be extendible wrt 1. and 2.
>> > 4. The IR attribute introduced in this proposal is equivalent to the
>> > one needed for the VecClone pass under development in
>> > https://reviews.llvm.org/D22792
>> >
>> > # CLANG COMPONENTS
>> >
>> > A C function attribute, `clang_declare_simd_variant`, to attach to
>> > the scalar version. The attribute provides enough information to
>> > the compiler about the vector shape of the user defined function.
>> > The vector shapes handled by the attribute are those handled by the
>> > OpenMP standard via `declare simd` (and no more than that).
>> >
>> > 1. The function attribute handling in clang is crafted with the
>> > requirement that it will be possible to re-use the same components
>> > for the info generated by `declare variant` when used with a `simd`
>> > trait in the `construct` set.
>> > 2. The attribute allows orthogonality with the vectorization that is
>> > done via OpenMP: the user vector function is still exposed for
>> > vectorization when not using `-fopenmp-[simd]` once the `declare
>> > simd` and `declare variant` directives of OpenMP are available
>> > in the front-end.
>> >
>> > ## C function attribute: `clang_declare_simd_variant`
>> >
>> > The definition of this attribute has been crafted to match the
>> > semantics of `declare variant` for a `simd` construct described in
>> > OpenMP 5.0. I have added only the traits of the `device` set, `isa`
>> > and `arch`, which I believe are enough to cover for the use case of
>> > this proposal. If that is not the case, please provide an example;
>> > extending the attribute will be easy even once the current one is
>> > implemented.
>> >
>> > ```
>> > clang_declare_simd_variant(<variant-func-id>, <simd clauses>{,
>> > <context selector clauses>})
>> >
>> > <variant-func-id>:= The name of a function variant that is a base language identifier, or,
>> > for C++, a template-id.
>> >
>> > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
>> >
>> > <simdlen> := simdlen(<positive number>) | simdlen("scalable")
>> >
>> > <mask> := inbranch | notinbranch
>> >
>> > <optional simd clauses> := <linear clause>
>> > | <uniform clause>
>> > | <align clause> | {,<optional simd clauses>}
>> >
>> > <linear clause> := linear_ref(<var>,<step>)
>> > | linear_var(<var>, <step>)
>> > | linear_uval(<var>, <step>)
>> > | linear(<var>, <step>)
>> >
>> > <step> := <var> | <non zero number>
>> >
>> > <uniform clause> := uniform(<var>)
>> >
>> > <align clause> := align(<var>, <positive number>)
>> >
>> > <var> := Name of a parameter in the scalar function
>> > declaration/definition
>> >
>> > <non zero number> := ... | -2 | -1 | 1 | 2 | ...
>> >
>> > <positive number> := 1 | 2 | 3 | ...
>> >
>> > <context selector clauses> := {<isa>}{,} {<arch>}
>> >
>> > <isa> := isa(target-specific-value)
>> >
>> > <arch> := arch(target-specific-value)
>> >
>> > ```
>> >
>> > # LLVM COMPONENTS:
>> >
>> > ## VectorFunctionShape class
>> >
>> > The object `VectorFunctionShape` contains the information about the
>> > kind of vectorization available for an `llvm::CallInst`.
>> >
>> > The object `VectorFunctionShape` must contain the following information:
>> >
>> > 1. Vectorization Factor (or number of concurrent lanes executed by the
>> > SIMD version of the function). Encoded by unsigned integer.
>> > 2. Whether the vector function is requested for scalable
>> > vectorization, encoded by a boolean.
>> > 3. Information about masking / no masking, encoded by a boolean.
>> > 4. Information about the parameters, encoded in a container that
>> > carries objects of type `ParameterType`, to describe features like
>> > `linear` and `uniform`.
>> > 5. Function name redirection, if a user has specified to use a custom
>> > name instead of the Vector Function ABI ones.
>> >
>> > Items 1. to 5. represent the information stored in the
>> > `vector-function-abi-variant` attribute (see next section).
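
For readers who prefer code to lists, the five items above could be laid out roughly as follows. This is only a sketch: the field names, the `ParamKind` enum and the helper `shapeForExample1` are hypothetical, and the RFC does not fix the actual layout of the class.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical parameter kinds. The RFC names `linear` and `uniform` as
// examples of what a parameter descriptor has to capture.
enum class ParamKind { Vector, Linear, Uniform };

struct ParameterType {
  ParamKind Kind = ParamKind::Vector;
  int LinearStep = 0; // Only meaningful when Kind == ParamKind::Linear.
};

struct VectorFunctionShape {
  unsigned VF = 0;                       // 1. Vectorization factor.
  bool IsScalable = false;               // 2. Scalable vectorization requested.
  bool IsMasked = false;                 // 3. Masking / no masking.
  std::vector<ParameterType> Parameters; // 4. One descriptor per parameter.
  std::string VectorName;                // 5. Redirection; empty if none.
};

// Shape matching Example 1 of the proposal: simdlen(2), notinbranch,
// one vector parameter, redirected to `vector_foo_01`.
VectorFunctionShape shapeForExample1() {
  VectorFunctionShape S;
  S.VF = 2;
  S.IsMasked = false;
  S.Parameters.push_back(ParameterType{});
  S.VectorName = "vector_foo_01";
  assert(S.Parameters.front().Kind == ParamKind::Vector);
  return S;
}
```

Keeping the redirection as a plain name (item 5) means the shape stays independent of whether the target symbol follows the Vector Function ABI mangling.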
>> >
>> > The object can be extended in the future to include new
>> > vectorization kinds (for example the RV-like vectorization of the
>> > Region Vectorizer), or to add more context information that might
>> > come from other uses of OpenMP `declare variant`, or to add new
>> > Vector Function ABIs not based on OpenMP. Such information can be
>> > retrieved by attributes that will be added to describe the `Call` instance.
>> >
>> > ## IR Attribute
>> >
>> > We define a `vector-function-abi-variant` attribute that lists the
>> > mangled names produced via the mangling function of the Vector
>> > Function ABI rules.
>> >
>> > ```
>> > vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
>> > ```
>> >
>> > 1. Because we use only OpenMP `declare simd` vectorization, and
>> > because we require a Vector Function ABI, we make this explicit
>> > in the name of the attribute.
>> > 2. Because the Vector Function ABIs encode all the information
>> > needed to know the vectorization shape of the vector function in
>> > the mangled names, we provide the mangled name via the
>> > attribute.
>> > 3. Function name redirection is specified by enclosing the name of
>> > the redirection in parentheses, as in
>> > `abi_mangled_name_02(user_redirection)`.
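
The attribute format above is simple enough to split by hand. The following is a minimal sketch in plain C++ of how a consumer might separate the entries and their optional redirections; `parseVariantList` and `VariantEntry` are hypothetical names, and a real implementation inside LLVM would presumably use `StringRef` rather than `std::string`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// One entry of the attribute: the ABI-mangled name plus the optional
// user redirection (empty when the entry has no parentheses).
using VariantEntry = std::pair<std::string, std::string>;

// Removes leading and trailing spaces.
static std::string trim(const std::string &S) {
  const auto First = S.find_first_not_of(' ');
  if (First == std::string::npos)
    return "";
  const auto Last = S.find_last_not_of(' ');
  return S.substr(First, Last - First + 1);
}

// Splits the attribute value on commas, then splits each item into
// "mangled_name" and "redirection" at the opening parenthesis.
std::vector<VariantEntry> parseVariantList(const std::string &Attr) {
  std::vector<VariantEntry> Entries;
  std::string::size_type Pos = 0;
  while (Pos <= Attr.size()) {
    auto Comma = Attr.find(',', Pos);
    if (Comma == std::string::npos)
      Comma = Attr.size();
    const std::string Item = trim(Attr.substr(Pos, Comma - Pos));
    Pos = Comma + 1;
    if (Item.empty())
      continue; // Skip empty items such as a trailing comma.
    const auto Open = Item.find('(');
    if (Open != std::string::npos && Item.back() == ')')
      Entries.emplace_back(Item.substr(0, Open),
                           Item.substr(Open + 1, Item.size() - Open - 2));
    else
      Entries.emplace_back(Item, "");
  }
  return Entries;
}

// Usage with the generic form given in the text.
void checkGenericForm() {
  const auto E =
      parseVariantList("abi_mangled_name_01, abi_mangled_name_02(user_redirection)");
  assert(E.size() == 2);
  assert(E[1].second == "user_redirection");
}
```

The sketch does no validation of the mangled names themselves; that would be the job of the demangler.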
>> >
>> > ## Vector ABI Demangler
>> >
>> > The “Vector ABI demangler” is the component that demangles the
>> > data in the `vector-function-abi-variant` attribute and that
>> > provides the instances of the class `VectorFunctionShape` that can
>> > be derived from the mangled names listed in the attribute.
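
A rough idea of what the demangling step involves, as a simplified C++ sketch. It assumes the general `_ZGV<isa><mask><len><parameters>_<scalar name>` layout seen in the examples of this proposal; the use of `x` as the length token for scalable vectors is an assumption taken from the AArch64 Vector Function ABI, and the parameter tokens are kept as a raw string instead of being decoded. `DemangledName` and `demangle` are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <string>

struct DemangledName {
  char ISA = 0;            // Target-specific ISA letter, e.g. 'n' or 's'.
  bool IsMasked = false;   // 'M' is masked, 'N' is unmasked.
  bool IsScalable = false; // Assumption: length token 'x' means scalable.
  unsigned VF = 0;         // Number of lanes; 0 when scalable.
  std::string ParamTokens; // Raw parameter tokens, e.g. "v" or "vl8".
  std::string ScalarName;  // Name of the scalar function.
};

// Returns true and fills Out if Mangled follows the assumed layout.
bool demangle(const std::string &Mangled, DemangledName &Out) {
  const std::string Prefix = "_ZGV";
  if (Mangled.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  std::string::size_type I = Prefix.size();
  if (I + 2 > Mangled.size())
    return false;
  Out.ISA = Mangled[I++];
  const char Mask = Mangled[I++];
  if (Mask != 'M' && Mask != 'N')
    return false;
  Out.IsMasked = (Mask == 'M');
  Out.IsScalable = false;
  Out.VF = 0;
  if (I < Mangled.size() && Mangled[I] == 'x') {
    Out.IsScalable = true;
    ++I;
  } else {
    const auto Start = I;
    while (I < Mangled.size() &&
           std::isdigit(static_cast<unsigned char>(Mangled[I])))
      Out.VF = Out.VF * 10 + static_cast<unsigned>(Mangled[I++] - '0');
    if (I == Start)
      return false; // A fixed-length name needs at least one digit.
  }
  // Parameter tokens run up to the first underscore; the rest is the name.
  const auto Sep = Mangled.find('_', I);
  if (Sep == std::string::npos || Sep == I || Sep + 1 >= Mangled.size())
    return false;
  Out.ParamTokens = Mangled.substr(I, Sep - I);
  Out.ScalarName = Mangled.substr(Sep + 1);
  return true;
}

// Usage with the mangled name of Example 1.
void checkExample1() {
  DemangledName D;
  assert(demangle("_ZGVnN2v_foo_01", D));
  assert(D.VF == 2 && !D.IsMasked);
}
```

Turning `ParamTokens` into a list of parameter descriptors is the part that depends most on the target ABI, so it is left out here.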
>> >
>> > ## Query interface: Search Vector Function System (SVFS)
>> >
>> > An interface that can be queried by the LLVM components to
>> > understand whether or not a scalar function can be vectorized, and
>> > that retrieves the vector function to be used if such vector shape is available.
>> >
>> > 1. This component is going to be unrelated to OpenMP.
>> > 2. This component will use internally the demangler defined in the
>> > previous section, but it will not expose any aspect of the Vector
>> > Function ABI via its interface.
>> >
>> > The interface provides two methods.
>> >
>> > ```
>> > std::vector<VectorFunctionShape>
>> > SVFS::isFunctionVectorizable(llvm::CallInst * Call);
>> >
>> > llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call,
>> > VectorFunctionShape Info);
>> > ```
>> >
>> > The first method is used to list all the vector shapes that are
>> > available and attached to a scalar function. An empty result means
>> > that no vector versions are available.
>> >
>> > The second method retrieves the information needed to build a call
>> > to a vector function with a specific `VectorFunctionShape` info.
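
To illustrate how a client such as the vectorizer might use the two methods, here is a toy stand-in written without any LLVM dependency. Everything in it is an assumption made for illustration: `ToySVFS`, `Shape` and `registerVariant` are hypothetical, the call site is identified by the callee name instead of an `llvm::CallInst *`, and the result is a function name instead of an `llvm::Function *`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Reduced stand-in for VectorFunctionShape: only VF and masking.
struct Shape {
  unsigned VF;
  bool IsMasked;
  bool operator==(const Shape &O) const {
    return VF == O.VF && IsMasked == O.IsMasked;
  }
};

struct Variant {
  Shape S;
  std::string VectorName;
};

class ToySVFS {
  // Scalar function name -> available vector variants.
  std::map<std::string, std::vector<Variant>> Table;

public:
  void registerVariant(const std::string &Scalar, Shape S,
                       const std::string &VectorName) {
    Table[Scalar].push_back({S, VectorName});
  }

  // Counterpart of isFunctionVectorizable: lists the available shapes.
  // An empty result means no vector version is available.
  std::vector<Shape> isFunctionVectorizable(const std::string &Scalar) const {
    std::vector<Shape> Shapes;
    const auto It = Table.find(Scalar);
    if (It != Table.end())
      for (const Variant &V : It->second)
        Shapes.push_back(V.S);
    return Shapes;
  }

  // Counterpart of getVectorizedFunction: returns the vector function
  // name for the requested shape, or an empty string if none matches.
  std::string getVectorizedFunction(const std::string &Scalar,
                                    Shape Wanted) const {
    const auto It = Table.find(Scalar);
    if (It != Table.end())
      for (const Variant &V : It->second)
        if (V.S == Wanted)
          return V.VectorName;
    return "";
  }
};

// Usage: query first, then fetch the variant for the chosen shape.
void checkQueryFlow() {
  ToySVFS SVFS;
  SVFS.registerVariant("foo_01", Shape{2, false}, "vector_foo_01");
  const auto Shapes = SVFS.isFunctionVectorizable("foo_01");
  assert(Shapes.size() == 1);
  assert(SVFS.getVectorizedFunction("foo_01", Shapes[0]) == "vector_foo_01");
}
```

Note that nothing in this interface mentions mangled names, in line with point 2 above.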
>> >
>> > # (SELF) ASSESSMENT ON EXTENDIBILITY
>> >
>> >
>> > 1. Extending the C function attribute `clang_declare_simd_variant` to
>> > new Vector Function ABIs that use OpenMP will be straightforward
>> > because the attribute is tied to such ABIs and OpenMP.
>> > 2. The C attribute `clang_declare_simd_variant` and the `declare
>> > variant` directive used for the `simd` trait will be sharing the
>> > internals in clang, so adding the OpenMP functionality for `simd`
>> > traits will be mostly handling the directive in the OpenMP
>> > parser. How this should be done is described in
>> > https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attribute
>> > 3. The IR attribute `vector-function-abi-variant` is not to be
>> > extended to represent kinds of vectorization other than those
>> > handled by `declare simd` and that are handled with a Vector
>> > Function ABI.
>> > 4. The IR attribute `vector-function-abi-variant` is not defined to be
>> > extended to represent the information of `declare variant` in its
>> > totality.
>> > 5. The IR attribute will not need to change when we introduce non
>> > vector function ABI vectorization (RV-like, reductions...) or when
>> > we decide to fully support `declare variant`. The information
>> > it carries will not need to be invalidated, but just extended with
>> > new attributes that will need to be handled by the
>> > `VectorFunctionShape` class, in a similar way the
>> > `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
>> > operates on individual attributes to describe an overall
>> > functionality.
>> >
>> > # Examples
>> >
>> > ## Example 1
>> >
>> > Exposing an Advanced SIMD vector function when targeting Advanced
>> > SIMD in AArch64.
>> >
>> > ```
>> > double foo_01(double Input)
>> > __attribute__((clang_declare_simd_variant("vector_foo_01",
>> > simdlen(2), notinbranch, isa("simd"))));
>> >
>> > // Advanced SIMD version
>> > float64x2_t vector_foo_01(float64x2_t VectorInput);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 =
>> > {vector-function-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
>> > ```
>> >
>> > ## Example 2
>> >
>> > Exposing an Advanced SIMD vector function when targeting Advanced
>> > SIMD in AArch64, but with the wrong signature. The user specifies a
>> > masked version of the function in the clauses of the attribute, the
>> > compiler throws an error suggesting the signature expected for
>> > `vector_foo_02`.
>> >
>> > ```
>> > double foo_02(double Input)
>> > __attribute__((clang_declare_simd_variant("vector_foo_02",
>> > simdlen(2), inbranch, isa("simd"))));
>> >
>> > // Advanced SIMD version
>> > float64x2_t vector_foo_02(float64x2_t VectorInput);
>> > // (suggested) compiler error -> ^ Missing mask parameter of type `uint64x2_t`.
>> > ```
>> >
>> > ## Example 3
>> >
>> > Targeting `sincos`-like signatures.
>> >
>> > ```
>> > void foo_03(double Input, double * Output)
>> > __attribute__((clang_declare_simd_variant("vector_foo_03", simdlen(2),
>> > notinbranch, linear(Output, 1), isa("simd"))));
>> >
>> > // Advanced SIMD version
>> > void vector_foo_03(float64x2_t VectorInput, double * Output);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 =
>> > {vector-function-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
>> > ```
>> >
>> > ## Example 4
>> >
>> > Scalable vectorization targeting SVE
>> >
>> > ```
>> > double foo_04(double Input)
>> > __attribute__((clang_declare_simd_variant("vector_foo_04",
>> > simdlen("scalable"), notinbranch, isa("sve"))));
>> >
>> > // SVE version
>> > svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 =
>> > {vector-function-abi-variant="_ZGVsMxv_foo_04(vector_foo_04)"}
>> > ```
>> >
>> > ## Example 5
>> >
>> > Fixed length vectorization targeting SVE
>> >
>> > ```
>> > double foo_05(double Input)
>> > __attribute__((clang_declare_simd_variant("vector_foo_05",
>> > simdlen(4), inbranch, isa("sve"))));
>> >
>> > // Fixed-length SVE version
>> > svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 =
>> > {vector-function-abi-variant="_ZGVsM4v_foo_05(vector_foo_05)"}
>> > ```
>> >
>> > ## Example 6
>> >
>> > This is an x86 example, equivalent to the one provided by Andrei
>> > Elovikov in
>> > http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html.
>> > Godbolt rendering with ICC at https://godbolt.org/z/Of1NxZ
>> >
>> > ```
>> > float MyAdd(float* a, int b)
>> > __attribute__((clang_declare_simd_variant("MyAddVec", simdlen(8),
>> > notinbranch, arch("core_2nd_gen_avx"))))
>> > {
>> > return *a + b;
>> > }
>> >
>> > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 = {vector-function-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
>> > ```
>> >
>> > ## Example showing interaction with `declare simd`
>> >
>> > ```
>> > #pragma omp declare simd linear(a) notinbranch
>> > float foo_06(float *a, int x)
>> > __attribute__((clang_declare_simd_variant("vector_foo_06", simdlen(4),
>> > linear(a), notinbranch, arch("armv8.2-a+simd"))))
>> > {
>> > return *a + x;
>> > }
>> >
>> > // Advanced SIMD version
>> > float32x4_t vector_foo_06(float *a, int32x4_t vx) {
>> > // Custom implementation.
>> > }
>> > ```
>> >
>> > The resulting IR attribute is made of three symbols:
>> >
>> > 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
>> > ones the compiler builds by auto-vectorizing `foo_06` according to
>> > the rule defined in the Vector Function ABI specifications for
>> > AArch64.
>> > 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
>> > user-defined redirection of the 4-lane version of `foo_06` to the
>> > custom implementation provided by the user when targeting Advanced
>> > SIMD for version 8.2 of the A64 instruction set.
>> >
>> > ```
>> > attribute #0 =
>> > {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_ZGVnN4l4v_foo_06(vector_foo_06)"}
>> > ```
>> >
>> --
>>
>> Simon Moll
>> Researcher / PhD Student
>>
>> Compiler Design Lab (Prof. Hack)
>> Saarland University, Computer Science Building E1.3, Room 4.31
>>
>> Tel. +49 (0)681 302-57521 : moll at cs.uni-saarland.de Fax. +49 (0)681
>> 302-3065 : http://compilers.cs.uni-saarland.de/people/moll
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev