[llvm-dev] [RFC] Expose user provided vector function for auto-vectorization.

Finkel, Hal J. via llvm-dev llvm-dev at lists.llvm.org
Tue May 28 19:52:45 PDT 2019

On 5/28/19 4:11 PM, Renato Golin wrote:
> Hi Francesco,
> Nice to finally see this RFC, thanks! :)
> Overall, I like the proposal. Clean, concise and complete.
> I have a few comments inline, but from a high level this is looking good.


> On Tue, 28 May 2019 at 20:44, Francesco Petrogalli
> <Francesco.Petrogalli at arm.com> wrote:
>> The directive `#pragma clang declare variant` follows the syntax of the
>> `#pragma omp declare variant` directive of OpenMP.
>> We define the new directive in the `clang` namespace instead of using
>> the `omp` one of OpenMP to allow the compiler to perform
>> auto-vectorization outside of an OpenMP SIMD context.
> So, the only difference is that pragma "omp" includes and links OMP
> stuff, while pragma "clang" doesn't, right?

I'm assuming that the difference is simpler: We don't process OpenMP 
directives by default, but we will process these Clang pragmas by default.

> What happens if I have code with "pragma clang declare variant" and
> "pragma omp" elsewhere, would the clang pragma behave identically as
> if it was omp?

I think that this is an interesting question. My preference is that we
draw a distinction between 'system' directives (i.e., things provided by
system headers, and by headers from libraries treated like system
libraries) and user-provided directives. Then we process
directly-conflicting directives in the following order:

Lowest priority: #pragma clang declare system variant

Medium priority: #pragma omp declare variant

Highest priority: #pragma clang declare variant

My logic is this: We should have a way for users to override variants 
provided by system headers. If users write a variant using OpenMP, it 
should override a system-provided variant. The compiler-specific (Clang) 
variant that a user provides should have the highest priority (because 
it's a Clang pragma and we're Clang). As with the general OpenMP scheme, 
more-specific variants should have priority over more-general variants 
(regardless of whether they're OpenMP variants or Clang variants).
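
As a rough illustration of the ordering described above, here is a minimal C++ sketch. It is not taken from any Clang implementation: the `Origin` enum, the `Variant` struct, and the integer `specificity` score are all hypothetical stand-ins, and the score merely represents whatever ranking the OpenMP context-matching rules would produce.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical origin of a variant declaration, ordered from lowest to
// highest priority as listed in the message above.
enum class Origin { ClangSystem = 0, OpenMP = 1, ClangUser = 2 };

// Hypothetical candidate record. `specificity` is an assumed score where a
// higher value means a more specific match.
struct Variant {
  std::string name;
  Origin origin;
  int specificity;
};

// Returns true if `a` should be chosen over `b`: more-specific variants win
// regardless of origin; origin only breaks ties between equally specific ones.
bool preferred(const Variant &a, const Variant &b) {
  if (a.specificity != b.specificity)
    return a.specificity > b.specificity;
  return static_cast<int>(a.origin) > static_cast<int>(b.origin);
}

// Picks the best candidate, or returns nullptr if there are none.
const Variant *selectVariant(const std::vector<Variant> &candidates) {
  if (candidates.empty())
    return nullptr;
  auto best = std::max_element(
      candidates.begin(), candidates.end(),
      [](const Variant &a, const Variant &b) { return preferred(b, a); });
  return &*best;
}
```

With three equally specific candidates, one per origin, `selectVariant` returns the user-provided Clang variant; a more specific system variant would still win over all of them.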

>> The mechanism is based on OpenMP to provide a uniform user experience
>> across the two mechanisms, and to maximise the number of shared
>> components of the infrastructure needed in the compiler frontend to
>> enable the feature.
>> Changes in LLVM IR {#llvmIR}
>> ------------------
>> The IR is enriched with metadata that details the availability of vector
>> versions of an associated scalar function. This metadata is attached to
>> the call site of the scalar function.
> If the metadata gets dropped by some middle-end pass, the user will be
> confused why the vector function is not being called.

Why metadata and not a call-site/function attribute?

> Do we have that problem with OMP stuff already? If so, how do we fix this?

I don't think that we currently use metadata like this for OpenMP in any 
relevant sense, and so we don't currently have this problem.

>>      // ...
>>      ... = call double @sin(double) #0
>>      // ...
>>      #0 = { vector-variant = {"_ZGVcN2v_sin(__svml_sin2),
>>                                _ZGVdN4v_sin(__svml_sin4),
>>                                ..."} }
> I'm assuming in all ABIs, the arguments and return values are
> guaranteed to be the same, but vector versions. Ie. there are no
> special arguments / flags / status regs that are used / changed in the
> vector version that the compiler will have to "just know". If the
> whole point is that this is a "variant", having specialist knowledge
> for random variant ABIs won't scale.
> Is that a problem or can we class those as "user-error"?
>> The SVFS can add new function definitions, in the same module as the
>> `Call`, to provide vector functions that are not present within the
>> vector-variant metadata. For example, if a library provides a vector
>> version of a function with a vectorization factor of 2, but the
>> vectorizer is requesting a vectorization factor of 4, the SVFS is
>> allowed to create a definition that calls the 2-lane version twice. This
>> capability applies similarly for providing masked and unmasked versions
>> when the request does not match what is available in the library.
> Nice! Those thunks will play nicely with inlining later, sort of like unrolling.
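
The widening case quoted above (a 4-lane request served by a 2-lane library function) can be sketched in plain C++ as follows. This is only an illustration of the idea, not what the SVFS would emit: `lib_sin2` is a hypothetical stand-in for the library routine, `thunk_sin4` is a hypothetical name for the generated definition, and `std::array` stands in for real vector types.

```cpp
#include <array>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Vec4 = std::array<double, 4>;

// Hypothetical stand-in for a library-provided 2-lane vector sin.
Vec2 lib_sin2(Vec2 x) {
  return Vec2{std::sin(x[0]), std::sin(x[1])};
}

// Hypothetical thunk: serves a 4-lane request by calling the 2-lane
// version twice, once on each half of the input.
Vec4 thunk_sin4(Vec4 x) {
  Vec2 lo = lib_sin2(Vec2{x[0], x[1]});
  Vec2 hi = lib_sin2(Vec2{x[2], x[3]});
  return Vec4{lo[0], lo[1], hi[0], hi[1]};
}
```

Because the thunk is an ordinary definition in the same module, later inlining can fold the two calls into the caller, which is the "sort of like unrolling" effect mentioned above.
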
>> The `construct` set in the directive, together with the `device` set, is
>> used to generate the vector mangled name to be used in the
>> `vector-variant` attribute, for example `_ZGVnN2v_sin`, when targeting
>> AArch64 Advanced SIMD code generation. The rules for mangling the name of
>> the scalar function in the vector name are defined in the Vector
>> Function ABI specification of the target.
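
To make the shape of names such as `_ZGVnN2v_sin` concrete, here is a small C++17 sketch that splits one into its parts. It assumes a simplified layout read off the names quoted in this thread (`_ZGV`, one ISA letter, `N` or `M` for unmasked or masked, a decimal lane count, parameter tokens, `_`, scalar name) and does not cover the full grammar of any target's Vector Function ABI. The `VectorName` struct and `parseVectorName` function are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical decoded form of a vector function name.
struct VectorName {
  char isa;            // ISA letter, e.g. 'n' in _ZGVnN2v_sin (left uninterpreted)
  bool masked;         // true for 'M', false for 'N'
  unsigned lanes;      // vectorization factor
  std::string params;  // parameter tokens, e.g. "v"
  std::string scalar;  // scalar function name, e.g. "sin"
};

// Simplified parser; returns std::nullopt for anything it does not recognise.
std::optional<VectorName> parseVectorName(const std::string &s) {
  const std::string prefix = "_ZGV";
  if (s.compare(0, prefix.size(), prefix) != 0)
    return std::nullopt;
  std::size_t i = prefix.size();
  if (i + 2 > s.size())
    return std::nullopt;

  VectorName out{};
  out.isa = s[i++];
  if (s[i] != 'N' && s[i] != 'M')
    return std::nullopt;
  out.masked = (s[i++] == 'M');

  // Decimal lane count (scalable or other non-numeric forms are not handled).
  std::size_t digitsBegin = i;
  while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i])))
    ++i;
  if (i == digitsBegin)
    return std::nullopt;
  out.lanes = static_cast<unsigned>(
      std::stoul(s.substr(digitsBegin, i - digitsBegin)));

  // Parameter tokens run up to the next underscore; the rest is the scalar name.
  std::size_t us = s.find('_', i);
  if (us == std::string::npos || us == i || us + 1 >= s.size())
    return std::nullopt;
  out.params = s.substr(i, us - i);
  out.scalar = s.substr(us + 1);
  return out;
}
```

For `_ZGVnN2v_sin` this yields ISA letter `n`, unmasked, 2 lanes, parameter string `v`, and scalar name `sin`.
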
> And I assume the user is responsible for linking the libraries that
> export those signatures or they will have a linking error.

Either the user or the driver (based on whatever compiler flag enables
targeting the vector-math library in the first place). I'm hoping that
the driver can get this on its own in all common cases.

Thanks again,


>   Given that
> this is a user definition (in the pragma), and that we really can't know
> if the library will be available at link time, it's the only thing we
> can do.
> cheers,
> --renato

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
