RFC: Enable vectorization of call instructions in the loop vectorizer

Arnold Schwaighofer aschwaighofer at apple.com
Mon Dec 16 13:08:13 PST 2013


On Dec 16, 2013, at 2:59 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
>> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
>> To: "James Molloy" <James.Molloy at arm.com>
>> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>
>> Sent: Monday, December 16, 2013 12:03:02 PM
>> Subject: Re: RFC: Enable vectorization of call instructions in the loop vectorizer
>> 
>> 
>> On Dec 16, 2013, at 11:08 AM, James Molloy <James.Molloy at arm.com>
>> wrote:
>> 
>>> Hi Renato, Nadav,
>>> 
>>> Attached is a proof of concept[1] patch for adding the ability to
>>> vectorize calls. The intended use case for this is in domain
>>> specific languages such as OpenCL where tuned implementation of
>>> functions for differing vector widths exist and can be guaranteed
>>> to be semantically the same as the scalar version.
>>> 
>>> I’ve considered two approaches to this. The first was to create a
>>> set of hooks that allow the LoopVectorizer to interrogate its
>>> client as to whether calls are vectorizable and if so, how. Renato
>>> argued that this was suboptimal as it required a client to invoke
>>> the LoopVectorizer manually and couldn’t be tested through opt. I
>>> agree.
>> 
>> I don’t understand this argument.
>> 
>> We could extend TargetLibraryInfo with additional API calls to query
>> whether a function is vectorizable at a given vectorization factor.
>> This can be tested by providing a target triple (e.g. "target triple =
>> x86_64-gnu-linux-with_opencl_vector_lib") in the .ll file that informs
>> the optimizer that a set of vector library calls is available.
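>> 
>> As a rough standalone sketch (plain C++, not the actual LLVM
>> interface; the isFunctionVectorizable name and the "expf" entry below
>> are made up for illustration), the query would look roughly like:
>> 
>>   #include <map>
>>   #include <set>
>>   #include <string>
>> 
>>   // Table mapping a scalar function name to the vector widths for
>>   // which a library version exists, populated from the target
>>   // triple / available vector library.
>>   struct VectorLibraryInfo {
>>     std::map<std::string, std::set<unsigned>> WidthsFor;
>> 
>>     // Would a call to F still be legal if the loop is vectorized by VF?
>>     bool isFunctionVectorizable(const std::string &F, unsigned VF) const {
>>       auto It = WidthsFor.find(F);
>>       return It != WidthsFor.end() && It->second.count(VF);
>>     }
>>   };
>> 
>>   int main() {
>>     VectorLibraryInfo Info;
>>     Info.WidthsFor["expf"] = {4, 8}; // 4- and 8-wide versions exist
>>     bool OK4 = Info.isFunctionVectorizable("expf", 4);   // true
>>     bool OK16 = Info.isFunctionVectorizable("expf", 16); // false
>>     return (OK4 && !OK16) ? 0 : 1;
>>   }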
>> 
>> The patch seems to restrict the legal vector widths depending on which
>> vectorizable function calls are available. I don't think it should
>> work like this.
>> I believe there should be an API on TargetTransformInfo for library
>> function calls. The vectorizer chooses the cheaper of an intrinsic
>> call and a library function call, and the overall cost model then
>> determines which VF is chosen.
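>> 
>> As a standalone illustration (invented numbers, not LLVM's actual
>> TargetTransformInfo), the per-call choice and the per-VF comparison
>> could look like this:
>> 
>>   #include <algorithm>
>>   #include <cstdio>
>> 
>>   // Made-up costs for lowering one call either as a vector intrinsic
>>   // or as a vector library call; index 0 is scalar, 1 is VF 2, 2 is
>>   // VF 4.  A huge value stands for "no vector version available".
>>   static const unsigned Huge = 1000;
>>   static const unsigned IntrinsicCost[] = {10, Huge, Huge};
>>   static const unsigned LibCallCost[]   = {10, 12, 14};
>> 
>>   // The vectorizer takes the cheaper lowering of the call at each VF.
>>   static unsigned getCallCost(unsigned Idx) {
>>     return std::min(IntrinsicCost[Idx], LibCallCost[Idx]);
>>   }
>> 
>>   int main() {
>>     // The overall cost model compares cost per lane and picks the
>>     // cheapest VF; here the 4-wide library call wins.
>>     for (unsigned Idx = 0, Lanes = 1; Idx < 3; ++Idx, Lanes *= 2)
>>       std::printf("VF %u: %u per lane\n", Lanes, getCallCost(Idx) / Lanes);
>>     return 0;
>>   }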
> 
> We don't have a good model currently for non-intrinsic function calls. Once we do, we'll want to know how expensive the vectorized versions are compared to the scalar version. Short of that, I think that a reasonable approximation is that any function calls will be the most expensive things in a loop, and the ability to vectorize them will be the most important factor in determining the vectorization factor.

Yes, and we can model this easily in the cost model: ask for the cost of a (library) function call (vectorized or not), and have that query return a reasonably high value.
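
As a minimal standalone sketch of that idea (the constants are invented
for illustration; this is not actual LLVM code): give any call a high
baseline cost, so whether a vector version of the call exists ends up
dominating the per-VF comparison.

  #include <cstdio>

  // Treat a function call as the most expensive thing in the loop body.
  // If a vector library version exists at VF, the call costs about the
  // same as the scalar call; otherwise it has to be scalarized VF times.
  static const unsigned CallCost = 100; // "reasonably high" baseline
  static const unsigned OtherOps = 5;   // rest of the loop body, per lane

  static unsigned loopBodyCost(unsigned VF, bool HasVectorVersion) {
    unsigned Call = HasVectorVersion ? CallCost : CallCost * VF;
    return Call + OtherOps * VF;
  }

  int main() {
    // Cost per lane at VF 4: with a vector library call the expensive
    // call amortizes across lanes; without one, vectorization buys nothing.
    std::printf("scalar:            %u\n", loopBodyCost(1, true));
    std::printf("VF 4, vector lib:  %u per lane\n", loopBodyCost(4, true) / 4);
    std::printf("VF 4, scalarized:  %u per lane\n", loopBodyCost(4, false) / 4);
    return 0;
  }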


> 
> -Hal
> 
>> 
>> Thanks,
>> Arnold
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> 
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory




