RFC: Enable vectorization of call instructions in the loop vectorizer
hfinkel at anl.gov
Mon Dec 16 12:59:40 PST 2013
----- Original Message -----
> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> To: "James Molloy" <James.Molloy at arm.com>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>
> Sent: Monday, December 16, 2013 12:03:02 PM
> Subject: Re: RFC: Enable vectorization of call instructions in the loop vectorizer
> On Dec 16, 2013, at 11:08 AM, James Molloy <James.Molloy at arm.com> wrote:
> > Hi Renato, Nadav,
> > Attached is a proof of concept patch for adding the ability to
> > vectorize calls. The intended use case for this is in domain
> > specific languages such as OpenCL where tuned implementation of
> > functions for differing vector widths exist and can be guaranteed
> > to be semantically the same as the scalar version.
> > I’ve considered two approaches to this. The first was to create a
> > set of hooks that allow the LoopVectorizer to interrogate its
> > client as to whether calls are vectorizable and if so, how. Renato
> > argued that this was suboptimal as it required a client to invoke
> > the LoopVectorizer manually and couldn’t be tested through opt. I
> > agree.
> I don’t understand this argument.
> We could extend target library info with additional API calls to
> query whether a function is vectorizable at a vector factor.
> This can be tested by providing the target triple string (e.g. "target
> triple = x86_64-gnu-linux-with_opencl_vector_lib") in the .ll file
> that informs the optimizer that a set of vector library calls is
> available.
> The patch seems to restrict legal vector widths dependent on
> available vectorizable function calls. I don’t think this should
> work like this.
> I believe there should be an API on TargetTransformInfo for library
> function calls. The vectorizer chooses the cheaper of an
> intrinsic call or a library function call.
> The overall cost model determines which VF will be chosen.
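To make the proposal quoted above concrete, here is a minimal, self-contained sketch of what such a query might look like. Everything in it is an assumption for illustration: VectorCallInfo, getCallCost, selectVF and the cost tables are hypothetical names and are not part of TargetLibraryInfo or TargetTransformInfo. The call cost at a given VF is taken to be the cheapest of an intrinsic, a vector library call, or VF scalarized calls, and the VF with the lowest per-lane cost wins.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call-cost query. This is NOT the real
// TargetLibraryInfo or TargetTransformInfo interface.
struct VectorCallInfo {
  typedef std::pair<std::string, unsigned> Key; // (scalar callee, VF)
  std::map<Key, unsigned> IntrinsicCost;        // cost if an intrinsic exists
  std::map<Key, unsigned> LibCallCost;          // cost if a vector lib call exists

  // Returns true and sets Cost if Table has an entry for (Fn, VF).
  static bool lookup(const std::map<Key, unsigned> &Table,
                     const std::string &Fn, unsigned VF, unsigned &Cost) {
    std::map<Key, unsigned>::const_iterator It = Table.find(Key(Fn, VF));
    if (It == Table.end())
      return false;
    Cost = It->second;
    return true;
  }

  // Cheapest way to perform the call at VF: intrinsic, vector library
  // call, or scalarization (VF copies of the scalar call).
  unsigned getCallCost(const std::string &Fn, unsigned VF,
                       unsigned ScalarCost) const {
    unsigned Best = ScalarCost * VF;
    unsigned Cost = 0;
    if (lookup(IntrinsicCost, Fn, VF, Cost))
      Best = std::min(Best, Cost);
    if (lookup(LibCallCost, Fn, VF, Cost))
      Best = std::min(Best, Cost);
    return Best;
  }
};

// Picks the VF with the lowest per-lane call cost; returns 1 (scalar)
// when no candidate beats the scalar call. Candidates must be >= 1.
unsigned selectVF(const VectorCallInfo &Info, const std::string &Fn,
                  const std::vector<unsigned> &CandidateVFs,
                  unsigned ScalarCost) {
  unsigned BestVF = 1;
  double BestPerLane = ScalarCost;
  for (unsigned VF : CandidateVFs) {
    double PerLane = double(Info.getCallCost(Fn, VF, ScalarCost)) / VF;
    if (PerLane < BestPerLane) {
      BestPerLane = PerLane;
      BestVF = VF;
    }
  }
  return BestVF;
}
```

Note that in this sketch a missing vector variant does not make a VF illegal; it only makes that VF more expensive, because the call falls back to scalarization.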
We don't have a good model currently for non-intrinsic function calls. Once we do, we'll want to know how expensive the vectorized versions are compared to the scalar version. Short of that, I think that a reasonable approximation is that any function calls will be the most expensive things in a loop, and the ability to vectorize them will be the most important factor in determining the vectorization factor.
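As a rough illustration of that approximation, the sketch below picks the largest candidate VF for which every call in the loop has a vector variant, and otherwise stays scalar. The names (VectorVariantSet, allCallsVectorizable, pickVFByCalls) and the table of variants are hypothetical; nothing here is existing LLVM API.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical table of available vector variants, keyed by
// (scalar callee name, vector factor). Not an LLVM data structure.
typedef std::set<std::pair<std::string, unsigned> > VectorVariantSet;

// True if every callee in the loop has a vector variant at this VF.
bool allCallsVectorizable(const VectorVariantSet &Variants,
                          const std::vector<std::string> &Callees,
                          unsigned VF) {
  for (const std::string &Fn : Callees)
    if (Variants.count(std::make_pair(Fn, VF)) == 0)
      return false;
  return true;
}

// Treats calls as the dominant cost: returns the largest candidate VF
// at which all calls can be vectorized, or 1 (scalar) if there is none.
unsigned pickVFByCalls(const VectorVariantSet &Variants,
                       const std::vector<std::string> &Callees,
                       const std::vector<unsigned> &CandidateVFs) {
  unsigned Best = 1;
  for (unsigned VF : CandidateVFs)
    if (VF > Best && allCallsVectorizable(Variants, Callees, VF))
      Best = VF;
  return Best;
}
```

Once a real cost model for non-intrinsic calls exists, the "largest VF" rule would be replaced by a comparison of the vectorized call costs against the scalar version.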
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory