RFC: Enable vectorization of call instructions in the loop vectorizer

Mon Dec 16 13:09:45 PST 2013

----- Original Message -----
> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>, "James Molloy" <James.Molloy at arm.com>
> Sent: Monday, December 16, 2013 3:08:13 PM
> Subject: Re: RFC: Enable vectorization of call instructions in the loop vectorizer
> 
> 
> On Dec 16, 2013, at 2:59 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > ----- Original Message -----
> >> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> >> To: "James Molloy" <James.Molloy at arm.com>
> >> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>
> >> Sent: Monday, December 16, 2013 12:03:02 PM
> >> Subject: Re: RFC: Enable vectorization of call instructions in the loop vectorizer
> >> 
> >> 
> >> On Dec 16, 2013, at 11:08 AM, James Molloy <James.Molloy at arm.com>
> >> wrote:
> >> 
> >>> Hi Renato, Nadav,
> >>> 
> >>> Attached is a proof of concept[1] patch for adding the ability to
> >>> vectorize calls. The intended use case for this is in domain
> >>> specific languages such as OpenCL where tuned implementations of
> >>> functions for differing vector widths exist and can be guaranteed
> >>> to be semantically the same as the scalar version.
> >>> 
> >>> I’ve considered two approaches to this. The first was to create a
> >>> set of hooks that allow the LoopVectorizer to interrogate its
> >>> client as to whether calls are vectorizable and if so, how. Renato
> >>> argued that this was suboptimal as it required a client to invoke
> >>> the LoopVectorizer manually and couldn’t be tested through opt. I
> >>> agree.
> >> 
> >> I don’t understand this argument.
> >> 
> >> We could extend TargetLibraryInfo with additional API calls to
> >> query whether a function is vectorizable at a given vectorization
> >> factor. This can be tested by providing a target triple string (e.g.
> >> target triple = "x86_64-gnu-linux-with_opencl_vector_lib") in the
> >> .ll file that informs the optimizer that a set of vector library
> >> calls is available.
> >> 
> >> The patch seems to restrict the legal vector widths depending on
> >> which vectorizable function calls are available. I don’t think it
> >> should work like this.
> >> I believe there should be an API on TargetTransformInfo for library
> >> function calls. The vectorizer chooses the cheaper of an intrinsic
> >> call and a library function call.
> >> The overall cost model determines which VF will be chosen.
> > 
> > We don't have a good model currently for non-intrinsic function
> > calls. Once we do, we'll want to know how expensive the vectorized
> > versions are compared to the scalar version. Short of that, I
> > think that a reasonable approximation is that any function calls
> > will be the most expensive things in a loop, and the ability to
> > vectorize them will be the most important factor in determining
> > the vectorization factor.
> 
> Yes, and we can easily model this in the cost model by asking for
> the cost of a (library) function call (vectorized or not) and having
> this return a reasonably high value.

Sounds good to me.
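
For concreteness, here is a minimal sketch of that idea. It is not taken
from the patch and does not use the real LLVM interfaces:
VectorLibraryInfo, isFunctionVectorizable, getCallCost, getLoopCost,
selectVF and both cost constants are hypothetical names and values chosen
for illustration only. It assumes a call costs the same whether it is the
scalar or a vector library version, and that a call with no vector version
at a given VF is scalarized into VF scalar calls.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a TargetLibraryInfo-style query (not the real
// LLVM API): records which (function, VF) pairs have a vector version.
struct VectorLibraryInfo {
  std::set<std::pair<std::string, unsigned>> Available;

  bool isFunctionVectorizable(const std::string &Name, unsigned VF) const {
    return Available.count(std::make_pair(Name, VF)) != 0;
  }
};

// Assumed costs: a call is "reasonably high" next to a simple instruction.
const unsigned SimpleInstCost = 1;
const unsigned CallCost = 10;

// Cost of one call site for a whole (vector) iteration at the given VF.
unsigned getCallCost(const VectorLibraryInfo &VLI, const std::string &Name,
                     unsigned VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  if (VF == 1 || VLI.isFunctionVectorizable(Name, VF))
    return CallCost;     // a single scalar or vector library call
  return CallCost * VF;  // no vector version: VF scalar calls
}

// Toy loop body: a count of simple instructions plus the callee names.
struct LoopBody {
  unsigned NumSimpleInsts;
  std::vector<std::string> Callees;
};

// Total cost of one iteration of the loop body at the given VF.
unsigned getLoopCost(const VectorLibraryInfo &VLI, const LoopBody &L,
                     unsigned VF) {
  unsigned Cost = L.NumSimpleInsts * SimpleInstCost;
  for (const std::string &Name : L.Callees)
    Cost += getCallCost(VLI, Name, VF);
  return Cost;
}

// Pick the candidate VF with the lowest cost per scalar iteration,
// falling back to VF = 1 (no vectorization) if nothing is cheaper.
unsigned selectVF(const VectorLibraryInfo &VLI, const LoopBody &L,
                  const std::vector<unsigned> &Candidates) {
  unsigned BestVF = 1;
  double BestCost = getLoopCost(VLI, L, 1);
  for (unsigned VF : Candidates) {
    double PerLane = static_cast<double>(getLoopCost(VLI, L, VF)) / VF;
    if (PerLane < BestCost) {
      BestCost = PerLane;
      BestVF = VF;
    }
  }
  return BestVF;
}
```

With only a 4-wide sinf registered and a body of three simple instructions
plus one sinf call, the per-lane costs for VF = 1, 2, 4, 8 come out as 13,
11.5, 3.25 and 10.375, so selectVF picks 4. The available vector calls
steer the choice through the cost model, without restricting which widths
are legal.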

 -Hal

> 
> 
> > 
> > -Hal
> > 
> >> 
> >> Thanks,
> >> Arnold
> >> 
> >> 
> >> 
> > 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory