RFC: Enable vectorization of call instructions in the loop vectorizer

Mon Dec 16 13:34:02 PST 2013

----- Original Message -----
> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "James Molloy" <james at jamesmolloy.co.uk>, "llvm-commits" <llvm-commits at cs.uiuc.edu>
> Sent: Monday, December 16, 2013 3:32:03 PM
> Subject: Re: RFC: Enable vectorization of call instructions in the loop vectorizer
> 
> 
> On Dec 16, 2013, at 3:14 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > ----- Original Message -----
> >> From: "James Molloy" <james at jamesmolloy.co.uk>
> >> To: "Hal Finkel" <hfinkel at anl.gov>
> >> Cc: "Arnold Schwaighofer" <aschwaighofer at apple.com>,
> >> "llvm-commits" <llvm-commits at cs.uiuc.edu>
> >> Sent: Monday, December 16, 2013 3:05:26 PM
> >> Subject: Re: RFC: Enable vectorization of call instructions in the
> >> loop vectorizer
> >> 
> >> 
> >> Hi Hal,
> >> 
> >> 
> >> I can see the advantage of both approaches. Metadata, as you say,
> >> allows pragmas and frontends to pass information to the
> >> LoopVectorizer without target specific overrides (creating a new
> >> Target{Library,Transform}Info). Hooks in Target{L,T}Info allow
> >> greater control and greater complexity in choosing vectorized
> >> versions.
> >> 
> >> 
> >> Fundamentally I think the two approaches aren't compatible, so one
> >> needs to be chosen as the preferred route.
> > 
> > I think they are compatible, and we should have both. All we need
> > is some function that forms the list of alternatives that first
> > checks metadata, and then adds any alternatives provided by
> > TLibInfo. Why would this not work?
> 
> 
> I don’t think we need two interfaces (TargetLibraryInfo and metadata
> on the call site) for this. I see no reason why TargetLibraryInfo
> could not be initialized with metadata from the Module. If we can do
> this then we don’t need metadata on the function call and have a
> unified interface.

That also sounds good to me.

 -Hal
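
A minimal C++ sketch of the unified interface discussed above: one lookup
table that is seeded by the target and then filled from module-level
metadata, so the vectorizer only ever asks one object. Everything here is
hypothetical and for illustration only: VectorVariant, VectorLibInfo, its
member functions, and the function names in the usage note are not LLVM
APIs, and letting module entries override target defaults is an assumed
policy, not something the thread settled.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record describing one available vector variant of a
// scalar function. This stands in for whatever module-level metadata
// would encode; it is not an LLVM data structure.
struct VectorVariant {
  std::string ScalarName; // name of the scalar callee
  unsigned VF;            // vectorization factor the variant handles
  std::string VectorName; // name of the vector callee
};

// Hypothetical stand-in for an extended TargetLibraryInfo.
class VectorLibInfo {
  // Key: (scalar function name, vectorization factor).
  std::map<std::pair<std::string, unsigned>, std::string> Table;

public:
  // Register a variant the target knows about. An existing entry for
  // the same (name, VF) key is left untouched.
  void addTargetDefault(const VectorVariant &V) {
    Table.emplace(std::make_pair(V.ScalarName, V.VF), V.VectorName);
  }

  // Register variants read from the module. Assumed policy: these
  // replace any target default with the same (name, VF) key.
  void addFromModuleMetadata(const std::vector<VectorVariant> &MD) {
    for (const VectorVariant &V : MD)
      Table[std::make_pair(V.ScalarName, V.VF)] = V.VectorName;
  }

  // The single query a vectorizer would make. Returns the vector
  // callee name, or an empty string if no variant exists for this VF.
  std::string getVectorizedFunction(const std::string &Scalar,
                                    unsigned VF) const {
    auto It = Table.find(std::make_pair(Scalar, VF));
    return It == Table.end() ? std::string() : It->second;
  }
};
```

With a target default of {"sinf", 4, "target_sinf4"} and a module entry of
{"sinf", 4, "ocl_sinf4"} (both names invented), a query for ("sinf", 4)
would return the module's variant, and a query for a VF with no entry
would return an empty string, leaving the call scalar.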

> 
> Best,
> Arnold
> 
> 
> > 
> > -Hal
> > 
> >> I actually don't mind
> >> which - to me it looks like Hal and Renato prefer metadata, and
> >> Nadav and Arnold prefer hooks, but I may have misinterpreted
> >> responses.
> >> 
> >> 
> >> Cheers,
> >> 
> >> 
> >> James
> >> 
> >> 
> >> 
> >> On 16 December 2013 20:59, Hal Finkel < hfinkel at anl.gov > wrote:
> >> 
> >> 
> >> 
> >> ----- Original Message -----
> >>> From: "Arnold Schwaighofer" < aschwaighofer at apple.com >
> >>> To: "James Molloy" < James.Molloy at arm.com >
> >>> Cc: "llvm-commits" < llvm-commits at cs.uiuc.edu >
> >>> Sent: Monday, December 16, 2013 12:03:02 PM
> >>> Subject: Re: RFC: Enable vectorization of call instructions in the
> >>> loop vectorizer
> >>> 
> >>> 
> >> 
> >> 
> >>> On Dec 16, 2013, at 11:08 AM, James Molloy < James.Molloy at arm.com >
> >>> wrote:
> >>> 
> >>>> Hi Renato, Nadav,
> >>>> 
> >>>> Attached is a proof of concept[1] patch for adding the ability to
> >>>> vectorize calls. The intended use case for this is in
> >>>> domain-specific languages such as OpenCL where tuned
> >>>> implementations of functions for differing vector widths exist and
> >>>> can be guaranteed to be semantically the same as the scalar
> >>>> version.
> >>>> 
> >>>> I’ve considered two approaches to this. The first was to create a
> >>>> set of hooks that allow the LoopVectorizer to interrogate its
> >>>> client as to whether calls are vectorizable and if so, how. Renato
> >>>> argued that this was suboptimal as it required a client to invoke
> >>>> the LoopVectorizer manually and couldn’t be tested through opt. I
> >>>> agree.
> >>> 
> >>> I don’t understand this argument.
> >>> 
> >>> We could extend target library info with additional API calls to
> >>> query whether a function is vectorizable at a vector factor.
> >>> This can be tested by providing the target triple string (e.g.
> >>> “target triple = x86_64-gnu-linux-with_opencl_vector_lib”) in the
> >>> .ll file that informs the optimizer that a set of vector library
> >>> calls is available.
> >>> 
> >>> The patch seems to restrict legal vector widths depending on which
> >>> vectorizable function calls are available. I don’t think it should
> >>> work like this.
> >>> I believe there should be an API on TargetTransformInfo for library
> >>> function calls. The vectorizer chooses the cheapest of either an
> >>> intrinsic call or a library function call.
> >>> The overall cost model determines which VF will be chosen.
> >> 
> >> We don't have a good model currently for non-intrinsic function
> >> calls. Once we do, we'll want to know how expensive the vectorized
> >> versions are compared to the scalar version. Short of that, I think
> >> that a reasonable approximation is that any function calls will be
> >> the most expensive things in a loop, and the ability to vectorize
> >> them will be the most important factor in determining the
> >> vectorization factor.
> >> 
> >> -Hal
> >> 
> >> 
> >>> 
> >>> Thanks,
> >>> Arnold
> >>> 
> >>> 
> >>> _______________________________________________
> >>> llvm-commits mailing list
> >>> llvm-commits at cs.uiuc.edu
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>> 
> >> 
> >> 
> >> --
> >> Hal Finkel
> >> Assistant Computational Scientist
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >> 
> >> 
> >> 
> > 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory