RFC: Enable vectorization of call instructions in the loop vectorizer

Mon Dec 16 15:48:09 PST 2013

----- Original Message -----
> From: "Renato Golin" <renato.golin at linaro.org>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>, "Arnold Schwaighofer" <aschwaighofer at apple.com>
> Sent: Monday, December 16, 2013 4:21:10 PM
> Subject: Re: RFC: Enable vectorization of call instructions in the loop vectorizer
>
> On 16 December 2013 22:02, Hal Finkel <hfinkel at anl.gov> wrote:
>
> > I think we're okay here for the time being ;) -- Arnold suggested
> > that we use some large value to represent the cost of the scalar
> > function. The metadata (or whatever) just need to specify some
> > multiplicative factor by which we scale that cost. In general, I
> > think that should be fairly easy to determine.
>
> Yes, the question was more rhetorical than anything else. I think the
> standard cost of "call / width" is a good one, and domain-specific
> decisions can be taken in the call-backs.

To be clear, I'd strongly prefer that we use "call / width" as the default cost, but also provide a scaling-factor argument in the metadata. On my BG/Q platform, for example, I only have vector floating-point operations (no vector integer ops). As a result, the speedups of some of the 'vectorized' math calls differ significantly from the call / width estimate, depending on their integer/fp mix.
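
As a rough illustration of the default-plus-scaling-factor idea, here is a minimal standalone sketch. It is not vectorizer code: the function names, the parameter names, and the idea that the scale would be read from metadata are all assumptions made for this example, and the numbers in the usage note are arbitrary.

```cpp
#include <cassert>

// Hypothetical helper, not part of LLVM: estimated cost attributed to each
// scalar iteration when a call is replaced by a vector variant of width VF.
// ScalarCallCost is the (large) cost assumed for the scalar call.
// Scale is the multiplicative factor that would come from metadata
// (assumed); 1.0 reproduces the default "call / width" estimate.
double perLaneCallCost(double ScalarCallCost, unsigned VF, double Scale = 1.0) {
  assert(VF > 0 && "vectorization factor must be positive");
  return ScalarCallCost * Scale / VF;
}

// Hypothetical profitability check: the vector call wins only if its
// per-lane cost is strictly below the cost of the scalar call.
bool vectorCallIsProfitable(double ScalarCallCost, unsigned VF,
                            double Scale = 1.0) {
  return perLaneCallCost(ScalarCallCost, VF, Scale) < ScalarCallCost;
}
```

With a scalar call cost of 100 and a width of 4, the default gives 25 per lane. A target whose vector variant is only half as effective (scale 2.0) would get 50, and a scale of 4.0 would make the vector call no better than the scalar one.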

 -Hal

> 
> 
> cheers,
> --renato

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory