RFC: Enable vectorization of call instructions in the loop vectorizer
listmail at philipreames.com
Thu Dec 19 09:07:00 PST 2013
On 12/17/13 2:06 PM, Renato Golin wrote:
> On 17 December 2013 19:10, Philip Reames <listmail at philipreames.com> wrote:
>> Specifically with regards to metadata, why is the metadata on the
>> call site not the function declaration? I would expect that a
>> call to "cos" would always vectorize to the same "cos4". Is
>> supporting different vectorizations at different points of the
>> program a key goal?
> Yes. Sometimes...
> See the test in the patch for an idea of two different functions,
> maybe both available, maybe not.
> Also, some variants could be more efficient in some cases, while some
> in others. This may sound vague, but OpenCL has such a large number of
> functions that I'd be surprised if there were only simple cases...
> James can give more concrete examples on where this is important.
> In a strictly libc case, you may create several variants of memcpy
> based on the arguments (restrict or volatile, address space
> boundaries, etc), and apply the fastest you can on each case.
I acknowledge your point, but want to point out that the specific
example you've given is function specific, not call site specific.
You've described a property of a given function in terms of how it's
called, but the description is of behavior related to the function.
(i.e., all calls to memcpy have the same rules applied.)
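To make the distinction concrete, here is a toy model of the two places the
scalar-to-vector mapping could live. It is only a sketch in Python, not LLVM
code: the dictionary layout, the helper names, and the "cos4_fast" variant are
all made up for illustration; only "cos" and "cos4" come from the thread.

```python
# Toy model only: none of these names or structures exist in LLVM.

# Per-function annotation: one table, filled in by whatever code
# generates the function declarations.
FUNCTION_VARIANTS = {"cos": "cos4"}


def vector_callee_per_function(call):
    """Every call to the same scalar function maps to the same variant."""
    return FUNCTION_VARIANTS.get(call["callee"])


def vector_callee_per_call_site(call):
    """Each call carries its own annotation, so every code path that
    emits a call has to remember to attach it."""
    return call.get("vector_variant")


calls = [
    {"callee": "cos", "vector_variant": "cos4"},
    {"callee": "cos", "vector_variant": "cos4_fast"},  # hypothetical variant
    {"callee": "cos"},  # emitted by a path that did not attach an annotation
]

per_function = [vector_callee_per_function(c) for c in calls]
per_call_site = [vector_callee_per_call_site(c) for c in calls]
```

In this model the per-call-site form can pick a different variant for each
call, but a call emitted without the annotation has nothing to vectorize to,
whereas the per-function table answers uniformly for every call.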
I'm not saying there aren't cases where having the per-call-site detail
is useful; I'm just concerned it's overkill for the common case. It
also requires a much more complicated frontend. All code which might
want to be vectorized needs to know about the vectorization semantics.
With a per-function annotation (attribute, metadata, whatever), only
the part that generates the function declaration would need to preserve it.
Just to be clear, I'm not /opposing/ the current solution. If this is
what the interested parties want to run with, that's fine. I'm simply
pointing out some downsides to the proposed design.