RFC: Enable vectorization of call instructions in the loop vectorizer

Mon Dec 16 12:53:04 PST 2013

----- Original Message -----
> From: "Nadav Rotem" <nrotem at apple.com>
> To: "James Molloy" <james.molloy at arm.com>
> Cc: "llvm-commits at cs.uiuc.edu for LLVM" <llvm-commits at cs.uiuc.edu>
> Sent: Monday, December 16, 2013 11:26:31 AM
> Subject: Re: RFC: Enable vectorization of call instructions in the loop vectorizer
> 
> 
> 
> Hi James!
> 
> 
> Thanks for working on this.
> 
> Attached is a proof of concept[1] patch for adding the ability to
> vectorize calls. The intended use case for this is in domain-specific
> languages such as OpenCL, where tuned implementations of
> functions for differing vector widths exist and can be guaranteed to
> be semantically the same as the scalar version.
> 
> 
> Excellent!
> 
> 
> I’ve considered two approaches to this. The first was to create a set
> of hooks that allow the LoopVectorizer to interrogate its client as
> to whether calls are vectorizable and if so, how. Renato argued that
> this was suboptimal as it required a client to invoke the
> LoopVectorizer manually and couldn’t be tested through opt. I agree.
> 
> So the version attached reads metadata attached to CallInsts. The
> schema for the metadata is detailed in the proposed LangRef
> addition, but basically it describes a list of potential
> vectorization candidates. Each candidate has a vector width, an
> llvm::Function* (or MDString) giving the target function and a
> string describing how the function arguments need to be handled.
> 
> 
> I think that this kind of logic should go into TargetLibraryInfo, and
> not as part of the vectorizers. The vectorizers should use some kind
> of API that will translate cos into cos4. I don’t like the
> vectorizer.call metadata because it does not solve the general
> problem. Yes, it will allow the vectorization of some OpenCL
> functions but it will not help the vectorization of other math
> functions in regular C loops.
> 
> 

I agree with this only in part. We need to think about add-on libraries here, not just ones that are part of the system environment itself. In the end, I should be able to write some vectorized cos function, for example, and then, using some pragma and/or attribute, tell the backend that it is a vectorized version of ::cos(double). In this regard, I rather like the metadata. That said, TargetLibraryInfo certainly should provide defaults when system-level implementations are known to be available.
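
To make the layering concrete, here is a minimal, self-contained sketch of what I have in mind. Everything in it is hypothetical: VectorFunctionTable, addDefault, addOverride and lookup are names invented for illustration, not existing LLVM or TargetLibraryInfo API. The only point is the lookup order: a user-supplied mapping (from metadata, a pragma or an attribute) is consulted first, and the system-level default is used only when no override exists.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical illustration only; none of these types exist in LLVM.
// Key: (scalar function name, vector width).
typedef std::pair<std::string, unsigned> VecKey;

class VectorFunctionTable {
  // What a TargetLibraryInfo-like source might provide by default.
  std::map<VecKey, std::string> Defaults;
  // What user metadata, a pragma or an attribute might provide.
  std::map<VecKey, std::string> Overrides;

public:
  void addDefault(const std::string &Scalar, unsigned VF,
                  const std::string &Vector) {
    assert(VF > 1 && "a vector width of 1 is just the scalar call");
    Defaults[VecKey(Scalar, VF)] = Vector;
  }

  void addOverride(const std::string &Scalar, unsigned VF,
                   const std::string &Vector) {
    assert(VF > 1 && "a vector width of 1 is just the scalar call");
    Overrides[VecKey(Scalar, VF)] = Vector;
  }

  // Returns the vector function name, or nullptr if the call cannot be
  // vectorized at this width. User overrides win over system defaults.
  const std::string *lookup(const std::string &Scalar, unsigned VF) const {
    VecKey K(Scalar, VF);
    std::map<VecKey, std::string>::const_iterator It = Overrides.find(K);
    if (It != Overrides.end())
      return &It->second;
    It = Defaults.find(K);
    if (It != Defaults.end())
      return &It->second;
    return nullptr;
  }
};
```

With a table like this, registering a default for ("cos", 4) and then an override for the same key makes lookup("cos", 4) return the override, while a width nobody registered yields nullptr and the call stays scalar.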

 -Hal

> 
> The mangled function arguments string allows us to handle
> vectorizations beyond just the pure “vectorize every argument”
> scenario. Consider for example the statement “a = clamp(b, 2.0f);”.
> OpenCL provides two forms of “clamp2” – one with the second argument
> a vector and one with the second argument a scalar. It is quite
> possible that the scalar form is more efficient, and should be
> selected if the second argument is uniform.
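
As a rough illustration of the selection described above (not taken from the attached patch), the choice between the two clamp2 forms could be sketched as follows. The encoding is an assumption made up for this example: one character per argument, 'v' meaning the argument is widened to a vector and 's' meaning it is passed as a scalar, which is only legal when that argument is uniform across the loop. The Candidate struct, the pickCandidate function, the candidate names and the "prefer the form with more scalar arguments" tie-break are likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical candidate description; the real metadata schema may differ.
struct Candidate {
  std::string Name;     // target function, e.g. a made-up "clamp2_vs"
  unsigned Width;       // vector width this candidate implements
  std::string ArgKinds; // assumed encoding: 'v' = vector, 's' = scalar
};

// Returns the index of the chosen candidate, or -1 if none is legal.
// IsUniform[i] says whether argument i is loop-invariant (uniform).
int pickCandidate(const std::vector<Candidate> &Cands, unsigned Width,
                  const std::vector<bool> &IsUniform) {
  assert(Width > 1 && "a vector width of 1 is just the scalar call");
  int Best = -1;
  std::size_t BestScalars = 0;
  for (std::size_t I = 0; I < Cands.size(); ++I) {
    const Candidate &C = Cands[I];
    if (C.Width != Width || C.ArgKinds.size() != IsUniform.size())
      continue;
    bool Legal = true;
    std::size_t Scalars = 0;
    for (std::size_t A = 0; A < C.ArgKinds.size(); ++A) {
      if (C.ArgKinds[A] != 's')
        continue;
      // A scalar parameter can only take a uniform argument.
      if (!IsUniform[A]) {
        Legal = false;
        break;
      }
      ++Scalars;
    }
    // Assumed heuristic: prefer the form that keeps more arguments scalar.
    if (Legal && (Best < 0 || Scalars > BestScalars)) {
      Best = static_cast<int>(I);
      BestScalars = Scalars;
    }
  }
  return Best;
}
```

For "a = clamp(b, 2.0f);" the second argument is uniform, so with candidates "vv" and "vs" at width 2 the sketch picks the "vs" form; if the second argument varied per iteration, only the "vv" form would be legal.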
> 
> 
> I understand this problem but I don’t want to add OpenCL-specific
> knowledge into the loop-vectorizer. One possible solution would be
> to work around this problem by extending your OpenCL library and by
> introducing OpenCL-specific passes that will detect these kinds of
> patterns and optimize your vectorized code using the OpenCL
> knowledge.
> 
> 
> Thanks,
> Nadav
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory