[llvm-dev] LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

Neil Henning via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 17 04:51:50 PDT 2020


Oh interesting - I hadn't even considered registering vector descriptors
for the LLVM intrinsics themselves. Sure enough, once I registered that the
pow intrinsic has a vector variant (itself of a bigger size), I got the
8-wide variants I was expecting - nice!

Thanks for the help!

Cheers,
-Neil.

On Fri, Jul 17, 2020 at 12:09 PM Florian Hahn <florian_hahn at apple.com>
wrote:

>
>
> On 16 Jul 2020, at 19:54, Neil Henning via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> So for us we use SLEEF to actually implement the libcalls (LLVM
> intrinsics) that LLVM by default would generate - and since SLEEF has a
> highly optimized 8-wide pow for AVX and AVX2, we really want to use
> that.
>
>
> Right, the way vector versions of library functions are accessed by the
> vectoriser has changed since the last release. I think the initial patch
> was https://reviews.llvm.org/D70107.
>
> Vector functions now must be annotated with a vector-function-abi-variant
> function attribute. The -inject-tli-mappings pass is supposed to add these
> attributes for the vector functions known to TLI. It seems like this is
> currently not happening for your custom TLI mappings for some reason.
>
> For example, the Accelerate library has a vector version of log10. Running
> `opt -vector-library=Accelerate -inject-tli-mappings` on the IR below will
> add the following attribute to the llvm.log10 call-site, indicating that
> there’s a <4 x float> version of log10 called vlog10f.
>
> { "vector-function-abi-variant"="_ZGV_LLVM_N4v_llvm.log10.f32(vlog10f)" }
>
>
> To double-check, if running -inject-tli-mappings on your example does not
> add the vector-function-abi-variant attribute for `pow`, the vectorisers
> won’t know about them. If the vector-function-abi-variant attribute is
> actually created, but the vector version is not used nonetheless, it would
> be great if you could share the IR with the attributes, as they depend on
> the downstream TLI.
>
> I am also CC’ing Francesco, who might be able to help you pin down
> where exactly things go wrong with the mapping.
>
> Cheers,
> Florian
>
> ----
>
> define float @call_llvm.log10.f32(float %in) {
>   %call = tail call float @llvm.log10.f32(float %in)
>   ret float %call
> }
>
> declare float @llvm.log10.f32(float)
>
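
To make the attribute string quoted above easier to read, here is a minimal Python sketch that splits a mapping such as `_ZGV_LLVM_N4v_llvm.log10.f32(vlog10f)` into its fields. It assumes the layout `_ZGV<isa><mask><vlen><parameters>_<scalar name>(<vector name>)` and handles only fixed-width entries. The names `VectorVariant` and `parse_variant` are hypothetical helpers for illustration and are not part of LLVM.

```python
import re
from typing import NamedTuple


class VectorVariant(NamedTuple):
    """Fields of one vector-function-abi-variant mapping (illustrative only)."""
    isa: str          # "_LLVM_" for LLVM-internal mappings, else a one-letter ISA code
    masked: bool      # True for "M" (masked), False for "N" (unmasked)
    vlen: int         # number of lanes, e.g. 4 or 8
    params: str       # parameter tokens, e.g. "v" per vector argument
    scalar_name: str  # scalar function or intrinsic being mapped
    vector_name: str  # vector function to call instead


# Assumed layout: _ZGV<isa><mask><vlen><params>_<scalar>(<vector>)
_PATTERN = re.compile(
    r"^_ZGV(?P<isa>_LLVM_|[A-Za-z])"
    r"(?P<mask>[NM])"
    r"(?P<vlen>\d+)"           # fixed-width only; scalable lengths are not handled
    r"(?P<params>[^_]+)"
    r"_(?P<scalar>[^()]+)"
    r"\((?P<vector>[^()]+)\)$"
)


def parse_variant(mangled: str) -> VectorVariant:
    """Split one mapping string into its fields, or raise ValueError."""
    m = _PATTERN.match(mangled)
    if m is None:
        raise ValueError(f"not a recognised mapping: {mangled!r}")
    return VectorVariant(
        isa=m.group("isa"),
        masked=m.group("mask") == "M",
        vlen=int(m.group("vlen")),
        params=m.group("params"),
        scalar_name=m.group("scalar"),
        vector_name=m.group("vector"),
    )


if __name__ == "__main__":
    print(parse_variant("_ZGV_LLVM_N4v_llvm.log10.f32(vlog10f)"))
```

Under the same assumptions, an 8-wide mapping for the two-argument pow intrinsic would carry `N8vv` in place of `N4v`, so checking the parsed `vlen` of the attribute that -inject-tli-mappings emits is a quick way to see which width the vectoriser was told about.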


-- 
Neil Henning
Senior Software Engineer Compiler
unity.com