[PATCH] D14909: [X86][FMA] Optimize FNEG(FMUL) Patterns
Sanjay Patel via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 25 08:59:16 PST 2015
spatel added a comment.
In http://reviews.llvm.org/D14909#296372, @delena wrote:
> AVX512 always contains FMA, you don't need a special check.
I see what's going on now. This patch needs the AVX512 check in order to work for 512-bit vectors. There should be at least one more test case included with this patch to check the 512-bit vector case. Alternatively, if we want to make 512-bit optimization a separate patch, the predicate should disallow those types.
So I see 2 bugs here. Test case:
define <16 x float> @test_v4f32_fneg_fmul_16(<16 x float> %x, <16 x float> %y) #0 {
%m = fmul <16 x float> %x, %y
%n = fsub <16 x float> <float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0>, %m
ret <16 x float> %n
}
Bug #1: The AVX-512 attribute in the subtarget should imply hasFMA(). Currently, this isn't true, so if you don't explicitly specify FMA, you won't get any expected FMA transforms:
$ ./llc fma.ll -o - -mattr=avx512f
...
vmulps %zmm1, %zmm0, %zmm0
vpxord LCPI0_0(%rip), %zmm0, %zmm0
retq
$ ./llc fma.ll -o - -mattr=avx512f,fma
...
vpxord %zmm2, %zmm2, %zmm2
vfnmadd213ps %zmm2, %zmm1, %zmm0
retq
Bug #2: This patch isn't checking for legal types, so it will fire for a 512-bit vector on a target that doesn't support it, and we crash:
$ ./llc fma.ll -o - -mattr=fma
Do not know how to custom type legalize this operation!
UNREACHABLE executed at /Users/spatel/myllvm/llvm/lib/Target/X86/X86ISelLowering.cpp:19880!
This would also happen if the test case had an illegal type for any target; eg <17 x float>.
I think the best fix for #2 is to explicitly check for the supported types for a given subtarget rather than waiting for type legalization and then checking the scalar type. This should be a helper function, so we can have the existing code in PerformFMACombine() use it too.
Repository:
rL LLVM
http://reviews.llvm.org/D14909
More information about the llvm-commits
mailing list