[PATCH] D14909: [X86][FMA] Optimize FNEG(FMUL) Patterns
    Sanjay Patel via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Wed Nov 25 08:59:16 PST 2015
    
    
  
spatel added a comment.
In http://reviews.llvm.org/D14909#296372, @delena wrote:
> AVX512 always contains FMA, you don't need a special check.
I see what's going on now. This patch needs the AVX512 check in order to work for 512-bit vectors. There should be at least one more test case included with this patch to check the 512-bit vector case. Alternatively, if we want to make 512-bit optimization a separate patch, the predicate should disallow those types.
So I see 2 bugs here. Test case:
  define <16 x float> @test_v4f32_fneg_fmul_16(<16 x float> %x, <16 x float> %y) #0 {
    %m = fmul <16 x float> %x, %y
    %n = fsub <16 x float> <float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0>, %m
    ret <16 x float> %n
  }
Bug #1: The AVX-512 attribute in the subtarget should imply hasFMA(). Currently, this isn't true, so if you don't explicitly specify FMA, you won't get any expected FMA transforms:
  $ ./llc fma.ll -o - -mattr=avx512f
  ...
     vmulps	%zmm1, %zmm0, %zmm0
     vpxord	LCPI0_0(%rip), %zmm0, %zmm0
     retq
  
  $ ./llc fma.ll -o - -mattr=avx512f,fma
  ...
    vpxord	%zmm2, %zmm2, %zmm2
    vfnmadd213ps	%zmm2, %zmm1, %zmm0
    retq
Bug #2: This patch isn't checking for legal types, so it will fire for a 512-bit vector on a target that doesn't support it, and we crash:
  $ ./llc fma.ll -o - -mattr=fma
    Do not know how to custom type legalize this operation!
    UNREACHABLE executed at /Users/spatel/myllvm/llvm/lib/Target/X86/X86ISelLowering.cpp:19880!
This would also happen if the test case had an illegal type for any target; eg <17 x float>.
I think the best fix for #2 is to explicitly check for the supported types for a given subtarget rather than waiting for type legalization and then checking the scalar type. This should be a helper function, so we can have the existing code in PerformFMACombine() use it too.
Repository:
  rL LLVM
http://reviews.llvm.org/D14909
    
    
More information about the llvm-commits
mailing list