<div dir="ltr">[I added this to Phab over an hour ago, but it hasn't made it to the list afaict, so repeating with an email.]<br><br><span class=""><p>I

 see what's going on now. This patch needs the AVX512 check in order to 

work for 512-bit vectors. There should be at least one more test case 

included with this patch to check the 512-bit vector case. 

Alternatively, if we want to make 512-bit optimization a separate patch,

 the predicate should disallow those types.</p>

<p>So I see 2 bugs here. Test case:</p>

<div class=""><pre class="">define <16 x float> @test_v4f32_fneg_fmul_16(<16 x float> %x, <16 x float> %y) #0 {

  %m = fmul <16 x float> %x, %y

  %n = fsub <16 x float> <float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0, float -0.0>, %m

  ret <16 x float> %n

}</pre></div>

<p>Bug #1: The AVX-512 attribute in the subtarget should imply hasFMA().

 Currently, this isn't true, so if you don't explicitly specify FMA, you

 won't get any expected FMA transforms:</p>

<div class=""><pre class=""><span class="">$ ./llc fma.ll -o - -mattr=avx512f</span>

<span class="">...</span>

<span class="">   vmulps        %zmm1, %zmm0, %zmm0</span>

<span class="">   vpxord        LCPI0_0(%rip), %zmm0, %zmm0</span>

<span class="">   retq</span>

<span class=""></span>

<span class="">$ ./llc fma.ll -o - -mattr=avx512f,fma</span>

<span class="">...</span>

<span class="">  vpxord %zmm2, %zmm2, %zmm2</span>

<span class="">  vfnmadd213ps   %zmm2, %zmm1, %zmm0</span>

<span class="">  retq</span></pre></div>

<p>Bug #2: This patch isn't checking for legal types, so it will fire 

for a 512-bit vector on a target that doesn't support it, and we crash:</p>

<div class=""><pre class=""><span class="">$ ./llc fma.ll -o - -mattr=fma</span>

<span class="">  Do not know how to custom type legalize this operation!</span>

<span class="">  UNREACHABLE executed at /Users/spatel/myllvm/llvm/lib/Target/X86/X86ISelLowering.cpp:19880!</span></pre></div>

<p>This would also happen if the test case had an illegal type for any target; eg <17 x float>.</p>

<p>I think the best fix for #2 is to explicitly check for the supported 

types for a given subtarget rather than waiting for type legalization 

and then checking the scalar type. This should be a helper function, so 

we can have the existing code in PerformFMACombine() use it too.</p></span><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Nov 25, 2015 at 7:29 AM, Simon Pilgrim <span dir="ltr"><<a href="mailto:llvm-dev@redking.me.uk" target="_blank">llvm-dev@redking.me.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">RKSimon marked 3 inline comments as done.<br>

RKSimon added a comment.<br>

<div class="HOEnZb"><div class="h5"><br>

Repository:<br>

  rL LLVM<br>

<br>

<a href="http://reviews.llvm.org/D14909" rel="noreferrer" target="_blank">http://reviews.llvm.org/D14909</a><br>

<br>

<br>

<br>

</div></div></blockquote></div><br></div>