[PATCH] D13710: New X86 FMA3*_Int opcodes for scalar FMA intrinsics.
Elena Demikhovsky via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 3 07:02:04 PST 2015
delena added inline comments.
================
Comment at: llvm/lib/Target/X86/X86InstrInfo.cpp:1815
@@ -1796,2 +1814,3 @@
{ X86::VFNMSUBSSr231r, X86::VFNMSUBSSr231m, TB_ALIGN_NONE },
+ { X86::VFNMSUBSSr231r_Int, X86::VFNMSUBSSr231m_Int, TB_ALIGN_NONE },
{ X86::VFNMSUBSDr231r, X86::VFNMSUBSDr231m, TB_ALIGN_NONE },
----------------
I don't understand how you can use the 231 form for a scalar intrinsic:
intr_fmadd_ss(a, b, c) may be translated as
VFMADD213SS a, b, c
or
VFMADD132SS a, c, b
but you can't generate VFMADD231SS, because "a" must go first (it is the destination operand): the upper part of the result is taken from it.
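
To make the operand-order argument concrete, here is a minimal C++ model of the three scalar forms. It is only a sketch: the Vec4 type and the helper functions are hypothetical names invented for illustration, not LLVM code, and they model only the documented behaviour that element 0 is computed while elements 1..3 of the first (destination) operand pass through unchanged.

```cpp
#include <array>
#include <cassert>

// Hypothetical 4 x float register model, for illustration only.
using Vec4 = std::array<float, 4>;

// The digits in each form name say which operands are multiplied and
// which is added: dst is operand 1, src2 is operand 2, src3 is operand 3.
// Only element 0 is computed; elements 1..3 of dst are left unchanged.
Vec4 vfmadd132ss(Vec4 dst, const Vec4& src2, const Vec4& src3) {
    dst[0] = dst[0] * src3[0] + src2[0];  // op1 * op3 + op2
    return dst;
}

Vec4 vfmadd213ss(Vec4 dst, const Vec4& src2, const Vec4& src3) {
    dst[0] = src2[0] * dst[0] + src3[0];  // op2 * op1 + op3
    return dst;
}

Vec4 vfmadd231ss(Vec4 dst, const Vec4& src2, const Vec4& src3) {
    dst[0] = src2[0] * src3[0] + dst[0];  // op2 * op3 + op1
    return dst;
}

// Reference semantics of the scalar intrinsic discussed above:
// element 0 is a*b+c, the upper elements are copied from a.
Vec4 intr_fmadd_ss(const Vec4& a, const Vec4& b, const Vec4& c) {
    Vec4 r = a;
    r[0] = a[0] * b[0] + c[0];
    return r;
}
```

With "a" as the destination, the 213 and 132 forms reproduce the intrinsic. The 231 form with "a" as destination uses a[0] as the addend instead of c[0], and the 231 form with "c" as destination computes the right low element but takes the upper elements from "c".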
================
Comment at: llvm/test/CodeGen/X86/fma-intrinsics-phi-213-to-231.ll:171
@@ +170,3 @@
+; CHECK-NEXT: retq
+define <4 x float> @fmaddsubps_loop_128(i32 %iter, <4 x float> %a, <4 x float> %b, <4 x float> %c) {
+entry:
----------------
The test checks that the FMA intrinsic produces the right form of the FMA instruction.
I don't understand why you need a loop here. We wrote a lot of FMA intrinsic tests without any loops.
================
Comment at: llvm/test/CodeGen/X86/fma-intrinsics-x86.ll:485
@@ +484,3 @@
+; CHECK-FMA-WIN-NEXT: vmovaps (%{{(rcx|rdx)}}), %xmm{{0|1}}
+; CHECK-FMA-WIN-NEXT: vfnmsub213sd (%r8), %xmm1, %xmm0
+;
----------------
You are checking the folding of a vector load into a scalar intrinsic.
On AVX-512 we support folding a scalar load into a scalar intrinsic by matching the scalar_to_vector(loadf32) pattern in the .td file.
http://reviews.llvm.org/D13710