[llvm-commits] [llvm] r162999 - in /llvm/trunk: lib/Target/X86/Disassembler/X86Disassembler.cpp lib/Target/X86/X86ISelLowering.cpp lib/Target/X86/X86InstrFMA.td test/CodeGen/X86/fma_patterns.ll utils/TableGen/X86RecognizableInstr.cpp

Michael Liao michael.liao at intel.com
Fri Aug 31 20:47:12 PDT 2012


On Sat, 2012-09-01 at 10:13 +0900, NAKAMURA Takumi wrote:
> I thought adding further checks might also cover FMA-enabled hosts. Thoughts?
> 

The last test case breaks when FMA is enabled: FMA folding first turns
the root fsub into an fmsub, so we never get the chance to rewrite
(fmul 2.0 x) as (fadd x x), or (fadd x x) as (fmul 2.0 x).

In fact, it is quite questionable whether we should do FMA folding so
eagerly in unsafe-math mode, where we have plenty of opportunities to
simplify expressions using associative/distributive properties.
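
To make the ordering problem concrete, here is a rough, self-contained
C++ sketch. It is not LLVM code: the Expr tree, the helper names
(doubleToAdd, foldSelfSub, contract) and the input expression
(fsub (fmul 2.0 x) (fadd x x)) are all made up for illustration and are
not the actual test in fp-fast.ll. It only shows that contracting the
root fsub first leaves nothing for the unsafe-math simplifications to
match.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical miniature expression tree; NOT LLVM's SelectionDAG.
struct Expr {
  std::string op;                 // leaf name ("x", "2.0") or opcode
  std::shared_ptr<Expr> a, b, c;  // operands (all null for leaves)
};
using P = std::shared_ptr<Expr>;

P leaf(std::string n) {
  return std::make_shared<Expr>(Expr{std::move(n), nullptr, nullptr, nullptr});
}
P node(std::string op, P a, P b, P c = nullptr) {
  return std::make_shared<Expr>(Expr{std::move(op), a, b, c});
}

// Print in the (op a b [c]) notation used in this thread.
std::string str(const P &e) {
  if (!e->a) return e->op;
  std::string s = "(" + e->op + " " + str(e->a) + " " + str(e->b);
  if (e->c) s += " " + str(e->c);
  return s + ")";
}

// Unsafe-math style rewrite: (fmul 2.0 x) -> (fadd x x).
P doubleToAdd(const P &e) {
  if (e->op == "fmul" && e->a->op == "2.0") return node("fadd", e->b, e->b);
  return e;
}

// Unsafe-math style rewrite: (fsub a a) -> 0.0 (ignores NaN/Inf).
P foldSelfSub(const P &e) {
  if (e->op == "fsub" && str(e->a) == str(e->b)) return leaf("0.0");
  return e;
}

// FMA contraction: (fsub (fmul a b) c) -> (fmsub a b c).
P contract(const P &e) {
  if (e->op == "fsub" && e->a->op == "fmul")
    return node("fmsub", e->a->a, e->a->b, e->b);
  return e;
}

// Contract the root first; the fsub is gone, so foldSelfSub cannot fire.
P eagerContraction(const P &root) { return foldSelfSub(contract(root)); }

// Simplify the operands and the root first, then try to contract.
P simplifyThenContract(const P &root) {
  P r = root;
  if (r->op == "fsub")
    r = node("fsub", doubleToAdd(r->a), doubleToAdd(r->b));
  return contract(foldSelfSub(r));
}

// Small usage example on the made-up input expression.
void demo() {
  P x = leaf("x");
  P root = node("fsub", node("fmul", leaf("2.0"), x), node("fadd", x, x));
  assert(str(eagerContraction(root)) == "(fmsub 2.0 x (fadd x x))");
  assert(str(simplifyThenContract(root)) == "0.0");
}
```

In this toy model, contracting first yields (fmsub 2.0 x (fadd x x)),
whereas simplifying first collapses the whole expression to 0.0.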

- michael

> ...Takumi
> 
> 2012/9/1 Michael Liao <michael.liao at intel.com>:
> > On Sat, 2012-09-01 at 09:32 +0900, NAKAMURA Takumi wrote:
> >> Craig,
> >>
> >> It affects CodeGen/X86/fp-fast.ll on AMD hosts. Tweaked in r163041.
> >> Please reconfirm.
> >
> > The following change may be better, to prevent it from failing again
> > on Intel hosts with FMA support.
> >
> > - michael
> >
> > diff --git a/test/CodeGen/X86/fp-fast.ll b/test/CodeGen/X86/fp-fast.ll
> > index 091f0de..1deb43d 100644
> > --- a/test/CodeGen/X86/fp-fast.ll
> > +++ b/test/CodeGen/X86/fp-fast.ll
> > @@ -1,4 +1,4 @@
> > -; RUN: llc -march=x86-64 -mattr=-fma4 -mtriple=x86_64-apple-darwin -enable-unsafe-fp-math < %s | FileCheck %s
> > +; RUN: llc -march=x86-64 -mcpu=corei7 -mtriple=x86_64-apple-darwin -enable-unsafe-fp-math < %s | FileCheck %s
> >
> >  ; CHECK: test1
> >  define float @test1(float %a) {
> >
> >>
> >> ...Takumi
> >>
> >> 2012/9/1 Craig Topper <craig.topper at gmail.com>:
> >> > Author: ctopper
> >> > Date: Fri Aug 31 10:40:30 2012
> >> > New Revision: 162999
> >> >
> >> > URL: http://llvm.org/viewvc/llvm-project?rev=162999&view=rev
> >> > Log:
> >> > Add support for converting llvm.fma to fma4 instructions.
> >> >
> >> > Modified:
> >> >     llvm/trunk/lib/Target/X86/Disassembler/X86Disassembler.cpp
> >> >     llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> >> >     llvm/trunk/lib/Target/X86/X86InstrFMA.td
> >> >     llvm/trunk/test/CodeGen/X86/fma_patterns.ll
> >> >     llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp
> >> >
> >> > Modified: llvm/trunk/lib/Target/X86/Disassembler/X86Disassembler.cpp
> >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/Disassembler/X86Disassembler.cpp?rev=162999&r1=162998&r2=162999&view=diff
> >> > ==============================================================================
> >> > --- llvm/trunk/lib/Target/X86/Disassembler/X86Disassembler.cpp (original)
> >> > +++ llvm/trunk/lib/Target/X86/Disassembler/X86Disassembler.cpp Fri Aug 31 10:40:30 2012
> >> > @@ -379,6 +379,8 @@
> >> >    }
> >> >
> >> >    switch (type) {
> >> > +  case TYPE_XMM32:
> >> > +  case TYPE_XMM64:
> >> >    case TYPE_XMM128:
> >> >      mcInst.addOperand(MCOperand::CreateReg(X86::XMM0 + (immediate >> 4)));
> >> >      return;
> >> >
> >> > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=162999&r1=162998&r2=162999&view=diff
> >> > ==============================================================================
> >> > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
> >> > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Fri Aug 31 10:40:30 2012
> >> > @@ -1052,7 +1052,7 @@
> >> >      setOperationAction(ISD::VSELECT,           MVT::v8i32, Legal);
> >> >      setOperationAction(ISD::VSELECT,           MVT::v8f32, Legal);
> >> >
> >> > -    if (Subtarget->hasFMA()) {
> >> > +    if (Subtarget->hasFMA() || Subtarget->hasFMA4()) {
> >> >        setOperationAction(ISD::FMA,             MVT::v8f32, Custom);
> >> >        setOperationAction(ISD::FMA,             MVT::v4f64, Custom);
> >> >        setOperationAction(ISD::FMA,             MVT::v4f32, Custom);
> >> > @@ -15606,7 +15606,8 @@
> >> >      return SDValue();
> >> >
> >> >    EVT ScalarVT = VT.getScalarType();
> >> > -  if ((ScalarVT != MVT::f32 && ScalarVT != MVT::f64) || !Subtarget->hasFMA())
> >> > +  if ((ScalarVT != MVT::f32 && ScalarVT != MVT::f64) ||
> >> > +      (!Subtarget->hasFMA() && !Subtarget->hasFMA4()))
> >> >      return SDValue();
> >> >
> >> >    SDValue A = N->getOperand(0);
> >> > @@ -15628,9 +15629,10 @@
> >> >
> >> >    unsigned Opcode;
> >> >    if (!NegMul)
> >> > -    Opcode = (!NegC)? X86ISD::FMADD : X86ISD::FMSUB;
> >> > +    Opcode = (!NegC) ? X86ISD::FMADD : X86ISD::FMSUB;
> >> >    else
> >> > -    Opcode = (!NegC)? X86ISD::FNMADD : X86ISD::FNMSUB;
> >> > +    Opcode = (!NegC) ? X86ISD::FNMADD : X86ISD::FNMSUB;
> >> > +
> >> >    return DAG.getNode(Opcode, dl, VT, A, B, C);
> >> >  }
> >> >
> >> >
> >> > Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td
> >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=162999&r1=162998&r2=162999&view=diff
> >> > ==============================================================================
> >> > --- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original)
> >> > +++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Fri Aug 31 10:40:30 2012
> >> > @@ -193,34 +193,57 @@
> >> >  //===----------------------------------------------------------------------===//
> >> >
> >> >
> >> > -multiclass fma4s<bits<8> opc, string OpcodeStr, Operand memop,
> >> > -                 ComplexPattern mem_cpat, Intrinsic Int> {
> >> > -  def rr : FMA4<opc, MRMSrcReg, (outs VR128:$dst),
> >> > -           (ins VR128:$src1, VR128:$src2, VR128:$src3),
> >> > +multiclass fma4s<bits<8> opc, string OpcodeStr, RegisterClass RC,
> >> > +                 X86MemOperand x86memop, ValueType OpVT, SDNode OpNode,
> >> > +                 PatFrag mem_frag> {
> >> > +  def rr : FMA4<opc, MRMSrcReg, (outs RC:$dst),
> >> > +           (ins RC:$src1, RC:$src2, RC:$src3),
> >> >             !strconcat(OpcodeStr,
> >> >             "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"),
> >> > -           [(set VR128:$dst,
> >> > -             (Int VR128:$src1, VR128:$src2, VR128:$src3))]>, VEX_W, MemOp4;
> >> > -  def rm : FMA4<opc, MRMSrcMem, (outs VR128:$dst),
> >> > -           (ins VR128:$src1, VR128:$src2, memop:$src3),
> >> > +           [(set RC:$dst,
> >> > +             (OpVT (OpNode RC:$src1, RC:$src2, RC:$src3)))]>, VEX_W, MemOp4;
> >> > +  def rm : FMA4<opc, MRMSrcMem, (outs RC:$dst),
> >> > +           (ins RC:$src1, RC:$src2, x86memop:$src3),
> >> >             !strconcat(OpcodeStr,
> >> >             "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"),
> >> > -           [(set VR128:$dst,
> >> > -             (Int VR128:$src1, VR128:$src2, mem_cpat:$src3))]>, VEX_W, MemOp4;
> >> > -  def mr : FMA4<opc, MRMSrcMem, (outs VR128:$dst),
> >> > -           (ins VR128:$src1, memop:$src2, VR128:$src3),
> >> > +           [(set RC:$dst, (OpNode RC:$src1, RC:$src2,
> >> > +                           (mem_frag addr:$src3)))]>, VEX_W, MemOp4;
> >> > +  def mr : FMA4<opc, MRMSrcMem, (outs RC:$dst),
> >> > +           (ins RC:$src1, x86memop:$src2, RC:$src3),
> >> >             !strconcat(OpcodeStr,
> >> >             "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"),
> >> > -           [(set VR128:$dst,
> >> > -             (Int VR128:$src1, mem_cpat:$src2, VR128:$src3))]>;
> >> > +           [(set RC:$dst,
> >> > +             (OpNode RC:$src1, (mem_frag addr:$src2), RC:$src3))]>;
> >> >  // For disassembler
> >> >  let isCodeGenOnly = 1 in
> >> > -  def rr_REV : FMA4<opc, MRMSrcReg, (outs VR128:$dst),
> >> > -               (ins VR128:$src1, VR128:$src2, VR128:$src3),
> >> > +  def rr_REV : FMA4<opc, MRMSrcReg, (outs RC:$dst),
> >> > +               (ins RC:$src1, RC:$src2, RC:$src3),
> >> >                 !strconcat(OpcodeStr,
> >> >                 "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), []>;
> >> >  }
> >> >
> >> > +multiclass fma4s_int<bits<8> opc, string OpcodeStr, Operand memop,
> >> > +                     ComplexPattern mem_cpat, Intrinsic Int> {
> >> > +  def rr_Int : FMA4<opc, MRMSrcReg, (outs VR128:$dst),
> >> > +               (ins VR128:$src1, VR128:$src2, VR128:$src3),
> >> > +               !strconcat(OpcodeStr,
> >> > +               "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"),
> >> > +               [(set VR128:$dst,
> >> > +                 (Int VR128:$src1, VR128:$src2, VR128:$src3))]>, VEX_W, MemOp4;
> >> > +  def rm_Int : FMA4<opc, MRMSrcMem, (outs VR128:$dst),
> >> > +               (ins VR128:$src1, VR128:$src2, memop:$src3),
> >> > +               !strconcat(OpcodeStr,
> >> > +               "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"),
> >> > +               [(set VR128:$dst, (Int VR128:$src1, VR128:$src2,
> >> > +                                  mem_cpat:$src3))]>, VEX_W, MemOp4;
> >> > +  def mr_Int : FMA4<opc, MRMSrcMem, (outs VR128:$dst),
> >> > +               (ins VR128:$src1, memop:$src2, VR128:$src3),
> >> > +               !strconcat(OpcodeStr,
> >> > +               "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"),
> >> > +               [(set VR128:$dst,
> >> > +                 (Int VR128:$src1, mem_cpat:$src2, VR128:$src3))]>;
> >> > +}
> >> > +
> >> >  multiclass fma4p<bits<8> opc, string OpcodeStr, SDNode OpNode,
> >> >                   ValueType OpVT128, ValueType OpVT256,
> >> >                   PatFrag ld_frag128, PatFrag ld_frag256> {
> >> > @@ -277,34 +300,47 @@
> >> >
> >> >  let Predicates = [HasFMA4] in {
> >> >
> >> > -defm VFMADDSS4    : fma4s<0x6A, "vfmaddss", ssmem, sse_load_f32,
> >> > -                          int_x86_fma_vfmadd_ss>;
> >> > -defm VFMADDSD4    : fma4s<0x6B, "vfmaddsd", sdmem, sse_load_f64,
> >> > -                          int_x86_fma_vfmadd_sd>;
> >> > +defm VFMADDSS4  : fma4s<0x6A, "vfmaddss", FR32, f32mem, f32, X86Fmadd, loadf32>,
> >> > +                  fma4s_int<0x6A, "vfmaddss", ssmem, sse_load_f32,
> >> > +                            int_x86_fma_vfmadd_ss>;
> >> > +defm VFMADDSD4  : fma4s<0x6B, "vfmaddsd", FR64, f64mem, f64, X86Fmadd, loadf64>,
> >> > +                  fma4s_int<0x6B, "vfmaddsd", sdmem, sse_load_f64,
> >> > +                            int_x86_fma_vfmadd_sd>;
> >> > +defm VFMSUBSS4  : fma4s<0x6E, "vfmsubss", FR32, f32mem, f32, X86Fmsub, loadf32>,
> >> > +                  fma4s_int<0x6E, "vfmsubss", ssmem, sse_load_f32,
> >> > +                            int_x86_fma_vfmsub_ss>;
> >> > +defm VFMSUBSD4  : fma4s<0x6F, "vfmsubsd", FR64, f64mem, f64, X86Fmsub, loadf64>,
> >> > +                  fma4s_int<0x6F, "vfmsubsd", sdmem, sse_load_f64,
> >> > +                            int_x86_fma_vfmsub_sd>;
> >> > +defm VFNMADDSS4 : fma4s<0x7A, "vfnmaddss", FR32, f32mem, f32,
> >> > +                        X86Fnmadd, loadf32>,
> >> > +                  fma4s_int<0x7A, "vfnmaddss", ssmem, sse_load_f32,
> >> > +                            int_x86_fma_vfnmadd_ss>;
> >> > +defm VFNMADDSD4 : fma4s<0x7B, "vfnmaddsd", FR64, f64mem, f64,
> >> > +                        X86Fnmadd, loadf64>,
> >> > +                  fma4s_int<0x7B, "vfnmaddsd", sdmem, sse_load_f64,
> >> > +                            int_x86_fma_vfnmadd_sd>;
> >> > +defm VFNMSUBSS4 : fma4s<0x7E, "vfnmsubss", FR32, f32mem, f32,
> >> > +                        X86Fnmsub, loadf32>,
> >> > +                  fma4s_int<0x7E, "vfnmsubss", ssmem, sse_load_f32,
> >> > +                            int_x86_fma_vfnmsub_ss>;
> >> > +defm VFNMSUBSD4 : fma4s<0x7F, "vfnmsubsd", FR64, f64mem, f64,
> >> > +                        X86Fnmsub, loadf64>,
> >> > +                  fma4s_int<0x7F, "vfnmsubsd", sdmem, sse_load_f64,
> >> > +                            int_x86_fma_vfnmsub_sd>;
> >> > +
> >> >  defm VFMADDPS4    : fma4p<0x68, "vfmaddps", X86Fmadd, v4f32, v8f32,
> >> >                            memopv4f32, memopv8f32>;
> >> >  defm VFMADDPD4    : fma4p<0x69, "vfmaddpd", X86Fmadd, v2f64, v4f64,
> >> >                            memopv2f64, memopv4f64>;
> >> > -defm VFMSUBSS4    : fma4s<0x6E, "vfmsubss", ssmem, sse_load_f32,
> >> > -                          int_x86_fma_vfmsub_ss>;
> >> > -defm VFMSUBSD4    : fma4s<0x6F, "vfmsubsd", sdmem, sse_load_f64,
> >> > -                          int_x86_fma_vfmsub_sd>;
> >> >  defm VFMSUBPS4    : fma4p<0x6C, "vfmsubps", X86Fmsub, v4f32, v8f32,
> >> >                            memopv4f32, memopv8f32>;
> >> >  defm VFMSUBPD4    : fma4p<0x6D, "vfmsubpd", X86Fmsub, v2f64, v4f64,
> >> >                            memopv2f64, memopv4f64>;
> >> > -defm VFNMADDSS4   : fma4s<0x7A, "vfnmaddss", ssmem, sse_load_f32,
> >> > -                          int_x86_fma_vfnmadd_ss>;
> >> > -defm VFNMADDSD4   : fma4s<0x7B, "vfnmaddsd", sdmem, sse_load_f64,
> >> > -                          int_x86_fma_vfnmadd_sd>;
> >> >  defm VFNMADDPS4   : fma4p<0x78, "vfnmaddps", X86Fnmadd, v4f32, v8f32,
> >> >                            memopv4f32, memopv8f32>;
> >> >  defm VFNMADDPD4   : fma4p<0x79, "vfnmaddpd", X86Fnmadd, v2f64, v4f64,
> >> >                            memopv2f64, memopv4f64>;
> >> > -defm VFNMSUBSS4   : fma4s<0x7E, "vfnmsubss", ssmem, sse_load_f32,
> >> > -                          int_x86_fma_vfnmsub_ss>;
> >> > -defm VFNMSUBSD4   : fma4s<0x7F, "vfnmsubsd", sdmem, sse_load_f64,
> >> > -                          int_x86_fma_vfnmsub_sd>;
> >> >  defm VFNMSUBPS4   : fma4p<0x7C, "vfnmsubps", X86Fnmsub, v4f32, v8f32,
> >> >                            memopv4f32, memopv8f32>;
> >> >  defm VFNMSUBPD4   : fma4p<0x7D, "vfnmsubpd", X86Fnmsub, v2f64, v4f64,
> >> >
> >> > Modified: llvm/trunk/test/CodeGen/X86/fma_patterns.ll
> >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fma_patterns.ll?rev=162999&r1=162998&r2=162999&view=diff
> >> > ==============================================================================
> >> > --- llvm/trunk/test/CodeGen/X86/fma_patterns.ll (original)
> >> > +++ llvm/trunk/test/CodeGen/X86/fma_patterns.ll Fri Aug 31 10:40:30 2012
> >> > @@ -1,9 +1,13 @@
> >> >  ; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=core-avx2 -mattr=avx2,+fma -fp-contract=fast | FileCheck %s
> >> >  ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=bdver2 -mattr=-fma4 -fp-contract=fast | FileCheck %s
> >> > +; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=bdver1 -fp-contract=fast | FileCheck %s --check-prefix=CHECK_FMA4
> >> >
> >> >  ; CHECK: test_x86_fmadd_ps
> >> >  ; CHECK: vfmadd213ps     %xmm2, %xmm0, %xmm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fmadd_ps
> >> > +; CHECK_FMA4: vfmaddps     %xmm2, %xmm1, %xmm0, %xmm0
> >> > +; CHECK_FMA4: ret
> >> >  define <4 x float> @test_x86_fmadd_ps(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2) {
> >> >    %x = fmul <4 x float> %a0, %a1
> >> >    %res = fadd <4 x float> %x, %a2
> >> > @@ -13,6 +17,9 @@
> >> >  ; CHECK: test_x86_fmsub_ps
> >> >  ; CHECK: fmsub213ps     %xmm2, %xmm0, %xmm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fmsub_ps
> >> > +; CHECK_FMA4: vfmsubps     %xmm2, %xmm1, %xmm0, %xmm0
> >> > +; CHECK_FMA4: ret
> >> >  define <4 x float> @test_x86_fmsub_ps(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2) {
> >> >    %x = fmul <4 x float> %a0, %a1
> >> >    %res = fsub <4 x float> %x, %a2
> >> > @@ -22,6 +29,9 @@
> >> >  ; CHECK: test_x86_fnmadd_ps
> >> >  ; CHECK: fnmadd213ps     %xmm2, %xmm0, %xmm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fnmadd_ps
> >> > +; CHECK_FMA4: vfnmaddps     %xmm2, %xmm1, %xmm0, %xmm0
> >> > +; CHECK_FMA4: ret
> >> >  define <4 x float> @test_x86_fnmadd_ps(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2) {
> >> >    %x = fmul <4 x float> %a0, %a1
> >> >    %res = fsub <4 x float> %a2, %x
> >> > @@ -31,6 +41,9 @@
> >> >  ; CHECK: test_x86_fnmsub_ps
> >> >  ; CHECK: fnmsub213ps     %xmm2, %xmm0, %xmm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fnmsub_ps
> >> > +; CHECK_FMA4: fnmsubps     %xmm2, %xmm1, %xmm0, %xmm0
> >> > +; CHECK_FMA4: ret
> >> >  define <4 x float> @test_x86_fnmsub_ps(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2) {
> >> >    %x = fmul <4 x float> %a0, %a1
> >> >    %y = fsub <4 x float> <float -0.000000e+00, float -0.000000e+00, float -0.000000e+00, float -0.000000e+00>, %x
> >> > @@ -41,6 +54,9 @@
> >> >  ; CHECK: test_x86_fmadd_ps_y
> >> >  ; CHECK: vfmadd213ps     %ymm2, %ymm0, %ymm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fmadd_ps_y
> >> > +; CHECK_FMA4: vfmaddps     %ymm2, %ymm1, %ymm0, %ymm0
> >> > +; CHECK_FMA4: ret
> >> >  define <8 x float> @test_x86_fmadd_ps_y(<8 x float> %a0, <8 x float> %a1, <8 x float> %a2) {
> >> >    %x = fmul <8 x float> %a0, %a1
> >> >    %res = fadd <8 x float> %x, %a2
> >> > @@ -50,6 +66,9 @@
> >> >  ; CHECK: test_x86_fmsub_ps_y
> >> >  ; CHECK: vfmsub213ps     %ymm2, %ymm0, %ymm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fmsub_ps_y
> >> > +; CHECK_FMA4: vfmsubps     %ymm2, %ymm1, %ymm0, %ymm0
> >> > +; CHECK_FMA4: ret
> >> >  define <8 x float> @test_x86_fmsub_ps_y(<8 x float> %a0, <8 x float> %a1, <8 x float> %a2) {
> >> >    %x = fmul <8 x float> %a0, %a1
> >> >    %res = fsub <8 x float> %x, %a2
> >> > @@ -59,6 +78,9 @@
> >> >  ; CHECK: test_x86_fnmadd_ps_y
> >> >  ; CHECK: vfnmadd213ps     %ymm2, %ymm0, %ymm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fnmadd_ps_y
> >> > +; CHECK_FMA4: vfnmaddps     %ymm2, %ymm1, %ymm0, %ymm0
> >> > +; CHECK_FMA4: ret
> >> >  define <8 x float> @test_x86_fnmadd_ps_y(<8 x float> %a0, <8 x float> %a1, <8 x float> %a2) {
> >> >    %x = fmul <8 x float> %a0, %a1
> >> >    %res = fsub <8 x float> %a2, %x
> >> > @@ -78,6 +100,9 @@
> >> >  ; CHECK: test_x86_fmadd_pd_y
> >> >  ; CHECK: vfmadd213pd     %ymm2, %ymm0, %ymm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fmadd_pd_y
> >> > +; CHECK_FMA4: vfmaddpd     %ymm2, %ymm1, %ymm0, %ymm0
> >> > +; CHECK_FMA4: ret
> >> >  define <4 x double> @test_x86_fmadd_pd_y(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2) {
> >> >    %x = fmul <4 x double> %a0, %a1
> >> >    %res = fadd <4 x double> %x, %a2
> >> > @@ -87,6 +112,9 @@
> >> >  ; CHECK: test_x86_fmsub_pd_y
> >> >  ; CHECK: vfmsub213pd     %ymm2, %ymm0, %ymm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fmsub_pd_y
> >> > +; CHECK_FMA4: vfmsubpd     %ymm2, %ymm1, %ymm0, %ymm0
> >> > +; CHECK_FMA4: ret
> >> >  define <4 x double> @test_x86_fmsub_pd_y(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2) {
> >> >    %x = fmul <4 x double> %a0, %a1
> >> >    %res = fsub <4 x double> %x, %a2
> >> > @@ -96,6 +124,9 @@
> >> >  ; CHECK: test_x86_fmsub_pd
> >> >  ; CHECK: vfmsub213pd     %xmm2, %xmm0, %xmm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fmsub_pd
> >> > +; CHECK_FMA4: vfmsubpd     %xmm2, %xmm1, %xmm0, %xmm0
> >> > +; CHECK_FMA4: ret
> >> >  define <2 x double> @test_x86_fmsub_pd(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2) {
> >> >    %x = fmul <2 x double> %a0, %a1
> >> >    %res = fsub <2 x double> %x, %a2
> >> > @@ -105,6 +136,9 @@
> >> >  ; CHECK: test_x86_fnmadd_ss
> >> >  ; CHECK: vfnmadd213ss    %xmm2, %xmm0, %xmm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fnmadd_ss
> >> > +; CHECK_FMA4: vfnmaddss    %xmm2, %xmm1, %xmm0, %xmm0
> >> > +; CHECK_FMA4: ret
> >> >  define float @test_x86_fnmadd_ss(float %a0, float %a1, float %a2) {
> >> >    %x = fmul float %a0, %a1
> >> >    %res = fsub float %a2, %x
> >> > @@ -114,6 +148,9 @@
> >> >  ; CHECK: test_x86_fnmadd_sd
> >> >  ; CHECK: vfnmadd213sd     %xmm2, %xmm0, %xmm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fnmadd_sd
> >> > +; CHECK_FMA4: vfnmaddsd     %xmm2, %xmm1, %xmm0, %xmm0
> >> > +; CHECK_FMA4: ret
> >> >  define double @test_x86_fnmadd_sd(double %a0, double %a1, double %a2) {
> >> >    %x = fmul double %a0, %a1
> >> >    %res = fsub double %a2, %x
> >> > @@ -123,6 +160,9 @@
> >> >  ; CHECK: test_x86_fmsub_sd
> >> >  ; CHECK: vfmsub213sd     %xmm2, %xmm0, %xmm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fmsub_sd
> >> > +; CHECK_FMA4: vfmsubsd     %xmm2, %xmm1, %xmm0, %xmm0
> >> > +; CHECK_FMA4: ret
> >> >  define double @test_x86_fmsub_sd(double %a0, double %a1, double %a2) {
> >> >    %x = fmul double %a0, %a1
> >> >    %res = fsub double %x, %a2
> >> > @@ -132,6 +172,9 @@
> >> >  ; CHECK: test_x86_fnmsub_ss
> >> >  ; CHECK: vfnmsub213ss     %xmm2, %xmm0, %xmm1
> >> >  ; CHECK: ret
> >> > +; CHECK_FMA4: test_x86_fnmsub_ss
> >> > +; CHECK_FMA4: vfnmsubss     %xmm2, %xmm1, %xmm0, %xmm0
> >> > +; CHECK_FMA4: ret
> >> >  define float @test_x86_fnmsub_ss(float %a0, float %a1, float %a2) {
> >> >    %x = fsub float -0.000000e+00, %a0
> >> >    %y = fmul float %x, %a1
> >> >
> >> > Modified: llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp
> >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp?rev=162999&r1=162998&r2=162999&view=diff
> >> > ==============================================================================
> >> > --- llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp (original)
> >> > +++ llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp Fri Aug 31 10:40:30 2012
> >> > @@ -1145,6 +1145,8 @@
> >> >    // register IDs in 8-bit immediates nowadays.
> >> >    ENCODING("VR256",           ENCODING_IB)
> >> >    ENCODING("VR128",           ENCODING_IB)
> >> > +  ENCODING("FR32",            ENCODING_IB)
> >> > +  ENCODING("FR64",            ENCODING_IB)
> >> >    errs() << "Unhandled immediate encoding " << s << "\n";
> >> >    llvm_unreachable("Unhandled immediate encoding");
> >> >  }
> >> >
> >> >
> >> > _______________________________________________
> >> > llvm-commits mailing list
> >> > llvm-commits at cs.uiuc.edu
> >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits