[PATCH] D14762: X86-FMA3: Memory folding for scalar loads + FMA3

Tue Nov 17 15:15:19 PST 2015

v_klochkov created this revision.
v_klochkov added a reviewer: DavidKreitzer.
v_klochkov added subscribers: llvm-commits, qcolombet.

Hello,

Please review this patch, which enables memory folding for
sequences like this:

  #include <immintrin.h>
  double mem;
  __m128d func(__m128d a, __m128d b) {
    __m128d m = _mm_load_sd(&mem);
    return _mm_fmadd_sd(a, b, m);
  }

Code without the patch (clang -O3 -S):
  func:                                   # @func
          .cfi_startproc
  # BB#0:                                 # %entry
          movsd   mem(%rip), %xmm2        # xmm2 = mem[0],zero
          vfmadd213sd     %xmm2, %xmm1, %xmm0
          retq

Code with the patch:
  func:                                   # @func
          .cfi_startproc
  # BB#0:                                 # %entry
          vfmadd213sd     mem(%rip), %xmm1, %xmm0
          retq

The load can be folded into the 2nd or 3rd operand of an FMA*_Int instruction.
The newly added test fma-scalar-memfold.ll checks memory folding for both operands.
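
One way to see why the fold is legal in either operand position is a scalar model of the intrinsics' data flow. The sketch below is plain C, not LLVM code: vec2d and the model_* functions are hypothetical stand-ins for __m128d, _mm_load_sd() and _mm_fmadd_sd(), and the multiply-add is written as a * b + c, so it models which lanes are read rather than the single rounding of a real FMA.

```c
/* Hypothetical stand-in for __m128d: lane[0] is the low element. */
typedef struct { double lane[2]; } vec2d;

/* Models _mm_load_sd: low lane comes from memory, upper lane is zeroed. */
static vec2d model_load_sd(const double *p) {
    vec2d r = { { *p, 0.0 } };
    return r;
}

/* Models _mm_fmadd_sd(a, b, c): low lane = a0 * b0 + c0, upper lane copied
   from a. Only lane[0] of b and c is ever read. */
static vec2d model_fmadd_sd(vec2d a, vec2d b, vec2d c) {
    vec2d r = a;
    r.lane[0] = a.lane[0] * b.lane[0] + c.lane[0];
    return r;
}

/* Folded form with the load in the 3rd operand: 8 bytes read from memory. */
static vec2d model_fmadd_sd_mem3(vec2d a, vec2d b, const double *p) {
    vec2d r = a;
    r.lane[0] = a.lane[0] * b.lane[0] + *p;
    return r;
}

/* Folded form with the load in the 2nd operand. */
static vec2d model_fmadd_sd_mem2(vec2d a, const double *p, vec2d c) {
    vec2d r = a;
    r.lane[0] = a.lane[0] * (*p) + c.lane[0];
    return r;
}

/* Returns 1 when both lanes of x and y compare equal. */
static int same(vec2d x, vec2d y) {
    return x.lane[0] == y.lane[0] && x.lane[1] == y.lane[1];
}
```

Because the zeroed upper lane produced by the load is never read by the scalar FMA, the register form and the folded form agree in both lanes, whichever of the two operands carries the loaded value.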

lib/Target/X86/X86InstrFMA.td:
  Removed the redundant register-to-register moves.
  Memory folding does not work when those moves are present.
  // TODO: perhaps the register-to-register moves could simply be stripped in some such cases,
  // but that is a separate optimization/change-set.

lib/Target/X86/X86InstrInfo.cpp:
  Added the FMA*_Int opcodes to the routine
  isNonFoldablePartialRegisterLoad()
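
The idea behind that routine can be sketched roughly as follows. This is a simplified model in plain C, not the LLVM source: the enums, the signature and the name is_non_foldable_partial_load are hypothetical, and only one scalar load and a few user opcodes are represented. It assumes the check rejects folding a narrow scalar load into a wider register operand unless the user is known to read only the low element.

```c
#include <stdbool.h>

/* Hypothetical opcode tags; real LLVM opcodes live in the X86:: namespace. */
enum load_op { LOAD_SCALAR_F64, LOAD_FULL_VECTOR };
enum user_op { USER_PACKED_ADD, USER_SCALAR_ADD_INT, USER_SCALAR_FMA_INT };

/* Returns true when folding the load into the user would be unsafe. */
static bool is_non_foldable_partial_load(enum load_op load,
                                         unsigned reg_bytes,
                                         enum user_op user) {
    /* Not a partial load: either a full-width load, or the register is
       no wider than the 8 bytes the scalar load fetches. */
    if (load != LOAD_SCALAR_F64 || reg_bytes <= 8)
        return false;

    switch (user) {
    case USER_SCALAR_ADD_INT:
    case USER_SCALAR_FMA_INT: /* the kind of entry this patch adds */
        return false;         /* user reads only the low 8 bytes */
    default:
        return true;          /* user would read bytes the load never fetched */
    }
}
```

In this model an 8-byte scalar load feeding a 16-byte register is foldable into a scalar FMA user but not into a packed user, which is the distinction the added opcodes are meant to express.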

test/CodeGen/X86/fma-scalar-memfold.ll:
  New test. Checks that the result of _mm_load_s{s,d}() can be folded into the 2nd or 3rd operand of FMA*_Int.

Thank you,
Slava

http://reviews.llvm.org/D14762

Files:
  llvm/lib/Target/X86/X86InstrFMA.td
  llvm/lib/Target/X86/X86InstrInfo.cpp
  llvm/test/CodeGen/X86/fma-scalar-memfold.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D14762.40441.patch
Type: text/x-patch
Size: 18565 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151117/31d063b5/attachment.bin>