[PATCH] D13269: Improved X86-FMA3 mem-folding & coalescing

Thu Oct 1 15:23:08 PDT 2015

v_klochkov updated this revision to Diff 36312.
v_klochkov added a comment.

Ahmed, Quentin,
Thank you for the quick code-review.

I am OK with moving the correctness fix for FMAs into a separate change-set,
so it has been removed from this one.

I also made some additional changes in getFMA3OpcodeToCommuteOperands()
(renaming and documentation) to make the code simpler.

I would like to land this change-set first and then work on the correctness problem
that exists for scalar FMA intrinsics.

The simplest way to fix that problem is to add *_Int opcodes.

I am not sure I understood this idea:

  ("adding a subregister for FP32 from the VR128 and use insert_subreg")

If there is a precedent (i.e. some similar scalar SIMD instruction that uses that approach),
then I can try it for FMAs.

Thank you,
Slava

http://reviews.llvm.org/D13269

Files:
  llvm/lib/Target/X86/X86InstrFMA.td
  llvm/lib/Target/X86/X86InstrInfo.cpp
  llvm/lib/Target/X86/X86InstrInfo.h
  llvm/test/CodeGen/X86/fma-commute-x86.ll
  llvm/test/CodeGen/X86/fma_patterns.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D13269.36312.patch
Type: text/x-patch
Size: 44796 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151001/09814c84/attachment.bin>