[PATCH] D13269: Improved X86-FMA3 mem-folding & coalescing
Vyacheslav Klochkov via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 1 15:23:08 PDT 2015
v_klochkov updated this revision to Diff 36312.
v_klochkov added a comment.
Ahmed, Quentin,
Thank you for the quick code-review.
I am OK with moving the correctness fix for FMAs into a separate change-set.
It has been removed from this change-set.
I also made some additional changes in getFMA3OpcodeToCommuteOperands()
(renaming and documentation) to make the code simpler.
I would like to land this fix and then work on the correctness problem
that exists for scalar FMA intrinsics.
The simplest way to fix that problem is to add *_Int opcodes.
I am not sure I understood this idea
("adding a subregister for FP32 from the VR128 and use insert_subreg").
If there is a precedent (i.e. some similar scalar SIMD instruction) where that approach is used,
then I can try using it for FMAs.
Thank you,
Slava
http://reviews.llvm.org/D13269
Files:
llvm/lib/Target/X86/X86InstrFMA.td
llvm/lib/Target/X86/X86InstrInfo.cpp
llvm/lib/Target/X86/X86InstrInfo.h
llvm/test/CodeGen/X86/fma-commute-x86.ll
llvm/test/CodeGen/X86/fma_patterns.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D13269.36312.patch
Type: text/x-patch
Size: 44796 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151001/09814c84/attachment.bin>