[PATCH] D39851: [X86] Add separate intrinsics for scalar FMA4 instructions.

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 25 09:08:18 PST 2017


RKSimon added inline comments.


================
Comment at: lib/Target/X86/X86Subtarget.h:466
   // has equal or better performance on all supported targets.
-  bool hasFMA() const { return HasFMA && !HasFMA4; }
+  bool hasFMA() const { return HasFMA; }
   bool hasFMA4() const { return HasFMA4; }
----------------
This change concerns me: bdver2/bdver3 support both FMA3 and FMA4, but FMA3 is implemented via a microcoding hack that costs extra cycles, hence the preference for FMA4.
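
To make the concern concrete, here is a minimal standalone sketch of the two predicates from the hunk above. `SubtargetSketch`, `hasFMAOld`, `hasFMANew`, and `checkBothFeatures` are hypothetical names used only for illustration; they are not part of X86Subtarget. It assumes a target that sets both feature bits, as bdver2/bdver3 do.

```cpp
#include <cassert>

// Hypothetical stand-in for the two feature bits in X86Subtarget.
struct SubtargetSketch {
  bool HasFMA;  // FMA3
  bool HasFMA4; // FMA4

  // Predicate before the patch: FMA3 is hidden whenever FMA4 is present.
  bool hasFMAOld() const { return HasFMA && !HasFMA4; }
  // Predicate after the patch: FMA3 is reported regardless of FMA4.
  bool hasFMANew() const { return HasFMA; }
  bool hasFMA4() const { return HasFMA4; }
};

// A target with both bits set (assumed to model bdver2/bdver3).
inline void checkBothFeatures() {
  SubtargetSketch Both{/*HasFMA=*/true, /*HasFMA4=*/true};
  assert(!Both.hasFMAOld()); // old: only FMA4 is visible
  assert(Both.hasFMANew());  // new: FMA3 and FMA4 are both visible
  assert(Both.hasFMA4());
}
```

With the old predicate, anything guarded by hasFMA() is skipped on such a target; with the new one, both predicates are true at once, so whatever picks between the two encodings would need to keep choosing FMA4 there.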


================
Comment at: test/CodeGen/X86/fma4-fneg-combine.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu -mattr=+fma4  | FileCheck %s
+
----------------
Add -mattr=+fma4,+fma tests as well?


================
Comment at: test/CodeGen/X86/fma4-intrinsics-x86.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+fma4,-fma -show-mc-encoding | FileCheck %s --check-prefix=CHECK
+
----------------
Add -mattr=+fma4,+fma tests as well?


https://reviews.llvm.org/D39851




