[PATCH] D72138: [ARM] Fill in FP16 FMA patterns
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 3 07:54:08 PST 2020
dmgreen marked an inline comment as done.
dmgreen added inline comments.
================
Comment at: llvm/lib/Target/ARM/ARMInstrVFP.td:2215
Requires<[HasVFP4]>;
+def : Pat<(f16 (fma (fneg HPR:$Sn), HPR:$Sm, (fneg HPR:$Sdin))),
+ (VFNMAH HPR:$Sdin, HPR:$Sn, HPR:$Sm)>,
----------------
samparker wrote:
> Do you think it would be worth doing some canonicalisation somewhere? With two fnegs I'm assuming this pattern is more expensive than the others if not caught.
I gave this a go, but the opposite is sometimes true. AMDGPU has instructions such as `v_fma_f32 v0, -v0, v1, -v1`, where each operand can be negated for free. If we negated the whole thing instead, we would need to add reverse patterns. RISCV also seems to have some cases with fnmadd.s that are made worse.
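For reference, the two forms under discussion can be compared numerically. This is only a minimal sketch using `std::fma` on `float` rather than f16, and the helper names (`fma_two_negs`, `neg_of_fma`, `check_equivalence`) are made up for illustration; they are not part of the patch. It assumes the default round-to-nearest mode, and it compares with `==`, so it does not distinguish +0 from -0 (the sign of an exactly-zero result can differ between the two forms).

```cpp
#include <cassert>
#include <cmath>

// Form matched by the new pattern: fma(-Sn, Sm, -Sdin), i.e. two fnegs.
float fma_two_negs(float sn, float sm, float sdin) {
    return std::fma(-sn, sm, -sdin);
}

// Hypothetical canonical form: a single negation of the whole fma.
float neg_of_fma(float sn, float sm, float sdin) {
    return -std::fma(sn, sm, sdin);
}

// Checks that both forms give the same value for one set of inputs.
// Assumes round-to-nearest; == treats +0 and -0 as equal.
void check_equivalence(float sn, float sm, float sdin) {
    assert(fma_two_negs(sn, sm, sdin) == neg_of_fma(sn, sm, sdin));
}
```

Which form is cheaper depends on the target: one with free per-operand negation prefers the first form, while one with a single negated-fma instruction can match either.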
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D72138/new/
https://reviews.llvm.org/D72138