[PATCH] D32596: [DAGCombine] Transform (fmul X, -2.0) --> (fneg (fadd X, X)).
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 2 15:21:21 PDT 2017
spatel added inline comments.
================
Comment at: test/CodeGen/X86/fmul-combines.ll:21-22
+; CHECK-LABEL: fmulneg2_v4f32:
+; CHECK: addps %xmm0, %xmm0
+; CHECK: xorps
+; CHECK-NEXT: retq
----------------
What we don't see in this check, but you probably know or can infer: x86 doesn't have an 'fneg' op for SSE/AVX (they ran out of transistors?).
So we load the 128-bit sign-bit mask from memory:
xorps LCPI1_0(%rip), %xmm0
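To make the xorps trick concrete, here is a minimal scalar sketch of negation by flipping the sign bit. It models a single f32 lane in Python; the helper names (f32_bits, bits_f32, fneg_via_xor) are made up for illustration and are not LLVM or x86 APIs. The real instruction applies the same mask to all four lanes at once.

```python
import struct

# Mask with only the IEEE-754 single-precision sign bit set.
# The constant-pool entry used by xorps holds this value in each 32-bit lane.
SIGN_MASK_F32 = 0x80000000

def f32_bits(x):
    """Return the 32-bit pattern of x rounded to single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b):
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def fneg_via_xor(x):
    """Negate x by XOR-ing its bit pattern with the sign-bit mask."""
    return bits_f32(f32_bits(x) ^ SIGN_MASK_F32)
```

Because only the sign bit changes, this also maps +0.0 to -0.0, which a subtraction from zero would not do.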
It's also true that the mul version would load a '-2.0' constant from memory, but the add + xor sequence adds an extra op, and I don't think that's good for any x86 target.
There's one other reason this may not be good: there are actually CPUs (hello, Jaguar!) that have faster FP multiplies than FP adds.
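For reference, the concern here is cost, not correctness: doubling is exact in IEEE-754, so the two forms should agree bit for bit on non-NaN inputs. A rough sketch of that check, using Python doubles as a stand-in for a single FP lane (the function names are hypothetical and this is not the DAGCombine code):

```python
import struct

def f64_bits(x):
    """Return the 64-bit pattern of a Python float (IEEE-754 double)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def fmul_neg2(x):
    """Original form: one multiply by the constant -2.0."""
    return x * -2.0

def fneg_fadd(x):
    """Transformed form: one add followed by one negate (two ops)."""
    return -(x + x)

def same_result(x):
    """True if both forms give the identical bit pattern for x."""
    return f64_bits(fmul_neg2(x)) == f64_bits(fneg_fadd(x))
```

NaN inputs are left out on purpose, since the sign and payload of a NaN result are not something this sketch tries to model.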
https://reviews.llvm.org/D32596