[PATCH] D50004: [PowerPC] Emit xscpsgndp instead of xxlor when copying floating point scalar registers for P9
Nemanja Ivanovic via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 10 05:57:46 PDT 2018
nemanjai added a comment.
In https://reviews.llvm.org/D50004#1194086, @jsji wrote:
>
<snip>
> diff of assembly before change and after change:
>
> $ diff -Naur before.s after.s
> --- before.s 2018-08-09 13:52:24.785246846 -0400
> +++ after.s 2018-08-09 13:59:38.815493708 -0400
> @@ -9,7 +9,7 @@
> # %bb.0: # %entry
> lfd f0, 304(r1)
> lxsd v2, 312(r1)
> - xxlor v3, f1, f1
> + xscpsgndp v3, f1, f1
> xsmaddadp v3, v2, f0
> xsadddp f0, v3, f2
> xsmaddadp f0, v3, v2
>
This is exactly what I was referring to... The situation you describe is not analogous to the situation described in the ISA. According to the ISA, the sequence that will be optimized is:
  xxlor XC, XT, XT
  xxperm XT, XA, XB
So in this case, the only way we would get the optimized behaviour would be if the "pre-patch" code sequence was:
  xxlor v3, f1, f1
  xsmaddadp f1, v2, f0
And I'm fairly certain that without source forwarding of the copy, we can never produce such code (but of course, I could be wrong).
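
As a rough illustration of the distinction drawn above, the sketch below checks whether two adjacent instructions have the "copy, then overwrite the copy's source" shape. It is not part of the patch: the helper names `parse` and `is_copy_then_clobber` are hypothetical, registers are compared by their textual names only (VSX register aliasing is ignored), and it assumes the first operand is the destination, as it is for the instructions quoted in this thread. It says nothing about what the hardware actually does with a matching pair.

```python
def parse(line):
    """Split an assembly line into (mnemonic, [operands])."""
    parts = line.split(None, 1)
    if not parts:
        return "", []
    rest = parts[1] if len(parts) > 1 else ""
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    return parts[0], operands


def is_copy_then_clobber(copy_line, next_line):
    """True if copy_line is 'xxlor dst, src, src' and next_line writes src.

    Assumption: the first operand of next_line is its destination register.
    """
    copy_op, copy_args = parse(copy_line)
    _, next_args = parse(next_line)
    # A register copy via xxlor repeats the source: xxlor dst, src, src
    if copy_op != "xxlor" or len(copy_args) != 3:
        return False
    if copy_args[1] != copy_args[2]:
        return False
    if not next_args:
        return False
    # The following instruction must overwrite the register the copy read.
    return next_args[0] == copy_args[1]


# The shape given in the ISA matches:
print(is_copy_then_clobber("xxlor XC, XT, XT", "xxperm XT, XA, XB"))
# The sequence from the quoted diff does not (it writes the copy's target):
print(is_copy_then_clobber("xxlor v3, f1, f1", "xsmaddadp v3, v2, f0"))
```

With this check, the hypothetical pre-patch pair `xxlor v3, f1, f1` / `xsmaddadp f1, v2, f0` matches, while the pair actually emitted in the quoted diff does not.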
https://reviews.llvm.org/D50004