[PATCH] D50004: [PowerPC] Emit xscpsgndp instead of xxlor when copying floating point scalar registers for P9

Nemanja Ivanovic via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 10 05:57:46 PDT 2018


nemanjai added a comment.

In https://reviews.llvm.org/D50004#1194086, @jsji wrote:

>


<snip>

> diff of assembly before change and after change:
> 
>   $ diff -Naur before.s after.s 
>   --- before.s    2018-08-09 13:52:24.785246846 -0400
>   +++ after.s     2018-08-09 13:59:38.815493708 -0400
>   @@ -9,7 +9,7 @@
>    # %bb.0:                                # %entry
>           lfd f0, 304(r1)
>           lxsd v2, 312(r1)
>   -       xxlor v3, f1, f1
>   +      xscpsgndp v3, f1, f1
>           xsmaddadp v3, v2, f0
>           xsadddp f0, v3, f2
>           xsmaddadp f0, v3, v2
>   

This is exactly what I was referring to. The sequence in your diff is not analogous to the one described in the ISA. According to the ISA, the sequence that will be optimized is:

  xxlor XC, XT, XT
  xxperm XT, XA, XB

So in this case, we would only get the optimized behaviour if the instruction following the copy overwrote the copy's source register, i.e. if the "pre-patch" code sequence was:

  xxlor v3, f1, f1
  xsmaddadp f1, v2, f0

And I'm fairly certain that without source forwarding of the copy, we can never produce such code (but of course, I could be wrong).
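
To make the distinction concrete, here is a rough sketch of the check I have in mind. It is not code from the patch or from LLVM; the helper names (parse, matches_isa_pattern) are made up for illustration, and it compares register names purely as text, so it ignores the aliasing between the f/v names and the underlying VSX registers. It only asks: is the first instruction an xxlor copy with identical sources, and does the next instruction write the register the copy read from?

```python
def parse(line):
    """Split an assembly line 'op a, b, c' into (mnemonic, [operands])."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [tok.strip() for tok in rest.split(",") if tok.strip()]
    return mnemonic, operands


def matches_isa_pattern(copy_line, next_line):
    """Hypothetical check for the 'xxlor XC, XT, XT; op XT, ...' shape.

    Assumption: register names are compared textually; no VSX aliasing.
    """
    op, ops = parse(copy_line)
    # The copy must be xxlor with both source operands identical.
    if op != "xxlor" or len(ops) != 3 or ops[1] != ops[2]:
        return False
    copy_src = ops[1]
    # The following instruction must overwrite the copy's source register.
    _, next_ops = parse(next_line)
    return bool(next_ops) and next_ops[0] == copy_src
```

With this, the pair from the diff above (xxlor v3, f1, f1 followed by xsmaddadp v3, v2, f0) does not match, because the fused multiply-add writes v3, the copy's destination, rather than f1, its source. The hypothetical pair xxlor v3, f1, f1 followed by xsmaddadp f1, v2, f0 does match.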


https://reviews.llvm.org/D50004




