[PATCH] D50004: [PowerPC] Emit xscpsgndp instead of xxlor when copying floating point scalar registers for P9

Nemanja Ivanovic via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 8 08:02:21 PDT 2018


nemanjai added a comment.

In https://reviews.llvm.org/D50004#1187625, @jsji wrote:

>


<snip>

> Not sure how much impact this will have, but maybe we should still consider using xxlor for destructive instructions?
> 
> eg:
>  In Power ISA 3.0 B,  2.1.5 Destructive Operation Operand Preservation
>  "The set of instructions listed below, when immediately preceded by the xxlor XT,XC,XC instruction in a sequence similar to the above example, **will provide optimal performance.**"

I don't think there are any conditions under which we will emit an `xxlor` that will be eligible for this. That may be a good candidate to peephole and/or fuse together.
Example:

  vector double test(double a, vector double b, vector double c, vector double *s) {
    vector double n = (vector double)a;
    *s = n + c * b;
    return n;
  }

This is about as close as you can get, but it produces the following on Power9:

  xxspltd vs0, vs1, 0
  xxlor vs1, vs0, vs0
  xvmaddadp vs1, vs35, vs34
  xxlor vs34, vs0, vs0
  stxv vs1, 0(r9)

Ultimately, the target of the copy will always be used as an input to the destructive operation. If we want to exploit this optimization in the HW, we'd have to forward the source of the copy (and eliminate the second copy in this case as well). But if we're consciously transforming the code to exploit this, the instruction we use for the COPY is immaterial (we can always transform it to `XXLOR` at that point).
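
To illustrate that last point only, here is a toy sketch in C of what such a late rewrite might look like. It is not LLVM code: the `insn` struct, the opcode enum and `canonicalize_copies` are all hypothetical names made up for this example, and only one destructive opcode is modeled. It assumes a copy is any `xxlor`/`xscpsgndp` whose two sources are the same register, and it rewrites the copy to `XXLOR` only when the very next instruction is a destructive op on the same target.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy opcodes; this is not LLVM's MachineInstr model. */
enum opcode { OP_XXLOR, OP_XSCPSGNDP, OP_XVMADDADP, OP_OTHER };

struct insn {
    enum opcode op;
    int dst;    /* target register; also an input for destructive ops */
    int src_a;
    int src_b;
};

/* Assumption: a register copy is an op whose two sources are identical. */
static bool is_copy(const struct insn *i) {
    return (i->op == OP_XXLOR || i->op == OP_XSCPSGNDP) &&
           i->src_a == i->src_b;
}

/* Only one destructive op is modeled in this sketch. */
static bool is_destructive(const struct insn *i) {
    return i->op == OP_XVMADDADP;
}

/*
 * Rewrite a non-XXLOR copy to XXLOR when it immediately precedes a
 * destructive op that uses the copy's target as its own target.
 * Returns the number of copies rewritten.
 */
size_t canonicalize_copies(struct insn *code, size_t n) {
    size_t changed = 0;
    for (size_t i = 0; i + 1 < n; ++i) {
        if (is_copy(&code[i]) && code[i].op != OP_XXLOR &&
            is_destructive(&code[i + 1]) &&
            code[i].dst == code[i + 1].dst) {
            code[i].op = OP_XXLOR;
            ++changed;
        }
    }
    return changed;
}
```

The point is just that the choice of copy opcode at emission time does not constrain a later pass like this; whether the resulting sequence actually hits the fast path described in the ISA is a separate question.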


https://reviews.llvm.org/D50004




