[PATCH] D46536: [Power9]Legalize and emit code for W vector extract and convert to Quad-Precision

Fri May 11 06:55:16 PDT 2018

lei added inline comments.

================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3165
                       (EXTRACT_SUBREG (XXPERMDI $src, $src, 3), sub_64)))>;
-  }
+
+    // (Un)Signed Word vector extract -> QP
----------------
nemanjai wrote:
> This probably needs `let Predicates = [IsBigEndian, HasP9Vector]` right?
This is within a `Predicates = [HasP9Vector]` section so is not needed here.

================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3168
+    def : Pat<(f128 (sint_to_fp (i32 (extractelt v4i32:$src, 0)))),
+              (f128 (XSCVSDQP (EXTRACT_SUBREG (XVCVSXWDP $src), sub_64)))>;
+    foreach Idx = 1-3 in {
----------------
nemanjai wrote:
> Is this sequence actually correct? We convert a vector of 4 4-byte integers into a vector of 2 8-byte double precision floating point values. Then we treat it as a signed 8-byte integer and convert it to a 16-byte floating point value. Shouldn't the outer instruction be `xscvdpqp`?
> 
> In any case, `vextsw2d -> xscvsdqp` is a much lower latency sequence than this. Why not use that?
I guess there is no need for us to convert to double precision here.  Will update.

================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3176
+      def : Pat<(f128 (uint_to_fp (i32 (extractelt v4i32:$src, Idx)))),
+                (f128 (XSCVUDQP (XXEXTRACTUW $src, !add(Idx, Idx, Idx, Idx))))>;
+    }
----------------
nemanjai wrote:
> Is there no `mul` function in TableGen? i.e. can't we just write `!mul(Idx, 4)`?
There is only !add(a,b,...)

================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3194
               (f128 (XSCVUDQP (COPY_TO_REGCLASS $src, VFRC)))>;
-  }
+
+    // (Un)Signed Word vector extract -> QP
----------------
nemanjai wrote:
> Same note regarding the predicate.
this is within a `Predicates = [HasP9Vector]` code section.

================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3196
+    // (Un)Signed Word vector extract -> QP
+    def : Pat<(f128 (sint_to_fp (i32 (extractelt v4i32:$src, 0)))),
+              (f128 (XSCVSDQP (EXTRACT_SUBREG
----------------
nemanjai wrote:
> To be consistent, I think you should write these as a neat for-loop as you did above. The element would be `Idx` and the splat index would be `!sub(3, Idx)`. Wouldn't that work?
Unfortunately there is no `!sub()`  operator.

================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3207
+              (f128 (XSCVSDQP (EXTRACT_SUBREG (XVCVSXWDP $src), sub_64)))>;
+    def : Pat<(f128 (uint_to_fp (i32 (extractelt v4i32:$src, 0)))),
+              (f128 (XSCVUDQP (XXEXTRACTUW $src, 12)))>;
----------------
nemanjai wrote:
> Same thing here, a for-loop would be nicer and more consistent.
I agree.  I just couldn't find a way to do it with just `!add()`

https://reviews.llvm.org/D46536