[PATCH] D46536: [Power9]Legalize and emit code for W vector extract and convert to Quad-Precision
Lei Huang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri May 11 06:55:16 PDT 2018
lei added inline comments.
================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3165
(EXTRACT_SUBREG (XXPERMDI $src, $src, 3), sub_64)))>;
- }
+
+ // (Un)Signed Word vector extract -> QP
----------------
nemanjai wrote:
> This probably needs `let Predicates = [IsBigEndian, HasP9Vector]` right?
This is within a `Predicates = [HasP9Vector]` section so is not needed here.
================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3168
+ def : Pat<(f128 (sint_to_fp (i32 (extractelt v4i32:$src, 0)))),
+ (f128 (XSCVSDQP (EXTRACT_SUBREG (XVCVSXWDP $src), sub_64)))>;
+ foreach Idx = 1-3 in {
----------------
nemanjai wrote:
> Is this sequence actually correct? We convert a vector of 4 4-byte integers into a vector of 2 8-byte double precision floating point values. Then we treat it as a signed 8-byte integer and convert it to a 16-byte floating point value. Shouldn't the outer instruction be `xscvdpqp`?
>
> In any case, `vextsw2d -> xscvsdqp` is a much lower latency sequence than this. Why not use that?
I guess there is no need for us to convert to double precision here. Will update.
================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3176
+ def : Pat<(f128 (uint_to_fp (i32 (extractelt v4i32:$src, Idx)))),
+ (f128 (XSCVUDQP (XXEXTRACTUW $src, !add(Idx, Idx, Idx, Idx))))>;
+ }
----------------
nemanjai wrote:
> Is there no `mul` function in TableGen? i.e. can't we just write `!mul(Idx, 4)`?
There is only !add(a,b,...)
================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3194
(f128 (XSCVUDQP (COPY_TO_REGCLASS $src, VFRC)))>;
- }
+
+ // (Un)Signed Word vector extract -> QP
----------------
nemanjai wrote:
> Same note regarding the predicate.
this is within a `Predicates = [HasP9Vector]` code section.
================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3196
+ // (Un)Signed Word vector extract -> QP
+ def : Pat<(f128 (sint_to_fp (i32 (extractelt v4i32:$src, 0)))),
+ (f128 (XSCVSDQP (EXTRACT_SUBREG
----------------
nemanjai wrote:
> To be consistent, I think you should write these as a neat for-loop as you did above. The element would be `Idx` and the splat index would be `!sub(3, Idx)`. Wouldn't that work?
Unfortunately there is no `!sub()` operator.
================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:3207
+ (f128 (XSCVSDQP (EXTRACT_SUBREG (XVCVSXWDP $src), sub_64)))>;
+ def : Pat<(f128 (uint_to_fp (i32 (extractelt v4i32:$src, 0)))),
+ (f128 (XSCVUDQP (XXEXTRACTUW $src, 12)))>;
----------------
nemanjai wrote:
> Same thing here, a for-loop would be nicer and more consistent.
I agree. I just couldn't find a way to do it with just `!add()`
https://reviews.llvm.org/D46536
More information about the llvm-commits
mailing list