[PATCH] D62908: [PowerPC] Improve float vector gather codegen
Jinsong Ji via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 30 20:58:07 PDT 2019
jsji requested changes to this revision.
jsji added a comment.
This revision now requires changes to proceed.
My understanding is that the cycle savings come from avoiding the unnecessary SP->DP and then DP->SP conversions,
not from the different merge sequence.
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:2391
dag I32_TO_BE_WORD1 = (COPY_TO_REGCLASS (MTVSRWZ $B), VSRC);
+ dag LD32A = (COPY_TO_REGCLASS (LIWZX xoaddr:$A), VSRC);
+ dag LD32B = (COPY_TO_REGCLASS (LIWZX xoaddr:$B), VSRC);
----------------
Why do these dags belong to `AlignValues`? Why not `MrgFP`?
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:3982
(COPY_TO_REGCLASS $A, VSRC), 0))>;
-
+ def : Pat<(v4f32 (build_vector (f32 (load xoaddr:$A)),
+ (f32 (load xoaddr:$B)),
----------------
What about big endian?
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:3986
+ (f32 (load xoaddr:$D)))),
+ (v4f32 (XXPERMDI (XXMRGHW AlignValues.LD32D, AlignValues.LD32C),
+ (XXMRGHW AlignValues.LD32B, AlignValues.LD32A), 3))>;
----------------
What is the benefit of merging 'AB' and 'CD', instead of the original merge of 'AC' and 'BD' followed by `vmrgew`?
`vmrgew` is a 2-cycle ALU instruction, which should still be better than the 3-cycle `xxpermdi` here.
================
Comment at: llvm/test/CodeGen/PowerPC/float-vector-gather.ll:3
+; RUN: llc -verify-machineinstrs -mcpu=pwr9 -ppc-vsr-nums-as-vr \
+; RUN: -ppc-asm-full-reg-names -mtriple=powerpc64le-unknown-linux-gnu < %s \
+; RUN: | FileCheck %s
----------------
Please add a test for big endian as well.
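A minimal sketch of what such an extra RUN line might look like, assuming the usual big-endian triple `powerpc64-unknown-linux-gnu` and a hypothetical `CHECK-BE` prefix (neither is taken from the patch):

```
; RUN: llc -verify-machineinstrs -mcpu=pwr9 -ppc-vsr-nums-as-vr \
; RUN:   -ppc-asm-full-reg-names -mtriple=powerpc64-unknown-linux-gnu < %s \
; RUN:   | FileCheck %s --check-prefix=CHECK-BE
```

The big-endian expectations would then go on `; CHECK-BE:` lines next to the existing `; CHECK:` lines.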
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D62908/new/
https://reviews.llvm.org/D62908