[PATCH] D19825: Power9 - Add exploitation of vector load and store that do not require swaps
Nemanja Ivanovic via llvm-commits
llvm-commits at lists.llvm.org
Wed May 4 09:20:39 PDT 2016
nemanjai added inline comments.
================
Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:10527
@@ -10526,2 +10526,3 @@
// For little endian, VSX stores require generating xxswapd/lxvd2x.
+ // Not needed on P9 since we have a load that lines things up correctly.
EVT VT = N->getOperand(1).getValueType();
----------------
echristo wrote:
> This comment and the isISA3_0 don't quite match up. At least I'm assuming that there may be more isa3.0 processors other than power9? If not, then why the feature :)
>
> It also seems like this could be factored in a different way since you replicate it a bunch of times.
I'll change the comment to:
  // Not needed on CPUs that implement ISA 3.0 since we have a load that lines things up correctly.
I'll also define a local variable as follows and use it in the conditions:
  bool NeedsSwapsForVSXMemOps = Subtarget.hasVSX() && Subtarget.isLittleEndian() && !Subtarget.isISA3_0();
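
To make the intent of that condition concrete, here is a minimal standalone sketch. It is not the code from the patch: MockSubtarget is a hypothetical stand-in for the real subtarget class, and needsSwapsForVSXMemOps and exampleUsage are helper names invented for illustration. Only the three queries (hasVSX, isLittleEndian, isISA3_0) come from the review.

```cpp
#include <cassert>

// Hypothetical stand-in for the real subtarget class; it models only the
// three queries that appear in the proposed condition.
struct MockSubtarget {
  bool VSX;
  bool LittleEndian;
  bool ISA3_0;
  bool hasVSX() const { return VSX; }
  bool isLittleEndian() const { return LittleEndian; }
  bool isISA3_0() const { return ISA3_0; }
};

// Swaps are needed only on little-endian VSX targets that do not
// implement ISA 3.0. (Helper name is an assumption, not from the patch.)
bool needsSwapsForVSXMemOps(const MockSubtarget &Subtarget) {
  return Subtarget.hasVSX() && Subtarget.isLittleEndian() &&
         !Subtarget.isISA3_0();
}

// Illustrative checks covering the cases discussed in the review.
void exampleUsage() {
  assert(needsSwapsForVSXMemOps({true, true, false}));  // LE VSX, pre-ISA 3.0
  assert(!needsSwapsForVSXMemOps({true, true, true}));  // LE VSX, ISA 3.0
  assert(!needsSwapsForVSXMemOps({true, false, false})); // big endian
}
```

Computing the flag once and reusing it keeps each call site to a single readable test and avoids repeating the three-part condition.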
================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:2159
@@ +2158,3 @@
+
+ let AddedComplexity = 500 in {
+ def : Pat<(v2f64 (load xoaddr:$src)), (LXVX xoaddr:$src)>;
----------------
echristo wrote:
> nemanjai wrote:
> > I'll add to the readme that patterns can be added to emit lxvd2x and friends in cases where we happen to want the elements in the reverse order (i.e. something like the vec_xl use as well as if the load is followed by a vector_shuffle that will reverse the elements).
> Eh? What's with the AddedComplexity here?
I need these patterns to be preferred over the ones that use LXVD2X when we are on a CPU that implements ISA 3.0. AddedComplexity is the hack we already use to favour VSX instructions over other choices, so I just took it a step further to favour specific VSX instructions.
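
As a rough illustration of why a larger AddedComplexity makes one pattern win, the following toy model picks the applicable pattern with the highest complexity. It is a simplified sketch, not the selector's actual implementation: the Pattern struct, selectLoad, and exampleLoadPatterns are hypothetical names, the ISA 3.0 predicate on the LXVX pattern is an assumption, the base pattern complexity is ignored, and 400 for the LXVD2X pattern is an illustrative value. Only the 500 comes from the diff above.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of one instruction-selection pattern.
struct Pattern {
  std::string Instr;   // instruction emitted if this pattern is chosen
  int Complexity;      // stands in for base complexity + AddedComplexity
  bool RequiresISA3_0; // assumed predicate guarding the pattern
};

// Choose the applicable pattern with the highest complexity.
// On a tie, the earlier pattern in the list is kept.
std::string selectLoad(const std::vector<Pattern> &Patterns, bool HasISA3_0) {
  const Pattern *Best = nullptr;
  for (const Pattern &P : Patterns) {
    if (P.RequiresISA3_0 && !HasISA3_0)
      continue; // predicate not satisfied on this CPU
    if (!Best || P.Complexity > Best->Complexity)
      Best = &P;
  }
  return Best ? Best->Instr : "";
}

// Two competing patterns for the same v2f64 load (values illustrative).
std::vector<Pattern> exampleLoadPatterns() {
  return {{"LXVD2X", 400, false}, {"LXVX", 500, true}};
}
```

In this model, selectLoad(exampleLoadPatterns(), true) picks LXVX because 500 outranks 400, while on a CPU without ISA 3.0 the LXVX pattern is filtered out and LXVD2X is chosen.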
Repository:
rL LLVM
http://reviews.llvm.org/D19825