[PATCH] D19825: Power9 - Add exploitation of vector load and store that do not require swaps

Nemanja Ivanovic via llvm-commits llvm-commits at lists.llvm.org
Wed May 4 09:20:39 PDT 2016


nemanjai added inline comments.

================
Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:10527
@@ -10526,2 +10526,3 @@
     // For little endian, VSX stores require generating xxswapd/lxvd2x.
+    // Not needed on P9 since we have a load that lines things up correctly.
     EVT VT = N->getOperand(1).getValueType();
----------------
echristo wrote:
> This comment and the isISA3_0 don't quite match up. At least I'm assuming that there may be more isa3.0 processors other than power9? If not, then why the feature :)
> 
> It also seems like this could be factored in a different way since you replicate it a bunch of times.
I'll change the comment to:
  // Not needed on CPUs that implement ISA 3.0 since we have a load that lines things up correctly.

To avoid replicating the check, I'll define a local variable as follows and use it in the conditions:
  bool NeedsSwapsForVSXMemOps = Subtarget.hasVSX() && Subtarget.isLittleEndian() && !Subtarget.isISA3_0();
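
For illustration only, here is a minimal sketch of that predicate that compiles outside of LLVM. MockSubtarget and needsSwapsForVSXMemOps are hypothetical names, not the real PowerPC subtarget class or anything in the patch; only the three query names are taken from the line above.

```cpp
// Hypothetical stand-in for the PowerPC subtarget; NOT the LLVM class.
struct MockSubtarget {
  bool VSX;
  bool LittleEndian;
  bool ISA3_0;
  bool hasVSX() const { return VSX; }
  bool isLittleEndian() const { return LittleEndian; }
  bool isISA3_0() const { return ISA3_0; }
};

// Swaps are needed only on little-endian VSX targets that do not
// implement ISA 3.0 (hypothetical helper wrapping the proposed condition).
bool needsSwapsForVSXMemOps(const MockSubtarget &Subtarget) {
  return Subtarget.hasVSX() && Subtarget.isLittleEndian() &&
         !Subtarget.isISA3_0();
}
```

With this model, a little-endian VSX target without ISA 3.0 is the only combination that still needs the swaps.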

================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:2159
@@ +2158,3 @@
+
+  let AddedComplexity = 500 in {
+    def : Pat<(v2f64 (load xoaddr:$src)), (LXVX xoaddr:$src)>;
----------------
echristo wrote:
> nemanjai wrote:
> > I'll add to the readme that patterns can be added to emit lxvd2x and friends in cases where we happen to want the elements in the reverse order (i.e. something like the vec_xl use as well as if the load is followed by a vector_shuffle that will reverse the elements).
> Eh? What's with the AddedComplexity here?
I need these patterns to be preferred over the ones that use LXVD2X when we are on a CPU that implements ISA 3.0. AddedComplexity is the hack that we already use to favour VSX instructions over other choices, so I just took it a step further to favour specific VSX instructions.
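
As a rough illustration of why a larger AddedComplexity makes the LXVX patterns win, the sketch below models pattern choice as "pick the applicable pattern with the highest complexity". This is a simplified model, not LLVM's actual matcher. Pattern, selectPattern, and the RequiresISA3_0 flag are hypothetical, and any complexity value other than the 500 from the diff is an assumption made up for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a selection pattern; not an LLVM data structure.
struct Pattern {
  std::string Instr;    // instruction the pattern would emit
  int Complexity;       // base complexity plus any AddedComplexity
  bool RequiresISA3_0;  // predicate guarding the pattern
};

// Return the instruction of the applicable pattern with the highest
// complexity, or an empty string if no pattern applies.
std::string selectPattern(const std::vector<Pattern> &Patterns,
                          bool IsISA3_0) {
  const Pattern *Best = nullptr;
  for (const Pattern &P : Patterns) {
    if (P.RequiresISA3_0 && !IsISA3_0)
      continue;  // predicate not satisfied on this CPU
    if (!Best || P.Complexity > Best->Complexity)
      Best = &P;
  }
  return Best ? Best->Instr : std::string();
}
```

In this model, an LXVX entry with complexity 500 beats a lower-ranked LXVD2X entry on an ISA 3.0 CPU, while the LXVD2X entry is still chosen when the ISA 3.0 predicate does not hold.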


Repository:
  rL LLVM

http://reviews.llvm.org/D19825