[PATCH] D34160: [Power9] Exploit vinserth instruction

Nemanja Ivanovic via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 23 06:17:49 PDT 2017


nemanjai added inline comments.


================
Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:7910
+      // to our expected order
+      if (MaskOneElt == VINSERTHSrcElem &&
+          (Mask & MaskOtherElts) == (TargetOrder & MaskOtherElts)) {
----------------
gyiu wrote:
> nemanjai wrote:
> > So we don't want to shift if we're within the same register? Is there a specific reason for this?
> I believe we have to add a xxlor to another VR if we want to shift the vector since we can't shift if both operands of the vector shuffle are the same vector.  Adding another two cycles to VECSHL+VECINSERT seems diminish its value versus load+vperm.
I don't really see why. Assume that you have something like this:

```
vector unsigned short test(vector unsigned short a) {
  a[5] = a[2];
}
```
I don't see why we can't codegen something like this for it:
```
vsldoi 3, 2, 2, 4
vinserth 2, 3, 4
```
Forgive me if I didn't work out the immediates exactly correctly, but the point is the [lack of] need for the XXLOR. Of course, this does use an extra register, but so does the alternative (vperm).


================
Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:7921
+      if ((Mask & MaskOtherElts) == (TargetOrder & MaskOtherElts)) {
+        // we only need the last 3 bits for the number of shifts
+        ShiftElts = IsLE ? LittleEndianShifts[MaskOneElt & 0x7]
----------------
gyiu wrote:
> nemanjai wrote:
> > Isn't this already guaranteed to only have the low order 3 bits set?
> MaskOneElt could actually be >= 8, since the mask is in range [0, 15].
Ah, right. I didn't think of that. Sorry about that.


Repository:
  rL LLVM

https://reviews.llvm.org/D34160





More information about the llvm-commits mailing list