[PATCH] D38671: lowering shuffle i/f intrinsic - llvm part

Sun Oct 8 01:27:11 PDT 2017

craig.topper added inline comments.

================
Comment at: lib/IR/AutoUpgrade.cpp:1284
+      for (unsigned i = 0; i != NumElts; ++i) {
+        // Base index is the starting element of the lane.
+        Idxs[i] = i ;
----------------
Can you write this more like the code below? I think it's clearer.

```
void decodeVSHUF64x2FamilyMask(MVT VT, unsigned Imm,
                        SmallVectorImpl<int> &ShuffleMask) {
  unsigned NumLanes = VT.getSizeInBits() / 128;
  unsigned NumElementsInLane = 128 / VT.getScalarSizeInBits();
  unsigned ControlBitsMask = NumLanes - 1;
  unsigned NumControlBits  = NumLanes / 2;

  for (unsigned l = 0; l != NumLanes; ++l) {
    unsigned LaneMask = (Imm >> (l * NumControlBits)) & ControlBitsMask;
    // We actually need the other source.
    if (l >= NumLanes / 2)
      LaneMask += NumLanes;
    for (unsigned i = 0; i != NumElementsInLane; ++i)
      ShuffleMask.push_back(LaneMask * NumElementsInLane + i);
  }
}
```
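
To try the lane decode outside of LLVM, here is a minimal standalone sketch of the same logic. It assumes plain integers in place of `MVT` and `std::vector` in place of `SmallVectorImpl`. The name `decodeShufLaneMask` and its parameters are hypothetical and are not part of the patch or of LLVM.

```cpp
#include <cassert>
#include <vector>

// Hypothetical standalone version of the decode quoted above.
// VecBits is the full vector width, ScalarBits the element width.
std::vector<int> decodeShufLaneMask(unsigned VecBits, unsigned ScalarBits,
                                    unsigned Imm) {
  // Assumption: only 256-bit (2 lanes) and 512-bit (4 lanes) vectors.
  assert((VecBits == 256 || VecBits == 512) && "expected 2 or 4 lanes");
  assert(ScalarBits != 0 && 128 % ScalarBits == 0 && "bad element width");

  unsigned NumLanes = VecBits / 128;
  unsigned NumElementsInLane = 128 / ScalarBits;
  unsigned ControlBitsMask = NumLanes - 1; // 1 bit for 2 lanes, 2 for 4
  unsigned NumControlBits = NumLanes / 2;

  std::vector<int> ShuffleMask;
  for (unsigned l = 0; l != NumLanes; ++l) {
    // Select the source lane for destination lane l.
    unsigned LaneMask = (Imm >> (l * NumControlBits)) & ControlBitsMask;
    // The upper half of the result reads from the second source, whose
    // lanes are numbered after those of the first source.
    if (l >= NumLanes / 2)
      LaneMask += NumLanes;
    // Copy the whole 128-bit lane, element by element.
    for (unsigned i = 0; i != NumElementsInLane; ++i)
      ShuffleMask.push_back(static_cast<int>(LaneMask * NumElementsInLane + i));
  }
  return ShuffleMask;
}
```

For a 512-bit vector of 64-bit elements and an immediate of 0x4E, the four 2-bit fields are 2, 3, 0, 1 from the low bits up, so the sketch returns {4, 5, 6, 7, 8, 9, 10, 11}: lanes 2 and 3 of the first source followed by lanes 0 and 1 of the second source.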

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:29084

+  // Attempt to use SHUF128 for masked X86ISD::VPERM2X128
+  // from this paterrn:
----------------
You should do this in combineBitcastForMaskedOp. That function was created to fix up shuffles so that they work with masking.

https://reviews.llvm.org/D38671