[llvm] [X86] Improve variable 8-bit shifts on AVX512BW (PR #164136)

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Sun Oct 19 05:30:37 PDT 2025


================
@@ -30968,6 +30968,76 @@ static SDValue LowerShift(SDValue Op, const X86Subtarget &Subtarget,
     return DAG.getNode(X86ISD::PACKUS, dl, VT, LoR, HiR);
   }
 
+  if (VT == MVT::v64i8 && Subtarget.canExtendTo512BW()) {
+    // On AVX512BW, we can use variable 16-bit shifts to implement variable
+    // 8-bit shifts. For this, we split the input into two vectors, RLo and RHi.
+    // The i-th lane of RLo contains the (2*i)-th lane of R, and the i-th lane
+    // of RHi contains the (2*i+1)-th lane of R. After shifting, these vectors
+    // can efficiently be merged together using a masked move.
+    MVT ExtVT = MVT::v32i16;
+
+    // When used in a vector shuffle, selects even-index lanes from the first
+    // vector and odd-index lanes from the second vector.
+    SmallVector<int, 64> InterleaveIndices;
+    for (unsigned i = 0; i < 64; ++i) {
+      unsigned offset = (i % 2 == 0) ? 0 : 64;
+      InterleaveIndices.push_back(i + offset);
+    }
+
+    SDValue zero = DAG.getConstant(0, dl, VT);
+    SDValue eight = DAG.getTargetConstant(8, dl, MVT::i8);
+    SDValue RLo, RHi;
+
+    // Isolate lower and upper lanes of Amt by shuffling zeros into AmtLo and
+    // right shifting AmtHi.
+    SDValue AmtLo = DAG.getBitcast(
+        ExtVT, DAG.getVectorShuffle(VT, dl, Amt, zero, InterleaveIndices));
----------------
RKSimon wrote:

Isn't this just an AND?
```suggestion
SDValue AmtLo = DAG.getNode(ISD::AND, dl, ExtVT, DAG.getBitcast(ExtVT, Amt), DAG.getConstant(255, dl, ExtVT));
```
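
To illustrate why the two forms should agree, here is a minimal scalar sketch, not LLVM code. It assumes the x86 little-endian lane layout, where the even i8 lane is the low byte of each i16 lane. The types `V64i8`/`V32i16` and every helper name are hypothetical stand-ins for the SelectionDAG nodes in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar models of the vector types; not LLVM API.
using V64i8 = std::array<uint8_t, 64>;
using V32i16 = std::array<uint16_t, 32>;

// Models a two-input shuffle using the patch's InterleaveIndices:
// even result lanes come from A, odd result lanes come from B.
V64i8 interleaveShuffle(const V64i8 &A, const V64i8 &B) {
  V64i8 Out{};
  for (unsigned i = 0; i < 64; ++i) {
    unsigned Idx = (i % 2 == 0) ? i : i + 64; // indices >= 64 select from B
    Out[i] = Idx < 64 ? A[Idx] : B[Idx - 64];
  }
  return Out;
}

// Models a v64i8 -> v32i16 bitcast with little-endian byte order.
V32i16 bitcastToV32i16(const V64i8 &V) {
  V32i16 Out{};
  for (unsigned i = 0; i < 32; ++i)
    Out[i] = static_cast<uint16_t>(V[2 * i] | (V[2 * i + 1] << 8));
  return Out;
}

// Form in the patch: shuffle zeros into the odd byte lanes, then bitcast.
V32i16 amtLoViaShuffle(const V64i8 &Amt) {
  V64i8 Zero{};
  return bitcastToV32i16(interleaveShuffle(Amt, Zero));
}

// Suggested form: bitcast first, then AND each 16-bit lane with 255.
V32i16 amtLoViaAnd(const V64i8 &Amt) {
  V32i16 Out = bitcastToV32i16(Amt);
  for (uint16_t &Lane : Out)
    Lane &= 255;
  return Out;
}

// Compares both forms on an arbitrary byte pattern.
bool formsAgree() {
  V64i8 Amt{};
  for (unsigned i = 0; i < 64; ++i)
    Amt[i] = static_cast<uint8_t>(i * 37 + 11);
  return amtLoViaShuffle(Amt) == amtLoViaAnd(Amt);
}
```

Both helpers zero the high byte of every 16-bit lane and leave the low byte intact, so `formsAgree()` is expected to hold for any input; the AND form simply avoids building a 64-entry shuffle mask.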

https://github.com/llvm/llvm-project/pull/164136

