[llvm] [X86] Improve variable 8-bit shifts on AVX512BW (PR #164136)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Sun Oct 19 05:30:37 PDT 2025
================
@@ -30968,6 +30968,76 @@ static SDValue LowerShift(SDValue Op, const X86Subtarget &Subtarget,
return DAG.getNode(X86ISD::PACKUS, dl, VT, LoR, HiR);
}
+ if (VT == MVT::v64i8 && Subtarget.canExtendTo512BW()) {
+ // On AVX512BW, we can use variable 16-bit shifts to implement variable
+ // 8-bit shifts. For this, we split the input into two vectors, RLo and RHi.
+ // The i-th lane of RLo contains the (2*i)-th lane of R, and the i-th lane
+ // of RHi contains the (2*i+1)-th lane of R. After shifting, these vectors
+ // can efficiently be merged together using a masked move.
+ MVT ExtVT = MVT::v32i16;
+
+ // When used in a vectorshuffle, selects even-index lanes from the first
+ // vector and odd index lanes from the second vector.
+ SmallVector<int, 64> InterleaveIndices;
+ for (unsigned i = 0; i < 64; ++i) {
+ unsigned offset = (i % 2 == 0) ? 0 : 64;
+ InterleaveIndices.push_back(i + offset);
+ }
+
+ SDValue zero = DAG.getConstant(0, dl, VT);
+ SDValue eight = DAG.getTargetConstant(8, dl, MVT::i8);
+ SDValue RLo, RHi;
+
+ // Isolate lower and upper lanes of Amt by shuffling zeros into AmtLo and
+ // right shifting AmtHi.
+ SDValue AmtLo = DAG.getBitcast(
+ ExtVT, DAG.getVectorShuffle(VT, dl, Amt, zero, InterleaveIndices));
----------------
RKSimon wrote:
Isn't this just an AND?
```suggestion
SDValue AmtLo = DAG.getNode(ISD::AND, dl, ExtVT, DAG.getBitcast(ExtVT, Amt), DAG.getConstant(255, dl, ExtVT));
```
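For reference, the equivalence can be checked with a small scalar model. This is only a sketch, not LLVM code: the type aliases and helper names (`bitcastToV32i16`, `amtLoViaShuffle`, `amtLoViaAnd`, `shuffleMatchesAnd`) are hypothetical, and it assumes the little-endian lane layout x86 uses, where byte 2*i is the low half of 16-bit lane i.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V64i8 = std::array<uint8_t, 64>;
using V32i16 = std::array<uint16_t, 32>;

// Models a v64i8 -> v32i16 bitcast on a little-endian target:
// byte 2*i becomes the low half of lane i, byte 2*i+1 the high half.
static V32i16 bitcastToV32i16(const V64i8 &V) {
  V32i16 Out{};
  for (unsigned i = 0; i < 32; ++i)
    Out[i] = static_cast<uint16_t>(V[2 * i] | (V[2 * i + 1] << 8));
  return Out;
}

// Models the patch: shuffle Amt with a zero vector using the interleave
// mask (even bytes from Amt, odd bytes from zero), then bitcast.
static V32i16 amtLoViaShuffle(const V64i8 &Amt) {
  V64i8 Zero{};
  V64i8 Shuffled{};
  for (unsigned i = 0; i < 64; ++i) {
    unsigned Idx = i + ((i % 2 == 0) ? 0 : 64);
    // Mask indices below 64 read the first operand, the rest the second.
    Shuffled[i] = Idx < 64 ? Amt[Idx] : Zero[Idx - 64];
  }
  return bitcastToV32i16(Shuffled);
}

// Models the suggestion: bitcast first, then AND every 16-bit lane with 255.
static V32i16 amtLoViaAnd(const V64i8 &Amt) {
  V32i16 Out = bitcastToV32i16(Amt);
  for (uint16_t &Lane : Out)
    Lane &= 0x00FF;
  return Out;
}

// True when both formulations produce the same 32 lanes for this input.
static bool shuffleMatchesAnd(const V64i8 &Amt) {
  return amtLoViaShuffle(Amt) == amtLoViaAnd(Amt);
}

// Checks one arbitrary byte pattern; any other pattern behaves the same way.
static void selfCheck() {
  V64i8 Amt{};
  for (unsigned i = 0; i < 64; ++i)
    Amt[i] = static_cast<uint8_t>(i * 37 + 11);
  assert(shuffleMatchesAnd(Amt));
}
```

Both paths clear the high byte of every 16-bit lane and leave the low byte untouched, so the AND form should give the same AmtLo without needing the shuffle mask.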
https://github.com/llvm/llvm-project/pull/164136