[llvm] [AMDGPU] Add new llvm.amdgcn.wave.shuffle intrinsic (PR #167372)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 20 01:52:38 PST 2025


================
@@ -7280,6 +7280,85 @@ static SDValue lowerLaneOp(const SITargetLowering &TLI, SDNode *N,
   return DAG.getBitcast(VT, UnrolledLaneOp);
 }
 
+static SDValue lowerWaveShuffle(const SITargetLowering &TLI, SDNode *N,
+                                    SelectionDAG &DAG) {
+  EVT VT = N->getValueType(0);
+  unsigned ValSize = VT.getSizeInBits();
+  assert(ValSize == 32);
+  SDLoc SL(N);
+
+  SDValue Value = N->getOperand(1);
+  SDValue Index = N->getOperand(2);
+
+  // ds_bpermute requires index to be multiplied by 4
+  SDValue ShiftAmount = DAG.getShiftAmountConstant(2, MVT::i32, SL);
+  SDValue ShiftedIndex = DAG.getNode(ISD::SHL, SL, Index.getValueType(), Index,
+                                   ShiftAmount);
+
+  // Intrinsics will require i32 to operate on
+  SDValue ValueI32 = Value;
+  if (VT.isFloatingPoint())
----------------
jayfoad wrote:

You could call getBitcast unconditionally here, like you do at the end of this function.
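
To illustrate why the `isFloatingPoint()` check is not needed: `SelectionDAG::getBitcast` returns its operand unchanged when the value already has the requested type, so an unconditional call is harmless for i32 inputs. The following is a minimal standalone sketch of that pattern, not the LLVM code itself. The names `bitcast32`, `toI32`, and `demo` are hypothetical stand-ins invented for this example, and the expected bit pattern assumes IEEE-754 `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for SelectionDAG::getBitcast, restricted to 32-bit
// values: reinterpret the bits of V as type To, or return V untouched when
// it already has that type.
template <typename To, typename From>
To bitcast32(From V) {
  static_assert(sizeof(To) == 4 && sizeof(From) == 4, "32-bit values only");
  if constexpr (std::is_same_v<To, From>) {
    return V; // Same type: nothing to do.
  } else {
    To Out;
    std::memcpy(&Out, &V, sizeof(Out)); // Copy the raw bits across types.
    return Out;
  }
}

// Hypothetical model of the "convert the value to i32" step. There is no
// floating-point check at the call site; the helper handles both cases.
template <typename T>
uint32_t toI32(T Value) {
  return bitcast32<uint32_t>(Value);
}

// Usage: integer and float inputs go through the same unconditional call.
inline void demo() {
  assert(toI32(uint32_t{7}) == 7u);   // Already i32: value passes through.
  assert(toI32(1.0f) == 0x3F800000u); // IEEE-754 bit pattern of 1.0f.
}
```

In the patch itself, the equivalent change would presumably be a single unconditional `DAG.getBitcast(MVT::i32, Value)` in place of the conditional.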

https://github.com/llvm/llvm-project/pull/167372

