[Mlir-commits] [mlir] [mlir][AMDGPU] Add gfx950 MFMAs to the amdgpu.mfma op (PR #133553)

Daniel Hernandez-Juarez llvmlistbot at llvm.org
Mon Mar 31 01:11:45 PDT 2025


================
@@ -494,19 +496,31 @@ struct SchedBarrierOpLowering : public ConvertOpToLLVMPattern<SchedBarrierOp> {
 /// and LLVM AMDGPU intrinsics convention.
 ///
 /// Specifically:
-/// 1. If `input` is a vector of N bytes, bitcast it to a (N * 8)-bit integer.
-/// 2. If the element type is bfloat16, bitcast it to i16.
+/// 1. If the element type is bfloat16, bitcast it to i16.
+/// 2. If instead we have a more than 64-bit quantity, use a <N / 4 x i32>
+/// instead, which is what the f8f6f4 intrinsics use.
+/// 3. If `input` is a vector of N <= 8 bytes, bitcast it to a (N * 8)-bit
+/// integer.
 static Value convertMFMAVectorOperand(ConversionPatternRewriter &rewriter,
                                       Location loc, Value input) {
   Type inputType = input.getType();
   if (auto vectorType = dyn_cast<VectorType>(inputType)) {
     if (vectorType.getElementType().isBF16())
       return rewriter.create<LLVM::BitcastOp>(
           loc, vectorType.clone(rewriter.getI16Type()), input);
-    if (vectorType.getElementType().isInteger(8)) {
+    if (vectorType.getElementType().isInteger(8) &&
----------------
dhernandez0 wrote:

Couldn't this be a float type? The same applies to the next check. If this only applies to integer types, could we please update the comment to reflect that?
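
As a minimal sketch (not the code in this PR), here is what a width-based check could look like if the intent is to also cover 8-bit float element types such as f8E4M3FN or f8E5M2. The helper name is hypothetical, the cutoffs are taken from the doc comment above, and the snippet assumes the same includes and using-directives as AMDGPUToROCDL.cpp:

  // Hypothetical variant of convertMFMAVectorOperand that keys on the total
  // bit width rather than isInteger(8), so 8-bit float element types are
  // handled the same way. Sketch only; not the code under review.
  static Value convertMFMAVectorOperandSketch(
      ConversionPatternRewriter &rewriter, Location loc, Value input) {
    auto vectorType = dyn_cast<VectorType>(input.getType());
    if (!vectorType)
      return input;

    Type elemType = vectorType.getElementType();
    // Rule 1: bfloat16 elements are bitcast to i16.
    if (elemType.isBF16())
      return rewriter.create<LLVM::BitcastOp>(
          loc, vectorType.clone(rewriter.getI16Type()), input);

    int64_t numBits =
        vectorType.getNumElements() * elemType.getIntOrFloatBitWidth();
    // Rule 2: operands wider than 64 bits become a vector of i32, which is
    // the shape the f8f6f4 intrinsics take.
    if (numBits > 64)
      return rewriter.create<LLVM::BitcastOp>(
          loc, VectorType::get(numBits / 32, rewriter.getI32Type()), input);

    // Rule 3: vectors of at most 8 bytes are bitcast to one (N * 8)-bit
    // integer.
    return rewriter.create<LLVM::BitcastOp>(
        loc, rewriter.getIntegerType(numBits), input);
  }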

https://github.com/llvm/llvm-project/pull/133553

