[Mlir-commits] [mlir] [mlir][x86vector] AVX Convert/Broadcast F16 to F32 instructions (PR #137917)
Adam Siemieniuk
llvmlistbot at llvm.org
Wed Apr 30 09:29:46 PDT 2025
================
@@ -408,24 +408,27 @@ def DotOp : AVX_LowOp<"dot", [Pure,
}];
}
-
//----------------------------------------------------------------------------//
-// AVX: Convert packed BF16 even-indexed/odd-indexed elements into packed F32
+// AVX: Convert BF16/F16 to F32 and broadcast into packed F32
//----------------------------------------------------------------------------//
-def CvtPackedEvenIndexedBF16ToF32Op : AVX_Op<"cvt.packed.even.indexed.bf16_to_f32", [MemoryEffects<[MemRead]>,
+def BcstToPackedF32Op : AVX_Op<"bcst_to_f32.packed", [MemoryEffects<[MemRead]>,
DeclareOpInterfaceMethods<OneToOneIntrinsicOpInterface>]> {
- let summary = "AVX: Convert packed BF16 even-indexed elements into packed F32 Data.";
+ let summary = "AVX: Broadcasts BF16/F16 into packed F32 Data.";
let description = [{
#### From the Intel Intrinsics Guide:
- Convert packed BF16 (16-bit) floating-point even-indexed elements stored at
- memory locations starting at location `__A` to packed single-precision
- (32-bit) floating-point elements, and store the results in `dst`.
+ Convert scalar BF16 or F16 (16-bit) floating-point element stored at memory locations
+ starting at location `__A` to a single-precision (32-bit) floating-point,
+ broadcast it to packed single-precision (32-bit) floating-point elements,
+ and store the results in `dst`.
Example:
```mlir
- %dst = x86vector.avx.cvt.packed.even.indexed.bf16_to_f32 %a : memref<16xbf16> -> vector<8xf32>
+ %dst = x86vector.avx.bcst_to_f32.packed %a : memref<1xbf16> -> vector<8xf32>
+ ```
+ ```mlir
----------------
adam-smnk wrote:
nit: you can merge the two code blocks
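For illustration, a merged example block might look like the sketch below. This is only an assumption about the intended result: the second code block's contents are truncated in this hunk, so the f16 line and the SSA names are hypothetical.

```mlir
// Hypothetical merged example block; the f16 variant is assumed,
// since the second code block is cut off in this excerpt.
%dst0 = x86vector.avx.bcst_to_f32.packed %a : memref<1xbf16> -> vector<8xf32>
%dst1 = x86vector.avx.bcst_to_f32.packed %b : memref<1xf16> -> vector<8xf32>
```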
https://github.com/llvm/llvm-project/pull/137917