[Mlir-commits] [mlir] [mlir][amdgpu] Add scaled_ext_packed{8, 16} operations (PR #159830)

Fri Oct 17 08:57:22 PDT 2025

================
@@ -150,10 +150,50 @@ def AMDGPU_ScaledExtPacked816Op
     When the block size is 32, `firstScaleByte` can be either 0 or 2,
     selecting halves of the scale vectors. Lanes 0-15 will read from
     `firstScaleByte` and lanes 16-31 will read from `firstScaleByte` + 1.
+    For example:
+    ```mlir
+    // Input: 8-element vector of F8E4M3FN, converting to F32
+    // Lanes 0-15 read from byte 0, lanes 16-31 read from byte 1
+    %result = amdgpu.scaled_ext_packed816 %source
+    scale(%scales)
+    blockSize(32)
+    firstScaleLane(0)
+    firstScaleByte(0)
+    : vector<8xf8E4M3FN>, vector<4xf8E8M0FNU> -> vector<8xf32>
----------------
amd-eochoalo wrote:

https://github.com/llvm/llvm-project/pull/159830/commits/7a5fea59e71551863c5ed4bc573f10a430fcf8bb

https://github.com/llvm/llvm-project/pull/159830
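
For readers following the quoted hunk, the block-size-32 rule it documents (lanes 0-15 read from `firstScaleByte`, lanes 16-31 read from `firstScaleByte` + 1) can be sketched in Python as below. This is only an illustrative model of the documented sentence, not code from the PR: the helper name `scale_byte_for_lane` is hypothetical, a 32-lane wave is assumed, and `firstScaleLane` is deliberately ignored.

```python
def scale_byte_for_lane(lane: int, first_scale_byte: int) -> int:
    """Return the scale byte index a lane reads when the block size is 32.

    Hypothetical helper modelling only the rule quoted in the op
    description; it assumes 32 lanes and does not model firstScaleLane.
    """
    # The description allows only 0 or 2 for block size 32.
    if first_scale_byte not in (0, 2):
        raise ValueError("firstScaleByte must be 0 or 2 when blockSize is 32")
    if not 0 <= lane < 32:
        raise ValueError("lane must be in the range 0-31")
    # Lanes 0-15 read firstScaleByte; lanes 16-31 read the next byte.
    return first_scale_byte if lane < 16 else first_scale_byte + 1


if __name__ == "__main__":
    for lane in (0, 15, 16, 31):
        print(lane, scale_byte_for_lane(lane, first_scale_byte=0))
```

With `firstScaleByte(0)` as in the quoted example, this model gives byte 0 for lanes 0-15 and byte 1 for lanes 16-31, which matches the comment in the diff.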