[Mlir-commits] [mlir] [mlir][amdgpu] Add scaled_ext_packed{8, 16} operations (PR #159830)
Erick Ochoa Lopez
llvmlistbot at llvm.org
Fri Oct 17 08:57:22 PDT 2025
================
@@ -150,10 +150,50 @@ def AMDGPU_ScaledExtPacked816Op
When the block size is 32, `firstScaleByte` can be either 0 or 2,
selecting halves of the scale vectors. Lanes 0-15 will read from
`firstScaleByte` and lanes 16-31 will read from `firstScaleByte` + 1.
+ For example:
+ ```mlir
+ // Input: 8-element vector of F8E4M3FN, converting to F32
+ // Lanes 0-15 read from byte 0, lanes 16-31 read from byte 1
+ %result = amdgpu.scaled_ext_packed816 %source
+ scale(%scales)
+ blockSize(32)
+ firstScaleLane(0)
+ firstScaleByte(0)
+ : vector<8xf8E4M3FN>, vector<4xf8E8M0FNU> -> vector<8xf32>
----------------
amd-eochoalo wrote:
https://github.com/llvm/llvm-project/pull/159830/commits/7a5fea59e71551863c5ed4bc573f10a430fcf8bb
https://github.com/llvm/llvm-project/pull/159830
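
For readers of the quoted hunk, the lane-to-byte rule for block size 32 can be sketched in a few lines of Python. This is only an illustration of the documentation text above, not code from the PR: the function name `scale_byte_for_lane` and its arguments are hypothetical, and it covers only the `blockSize(32)` case described in the hunk.

```python
def scale_byte_for_lane(lane: int, first_scale_byte: int) -> int:
    """Return the scale byte index read by `lane` when the block size is 32.

    Hypothetical helper mirroring the quoted documentation:
    lanes 0-15 read `firstScaleByte`, lanes 16-31 read `firstScaleByte` + 1.
    """
    # The documentation allows only 0 or 2 for block size 32.
    if first_scale_byte not in (0, 2):
        raise ValueError("with blockSize(32), firstScaleByte must be 0 or 2")
    # The documentation describes lanes 0 through 31.
    if not 0 <= lane < 32:
        raise ValueError("lane must be in the range 0-31")
    return first_scale_byte + (0 if lane < 16 else 1)
```

With `firstScaleByte(0)`, as in the example in the hunk, lanes 0-15 map to byte 0 and lanes 16-31 map to byte 1; with `firstScaleByte(2)` the same lanes map to bytes 2 and 3.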