[Mlir-commits] [mlir] [AMD][ROCDL][AMDGPU] Support packed conversions fp8/bf8->bf16 and fp8/bf8->fp32 (PR #131850)
Yi Qian
llvmlistbot at llvm.org
Wed Mar 19 11:44:45 PDT 2025
yiqian1 wrote:
> @yiqian1 Please add guards to prevent using the bf16 instructions on gfx942
>
> That is,
>
> ```
> %y = arith.extf %x : f8E4M3FNUZ to bf16
> ```
>
> on gfx942 needs to go via `f32`
An `arith.extf` to any non-f32 type will go via f32, as shown in these test cases:
```
%w = arith.extf %v : f8E5M2FNUZ to f16
```
and
```
%w = arith.extf %v : vector<2xf8E5M2FNUZ> to vector<2xf64>
```
`->bf16` is similar.
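The via-f32 path amounts to a two-step sequence like the following (a conceptual sketch, not verbatim pass output; `%x` is a placeholder value):

```
// Extend the 8-bit float to f32 first, since the hardware
// conversion targets f32, then truncate to the narrower type.
%f = arith.extf %x : f8E4M3FNUZ to f32
%y = arith.truncf %f : f32 to bf16
```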
Note that fp8/bf8->bf16 conversions are not currently used when lowering `amdgpu.ext_packed_fp8`. We can add them for gfx950 in the future if we want. In this PR, I only updated `amdgpu.ext_packed_fp8` with packed fp8/bf8->fp32 conversions, which are available on both gfx942 and gfx950.
https://github.com/llvm/llvm-project/pull/131850