[Mlir-commits] [mlir] [AMD][ROCDL][AMDGPU] Support packed conversions fp8/bf8->bf16 and fp8/bf8->fp32 (PR #131850)

Tue Mar 18 10:03:27 PDT 2025

https://github.com/krzysz00 requested changes to this pull request.

First, to get the question on record: how portable are these packed conversion instructions?

Second, I object to using the packed instructions unconditionally: when only one element needs converting, it's a waste of a register.

This code should use the packed instructions where there is actually a vector, and then fall back to the scalar ones for an odd final element and for the scalar case.
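
To make the requested shape concrete, here is a rough sketch of the splitting logic only. It is not the PR's code and uses no MLIR or ROCDL APIs: `StepKind`, `ConversionStep`, and `planConversions` are hypothetical names invented for illustration, and the sketch assumes a packed instruction converts two elements at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one conversion step; not an MLIR or ROCDL type.
enum class StepKind { Packed, Scalar };

struct ConversionStep {
  StepKind kind;      // Packed covers two elements (assumed), Scalar covers one.
  std::size_t index;  // Index of the first source element the step covers.
};

// Plan the conversion of `numElements` fp8/bf8 values: pair up as many
// elements as possible, then handle an odd final element (or a lone scalar)
// with a single scalar step.
std::vector<ConversionStep> planConversions(std::size_t numElements) {
  std::vector<ConversionStep> steps;
  std::size_t i = 0;
  // Emit a packed step for every complete pair of elements.
  for (; i + 1 < numElements; i += 2)
    steps.push_back({StepKind::Packed, i});
  // At most one element remains; convert it with the scalar form.
  if (i < numElements) {
    steps.push_back({StepKind::Scalar, i});
    ++i;
  }
  assert(i == numElements && "every element must be covered exactly once");
  return steps;
}
```

Under these assumptions a 5-element vector yields two packed steps and one scalar step, and the scalar case (one element) never touches a packed instruction. In an actual lowering, each step would presumably become the corresponding packed or scalar conversion op.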

https://github.com/llvm/llvm-project/pull/131850