[llvm-branch-commits] [llvm] AMDGPU: Define v_mfma_f32_{16x16x128|32x32x64}_f8f6f4 instructions (PR #116723)
Shilei Tian via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Mon Nov 18 20:12:47 PST 2024
================
@@ -1397,6 +1397,19 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
used by hardware to control active lanes when used in EXEC register.
For example, ballot(i1 true) return EXEC mask.
+ llvm.amdgcn.mfma.f32.16x16x128.f8f6f4.scaled Emit `v_mfma_f32_16x16x128_f8f6f4`, bundled with a `v_mfma_ld_scale_b32`
----------------
shiltian wrote:
This reminds me that we probably haven't added the other gfx950 intrinsics to the document.
https://github.com/llvm/llvm-project/pull/116723