[all-commits] [llvm/llvm-project] 2f627c: [NVPTX] Support for dense and sparse MMA intrinsic...
Kirill Vedernikov via All-commits
all-commits at lists.llvm.org
Fri Nov 21 04:14:14 PST 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 2f627c1878a3dba594c872773107c556992af3a1
https://github.com/llvm/llvm-project/commit/2f627c1878a3dba594c872773107c556992af3a1
Author: Kirill Vedernikov <kvedernikov at nvidia.com>
Date: 2025-11-21 (Fri, 21 Nov 2025)
Changed paths:
M llvm/include/llvm/IR/IntrinsicsNVVM.td
M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
A llvm/test/CodeGen/NVPTX/wmma-ptx88-sm120a.py
M llvm/test/CodeGen/NVPTX/wmma.py
Log Message:
-----------
[NVPTX] Support for dense and sparse MMA intrinsics with block scaling. (#163561)
This change adds dense and sparse MMA intrinsics with block scaling. The
implementation is based on [PTX ISA version
9.0](https://docs.nvidia.com/cuda/parallel-thread-execution/). Tests for
the new intrinsics are added for PTX 8.7 and SM 120a and are generated by
`llvm/test/CodeGen/NVPTX/wmma-ptx88-sm120a.py`. The tests have been
verified with ptxas from the CUDA 13.0 release.
Support for dense MMA intrinsics with block scaling was contributed by
@schwarzschild-radius.