[clang] 0ad19a8 - [CUDA, NVPTX] Corrected fragment size for tf32 LD B matrix.
Artem Belevich via cfe-commits
cfe-commits at lists.llvm.org
Tue Jan 25 11:31:42 PST 2022
Author: JackAKirk
Date: 2022-01-25T11:29:19-08:00
New Revision: 0ad19a833177861be55fefaff725ab89c8695d01
URL: https://github.com/llvm/llvm-project/commit/0ad19a833177861be55fefaff725ab89c8695d01
DIFF: https://github.com/llvm/llvm-project/commit/0ad19a833177861be55fefaff725ab89c8695d01.diff
LOG: [CUDA,NVPTX] Corrected fragment size for tf32 LD B matrix.
Signed-off-by: JackAKirk <jack.kirk at codeplay.com>
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D118023
Added:
Modified:
clang/lib/CodeGen/CGBuiltin.cpp
Removed:
################################################################################
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index cd35e7cbe76f7..a80a55e054a3b 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -17190,7 +17190,7 @@ static NVPTXMmaLdstInfo getNVPTXMmaLdstInfo(unsigned BuiltinID) {
case NVPTX::BI__mma_tf32_m16n16k8_ld_a:
return MMA_LDST(4, m16n16k8_load_a_tf32);
case NVPTX::BI__mma_tf32_m16n16k8_ld_b:
- return MMA_LDST(2, m16n16k8_load_b_tf32);
+ return MMA_LDST(4, m16n16k8_load_b_tf32);
case NVPTX::BI__mma_tf32_m16n16k8_ld_c:
return MMA_LDST(8, m16n16k8_load_c_f32);
More information about the cfe-commits
mailing list