[all-commits] [llvm/llvm-project] b0e9b0: [NVPTX] Make nvptx mma instructions convergent. (#...
weiwei chen via All-commits
all-commits at lists.llvm.org
Mon Jun 24 19:16:19 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: b0e9b00ce7d623175c5e60e82afe24e7f8a200be
https://github.com/llvm/llvm-project/commit/b0e9b00ce7d623175c5e60e82afe24e7f8a200be
Author: weiwei chen <weiwei.chen at modular.com>
Date: 2024-06-24 (Mon, 24 Jun 2024)
Changed paths:
M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
A llvm/test/CodeGen/NVPTX/mma-no-sink-after-laneid-check.ll
Log Message:
-----------
[NVPTX] Make nvptx mma instructions convergent. (#96521)
We are running into the NVPTX backend generating wrong code for the following input:
```
%0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
if laneid == 0:
  ret
else:
  store %0
```
The backend reorders the instruction (as an effect of the `MachineSink` pass)
to
```
if laneid == 0:
  ret
else:
  %0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
  store %0
```
This is incorrect because `mma` is a warp-level instruction that needs all
threads in the warp to sync before performing the operation, so it must not
end up guarded by a check on a specific thread id. In terms of warp-level
sync it should be treated like the shuffle instruction `shfl`, which is
already marked as `isConvergent = true`.
Apply `isConvergent = true` to `mma` instructions.
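
The reasoning above can be illustrated with a small host-side toy model. This is only a sketch and not part of the commit: `warp_collective_sum`, `op_before_lane_check`, and `op_after_lane_check` are hypothetical names, and a plain sum over participating lanes stands in for the real `mma` computation. On actual hardware, running a warp-level instruction with only part of the warp is not expected to produce a clean partial result; the model only shows that the set of participating lanes changes once the operation is moved below the lane check.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kWarpSize = 32;
using Lanes = std::array<int, kWarpSize>;
using Mask = std::array<bool, kWarpSize>;

// Hypothetical stand-in for a warp-level operation: every participating
// lane receives the sum of the inputs of all participating lanes.
// Non-participating lanes keep a zero result.
Lanes warp_collective_sum(const Lanes& in, const Mask& active) {
  int total = 0;
  for (std::size_t i = 0; i < kWarpSize; ++i)
    if (active[i]) total += in[i];

  Lanes out{};  // zero-initialized
  for (std::size_t i = 0; i < kWarpSize; ++i)
    if (active[i]) out[i] = total;
  return out;
}

// Original order: the collective operation runs while the whole warp is
// still active; lane 0 returns only afterwards.
Lanes op_before_lane_check(const Lanes& in) {
  Mask all;
  all.fill(true);
  return warp_collective_sum(in, all);
}

// Sunk order: lane 0 has already returned, so only lanes 1..31 take part
// in the collective operation.
Lanes op_after_lane_check(const Lanes& in) {
  Mask active;
  active.fill(true);
  active[0] = false;
  return warp_collective_sum(in, active);
}
```

With every lane contributing `1`, lane 1 observes a different value in the two orderings, which is the kind of change a convergent marking is meant to rule out by keeping the instruction above the branch.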