[all-commits] [llvm/llvm-project] b0e9b0: [NVPTX] Make nvptx mma instructions convergent. (#...

Mon Jun 24 19:16:19 PDT 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: b0e9b00ce7d623175c5e60e82afe24e7f8a200be
      https://github.com/llvm/llvm-project/commit/b0e9b00ce7d623175c5e60e82afe24e7f8a200be
  Author: weiwei chen <weiwei.chen at modular.com>
  Date:   2024-06-24 (Mon, 24 Jun 2024)

  Changed paths:
    M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
    A llvm/test/CodeGen/NVPTX/mma-no-sink-after-laneid-check.ll

  Log Message:
  -----------
  [NVPTX] Make nvptx mma instructions convergent. (#96521)

The NVPTX backend generates wrong code for input of the following form:
```
%0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
if laneid == 0:
  ret
else:
  store %0
```

The backend reorders the instructions (as an effect of the `MachineSink`
pass) to
```
if laneid == 0:
  ret
else:
  %0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
  store %0
```

This is incorrect because `mma` is a warp-level instruction that requires
all threads in the warp to sync before performing the operation, so it
must not be guarded by a check on a specific thread id. In terms of
warp-level sync it is similar to the shuffle instruction `shfl`, and
`shfl` is already marked as `isConvergent = true`.
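
As a rough illustration of why the sunk form is wrong, the following toy
Python model treats a warp-wide operation as a function whose result
depends on the inputs of every participating lane. It is only a sketch:
`warp_op`, `original_order` and `sunk_order` are hypothetical names, the
sum is an assumed stand-in for `mma`, and on real hardware executing `mma`
with a partial warp is undefined behaviour rather than a smaller sum.

```python
WARP_SIZE = 32


def warp_op(values, active):
    """Toy stand-in for a warp-wide instruction such as mma: every active
    lane receives a result that depends on the inputs of all active lanes."""
    total = sum(values[lane] for lane in active)
    return {lane: total for lane in active}


def original_order(values):
    # The warp-wide op runs first, while all lanes are still active.
    all_lanes = range(WARP_SIZE)
    result = warp_op(values, all_lanes)
    # Lane 0 returns; every other lane stores its result.
    return {lane: result[lane] for lane in all_lanes if lane != 0}


def sunk_order(values):
    # Lane 0 returns before the op, so only 31 lanes take part in it.
    active = [lane for lane in range(WARP_SIZE) if lane != 0]
    result = warp_op(values, active)
    return {lane: result[lane] for lane in active}


values = [lane + 1 for lane in range(WARP_SIZE)]  # lane i contributes i + 1
stored_original = original_order(values)
stored_sunk = sunk_order(values)
print(stored_original[1], stored_sunk[1])  # prints "528 527"
```

In this model the value stored by lane 1 changes once the operation is
moved below the `laneid` check, which is the kind of transformation that
marking an instruction convergent is meant to rule out.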

Apply `isConvergent = true` to `mma` instructions so that they are not
sunk into divergent control flow.
