[llvm] [NVPTX] Prefer prmt.b32 over bfi.b32 (PR #110766)
Justin Fargnoli via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 4 12:54:03 PDT 2024
justinfargnoli wrote:
> According to https://arxiv.org/pdf/2208.11174 BFI is much more expensive than PRMT which appears to take just 1 cycle on A100:
I think this refers to `bfi` in the general case. When the `c` (start bit position) and `d` (bit length) operands are both multiples of 8, the insertion is byte-aligned, so the `bfi` can be expressed as a `prmt`.
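To illustrate why the byte-aligned case maps onto `prmt`, here is a minimal software model of both instructions, written from the PTX ISA descriptions of `bfi.b32` and `prmt.b32` (default mode). The helper names `bfi_b32`, `prmt_b32`, and `bfi_selector` are hypothetical and chosen for this sketch; this is not the lowering code from the PR, only a check that a suitable selector exists.

```python
MASK32 = 0xFFFFFFFF


def bfi_b32(a, b, pos, length):
    """Model of bfi.b32: copy the low `length` bits of `a` into `b` at bit `pos`."""
    pos &= 0xFF      # PTX only looks at the low 8 bits of c and d
    length &= 0xFF
    result = b & MASK32
    for i in range(length):
        if pos + i > 31:          # bits past the msb are dropped
            break
        bit = (a >> i) & 1
        result = (result & ~(1 << (pos + i))) | (bit << (pos + i))
    return result & MASK32


def prmt_b32(a, b, selector):
    """Model of prmt.b32 (default mode): pick 4 bytes out of the 8 bytes {b, a}."""
    src = ((b & MASK32) << 32) | (a & MASK32)   # bytes 0..3 from a, 4..7 from b
    result = 0
    for i in range(4):
        nibble = (selector >> (4 * i)) & 0xF
        byte = (src >> (8 * (nibble & 0x7))) & 0xFF
        if nibble & 0x8:                        # msb of nibble: replicate sign bit
            byte = 0xFF if byte & 0x80 else 0x00
        result |= byte << (8 * i)
    return result


def bfi_selector(pos, length):
    """Hypothetical helper: prmt selector equivalent to a byte-aligned bfi."""
    if pos % 8 or length % 8:
        raise ValueError("pos and length must be multiples of 8")
    first, count = pos // 8, length // 8
    selector = 0
    for i in range(4):
        if first <= i < first + count:
            nibble = i - first    # byte taken from the inserted value `a`
        else:
            nibble = 4 + i        # byte kept from the base value `b`
        selector |= nibble << (4 * i)
    return selector
```

For example, inserting 16 bits at bit 8 gives the selector `0x7104`, and `prmt_b32(a, b, 0x7104)` returns the same value as `bfi_b32(a, b, 8, 16)`. When `c` or `d` is not a multiple of 8, no byte-granular selector exists, which is the general case the quoted cost figure would apply to.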
https://github.com/llvm/llvm-project/pull/110766