[llvm] [NVPTX] Prefer prmt.b32 over bfi.b32 (PR #110766)

Justin Fargnoli via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 4 12:54:03 PDT 2024


justinfargnoli wrote:

> According to https://arxiv.org/pdf/2208.11174 BFI is much more expensive than PRMT which appears to take just 1 cycle on A100:

I think this refers to `bfi` in the general case. When the `c` (start position) and `d` (length) operands are both multiples of 8, the insert is byte-aligned, so the `bfi` can be expressed as a `prmt` instead.
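
As a rough illustration of that equivalence, here is a host-side C++ sketch that emulates both instructions and checks that they agree for one byte-aligned case. It assumes the PTX ISA semantics of `bfi.b32` and of `prmt.b32` in its default mode with the sign-replication bit of each selector nibble left clear. The function names (`bfi_b32`, `prmt_b32`, `bfi_to_prmt_selector`) are hypothetical and are not taken from the PR, whose actual lowering may differ.

```cpp
#include <cstdint>

// Emulates bfi.b32: insert the low `len` bits of `a` into `b` at bit `pos`.
// Assumption: mirrors the PTX ISA description (pos/len use only their low 8 bits).
uint32_t bfi_b32(uint32_t a, uint32_t b, uint32_t pos, uint32_t len) {
    pos &= 0xff;
    len &= 0xff;
    uint32_t f = b;
    for (uint32_t i = 0; i < len && pos + i < 32; ++i) {
        uint32_t bit = (a >> i) & 1u;
        f = (f & ~(1u << (pos + i))) | (bit << (pos + i));
    }
    return f;
}

// Emulates prmt.b32 in default mode, ignoring sign replication:
// bytes 0..3 come from `a`, bytes 4..7 from `b`; each selector nibble picks one.
uint32_t prmt_b32(uint32_t a, uint32_t b, uint32_t sel) {
    uint64_t src = (static_cast<uint64_t>(b) << 32) | a;
    uint32_t out = 0;
    for (uint32_t i = 0; i < 4; ++i) {
        uint32_t idx = (sel >> (4 * i)) & 0x7;
        uint32_t byte = static_cast<uint32_t>((src >> (8 * idx)) & 0xff);
        out |= byte << (8 * i);
    }
    return out;
}

// Hypothetical helper: builds a prmt selector equivalent to a byte-aligned bfi.
// Precondition (assumed, not checked): pos and len are multiples of 8.
uint32_t bfi_to_prmt_selector(uint32_t pos, uint32_t len) {
    uint32_t p = pos / 8;  // first result byte that receives inserted data
    uint32_t l = len / 8;  // number of bytes inserted
    uint32_t sel = 0;
    for (uint32_t i = 0; i < 4; ++i) {
        bool inserted = (i >= p) && (i < p + l);
        uint32_t idx = inserted ? (i - p) : (4 + i);  // byte of a, else byte of b
        sel |= idx << (4 * i);
    }
    return sel;
}
```

For example, with `a = 0xAABBCCDD`, `b = 0x11223344`, `c = 8`, and `d = 16`, the helper yields selector `0x7104`, and both emulations produce `0x11CCDD44`.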

https://github.com/llvm/llvm-project/pull/110766

