[llvm] [NVPTX] Add patterns for fma.relu.{f16|bf16} (PR #114977)

Hugh Delaney via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 7 02:43:50 PST 2024


hdelan wrote:

> We will eventually want to figure out a way to effectively use f16x2 and bf16x2 variants of `fma.rn.relu`.

These should be available via the LLVM intrinsics. I think pattern matching to generate the x2 variants would be messy and fragile, since it involves unpacking vector types. There also isn't a concise way to express this pattern in source code, so the use cases for this sort of substitution are likely to be rare. Users who need this instruction for x2 types can always emit it via the clang builtins.
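
To illustrate the point about source-level expressibility, here is a minimal host-side sketch. It assumes the scalar pattern is a fused multiply-add followed by a max with zero; `float` stands in for `f16`/`bf16`, and `Vec2`, `fma_relu`, and `fma_relu_x2` are hypothetical names used only for this example, not anything from the patch. NaN handling is ignored.

```cpp
#include <array>
#include <cmath>

// Scalar shape of the pattern: fused multiply-add, then clamp below at zero.
// float is a stand-in for half/bfloat so the sketch builds on any host.
inline float fma_relu(float a, float b, float c) {
    return std::fmax(std::fma(a, b, c), 0.0f);
}

// Hypothetical two-lane stand-in for an f16x2/bf16x2 value.
using Vec2 = std::array<float, 2>;

// The x2 form has to be spelled lane by lane: each operand is unpacked,
// the scalar pattern is applied per lane, and the result is repacked.
inline Vec2 fma_relu_x2(const Vec2& a, const Vec2& b, const Vec2& c) {
    return Vec2{fma_relu(a[0], b[0], c[0]),
                fma_relu(a[1], b[1], c[1])};
}
```

The scalar function is a single expression, whereas the packed version only exists as per-lane code, which is the unpack/repack shape a matcher would have to recognise.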

https://github.com/llvm/llvm-project/pull/114977

