[llvm] [LLVM][NVPTX] Remove nonexistent ftz ops (PR #106100)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 26 13:09:30 PDT 2024
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-backend-nvptx
Author: Billy Zhu (zyx-billy)
<details>
<summary>Changes</summary>
According to the PTX [spec](https://docs.nvidia.com/cuda/parallel-thread-execution/#half-precision-floating-point-instructions-max), the `max` and `min` instructions do not support the `ftz` modifier for the `bf16` and `bf16x2` types. This PR removes the corresponding `bf16rr_ftz` and `bf16x2rr_ftz` definitions from the NVPTX instruction info, so the legal non-ftz versions are emitted instead.
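The effect of the removal can be illustrated with a toy model of predicate-gated patterns. This is only a sketch and not how LLVM's instruction selector is implemented: the `Pattern` class, the `select` helper, the "first matching pattern wins" rule, and the flattened predicate names (`hasSM80`, `hasPTX70`) are all assumptions made for illustration. Only the record names and assembly strings are taken from the diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """Hypothetical stand-in for one NVPTXInst record in the multiclass."""
    name: str             # record name from the diff, e.g. "bf16rr"
    asm: str              # suffix appended to the opcode string
    value_type: str       # "bf16" or "v2bf16"
    requires: frozenset   # predicates that must all be enabled


# Simplified spellings of hasBF16Math, hasSM<80>, hasPTX<70>.
BASE = frozenset({"hasBF16Math", "hasSM80", "hasPTX70"})

# The bf16 records as they stood before the PR.
BEFORE = [
    Pattern("bf16rr_ftz", ".ftz.bf16", "bf16", BASE | {"doF32FTZ"}),
    Pattern("bf16rr", ".bf16", "bf16", BASE),
    Pattern("bf16x2rr_ftz", ".ftz.bf16x2", "v2bf16", BASE | {"doF32FTZ"}),
    Pattern("bf16x2rr", ".bf16x2", "v2bf16", BASE),
]

# After the PR the two _ftz records no longer exist.
AFTER = [p for p in BEFORE if not p.name.endswith("_ftz")]


def select(patterns, opcode, value_type, enabled):
    """Return the mnemonic of the first pattern whose type and predicates match.

    Assumption: first match wins. The real selector orders patterns differently.
    """
    for p in patterns:
        if p.value_type == value_type and p.requires <= enabled:
            return opcode + p.asm
    raise LookupError(f"no pattern for {opcode} on {value_type}")


if __name__ == "__main__":
    ftz_on = BASE | {"doF32FTZ"}
    print(select(BEFORE, "max", "bf16", ftz_on))  # unsupported form per the PTX spec
    print(select(AFTER, "max", "bf16", ftz_on))   # plain bf16 form
```

In this model, enabling `doF32FTZ` used to pick the `.ftz.bf16` form, which the PTX spec does not list as valid. With the `_ftz` records gone, the same request falls through to the plain `bf16rr` or `bf16x2rr` record.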
---
Full diff: https://github.com/llvm/llvm-project/pull/106100.diff
1 File Affected:
- (modified) llvm/lib/Target/NVPTX/NVPTXInstrInfo.td (-13)
``````````diff
diff --git a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
index a6dfb704e38d2e..b7e210805db904 100644
--- a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+++ b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
@@ -334,25 +334,12 @@ multiclass FMINIMUMMAXIMUM<string OpcStr, bit NaN, SDNode OpNode> {
!strconcat(OpcStr, ".f16x2 \t$dst, $a, $b;"),
[(set Int32Regs:$dst, (OpNode (v2f16 Int32Regs:$a), (v2f16 Int32Regs:$b)))]>,
Requires<[useFP16Math, hasSM<80>, hasPTX<70>]>;
- def bf16rr_ftz :
- NVPTXInst<(outs Int16Regs:$dst),
- (ins Int16Regs:$a, Int16Regs:$b),
- !strconcat(OpcStr, ".ftz.bf16 \t$dst, $a, $b;"),
- [(set Int16Regs:$dst, (OpNode (bf16 Int16Regs:$a), (bf16 Int16Regs:$b)))]>,
- Requires<[hasBF16Math, doF32FTZ, hasSM<80>, hasPTX<70>]>;
def bf16rr :
NVPTXInst<(outs Int16Regs:$dst),
(ins Int16Regs:$a, Int16Regs:$b),
!strconcat(OpcStr, ".bf16 \t$dst, $a, $b;"),
[(set Int16Regs:$dst, (OpNode (bf16 Int16Regs:$a), (bf16 Int16Regs:$b)))]>,
Requires<[hasBF16Math, hasSM<80>, hasPTX<70>]>;
-
- def bf16x2rr_ftz :
- NVPTXInst<(outs Int32Regs:$dst),
- (ins Int32Regs:$a, Int32Regs:$b),
- !strconcat(OpcStr, ".ftz.bf16x2 \t$dst, $a, $b;"),
- [(set Int32Regs:$dst, (OpNode (v2bf16 Int32Regs:$a), (v2bf16 Int32Regs:$b)))]>,
- Requires<[hasBF16Math, hasSM<80>, hasPTX<70>, doF32FTZ]>;
def bf16x2rr :
NVPTXInst<(outs Int32Regs:$dst),
(ins Int32Regs:$a, Int32Regs:$b),
``````````
</details>
https://github.com/llvm/llvm-project/pull/106100