[llvm] [LLVM][NVPTX] Remove nonexistent ftz ops (PR #106100)
Billy Zhu via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 26 09:47:17 PDT 2024
https://github.com/zyx-billy created https://github.com/llvm/llvm-project/pull/106100
According to the PTX [spec](https://docs.nvidia.com/cuda/parallel-thread-execution/#half-precision-floating-point-instructions-max), max & min instructions do not support the `ftz` modifier for `bf16` & `bf16x2` types. This PR removes them from instr info.
>From 41e04375ef3ef8c878c123555cd60d3cff72ebe9 Mon Sep 17 00:00:00 2001
From: Billy Zhu <billyzhu at modular.com>
Date: Fri, 23 Aug 2024 17:32:50 -0700
Subject: [PATCH] remove nonexistent ops
---
llvm/lib/Target/NVPTX/NVPTXInstrInfo.td | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
index a6dfb704e38d2e..b7e210805db904 100644
--- a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+++ b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
@@ -334,25 +334,12 @@ multiclass FMINIMUMMAXIMUM<string OpcStr, bit NaN, SDNode OpNode> {
!strconcat(OpcStr, ".f16x2 \t$dst, $a, $b;"),
[(set Int32Regs:$dst, (OpNode (v2f16 Int32Regs:$a), (v2f16 Int32Regs:$b)))]>,
Requires<[useFP16Math, hasSM<80>, hasPTX<70>]>;
- def bf16rr_ftz :
- NVPTXInst<(outs Int16Regs:$dst),
- (ins Int16Regs:$a, Int16Regs:$b),
- !strconcat(OpcStr, ".ftz.bf16 \t$dst, $a, $b;"),
- [(set Int16Regs:$dst, (OpNode (bf16 Int16Regs:$a), (bf16 Int16Regs:$b)))]>,
- Requires<[hasBF16Math, doF32FTZ, hasSM<80>, hasPTX<70>]>;
def bf16rr :
NVPTXInst<(outs Int16Regs:$dst),
(ins Int16Regs:$a, Int16Regs:$b),
!strconcat(OpcStr, ".bf16 \t$dst, $a, $b;"),
[(set Int16Regs:$dst, (OpNode (bf16 Int16Regs:$a), (bf16 Int16Regs:$b)))]>,
Requires<[hasBF16Math, hasSM<80>, hasPTX<70>]>;
-
- def bf16x2rr_ftz :
- NVPTXInst<(outs Int32Regs:$dst),
- (ins Int32Regs:$a, Int32Regs:$b),
- !strconcat(OpcStr, ".ftz.bf16x2 \t$dst, $a, $b;"),
- [(set Int32Regs:$dst, (OpNode (v2bf16 Int32Regs:$a), (v2bf16 Int32Regs:$b)))]>,
- Requires<[hasBF16Math, hasSM<80>, hasPTX<70>, doF32FTZ]>;
def bf16x2rr :
NVPTXInst<(outs Int32Regs:$dst),
(ins Int32Regs:$a, Int32Regs:$b),
More information about the llvm-commits
mailing list