[all-commits] [llvm/llvm-project] e0dc4a: [NVPTX] Expose float tys min, max, abs, neg as bui...
Nicolas Miller via All-commits
all-commits at lists.llvm.org
Wed Feb 23 13:57:35 PST 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: e0dc4ac28f0080a10a51a4627c880ca795f07ba0
https://github.com/llvm/llvm-project/commit/e0dc4ac28f0080a10a51a4627c880ca795f07ba0
Author: Jakub Chlanda <j.chlanda at gmail.com>
Date: 2022-02-23 (Wed, 23 Feb 2022)
Changed paths:
M llvm/include/llvm/IR/IntrinsicsNVVM.td
M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
M llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp
A llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70-instcombine.ll
A llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70.ll
A llvm/test/CodeGen/NVPTX/math-intrins-sm86-ptx72.ll
Log Message:
-----------
[NVPTX] Expose float tys min, max, abs, neg as builtins
Adds support for the following builtins:
- abs, neg:
  - .bf16
  - .bf16x2
- min, max:
  - {.ftz}{.NaN}{.xorsign.abs}.f16
  - {.ftz}{.NaN}{.xorsign.abs}.f16x2
  - {.NaN}{.xorsign.abs}.bf16
  - {.NaN}{.xorsign.abs}.bf16x2
  - {.ftz}{.NaN}{.xorsign.abs}.f32
Differential Revision: https://reviews.llvm.org/D117887
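For readers unfamiliar with the min/max modifiers listed in the commit above, the following is a rough host-side sketch of how they are generally understood to combine for the scalar f32 `min` case. It is not taken from the patch: `MinMaxMods`, `flush_subnormal`, and `ref_min` are hypothetical names, and the behaviour of a NaN input together with `.xorsign.abs` is deliberately left unmodelled.

```cpp
#include <cmath>
#include <limits>

// Hypothetical flags mirroring the PTX suffixes; not an LLVM or CUDA API.
struct MinMaxMods {
  bool ftz;          // .ftz: flush subnormal inputs to sign-preserving zero
  bool nan;          // .NaN: any NaN input produces a NaN result
  bool xorsign_abs;  // .xorsign.abs: min of magnitudes, sign = XOR of input signs
};

// Assumed flush-to-zero behaviour: subnormals become +0 or -0.
inline float flush_subnormal(float x) {
  return std::fpclassify(x) == FP_SUBNORMAL ? std::copysign(0.0f, x) : x;
}

// Reference model of min{.ftz}{.NaN}{.xorsign.abs}.f32 under the assumptions above.
inline float ref_min(float a, float b, MinMaxMods m = MinMaxMods{}) {
  if (m.ftz) {
    a = flush_subnormal(a);
    b = flush_subnormal(b);
  }
  if (std::isnan(a) || std::isnan(b)) {
    if (m.nan || (std::isnan(a) && std::isnan(b)))
      return std::numeric_limits<float>::quiet_NaN();
    // Without .NaN the non-NaN operand is returned.
    // Simplification: sign handling under .xorsign.abs is not modelled here.
    return std::isnan(a) ? b : a;
  }
  if (m.xorsign_abs) {
    float mag = std::fmin(std::fabs(a), std::fabs(b));
    bool negative = std::signbit(a) != std::signbit(b);
    return negative ? -mag : mag;
  }
  return std::fmin(a, b);
}
```

Under these assumptions, `ref_min(1.0f, -2.0f, MinMaxMods{false, false, true})` picks the smaller magnitude (1.0) and gives it a negative sign because exactly one input is negative. The f16, f16x2, bf16, and bf16x2 variants would apply the same rules per element.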
Commit: be672934ff885255b7e5e393bf4606e9fb8894a0
https://github.com/llvm/llvm-project/commit/be672934ff885255b7e5e393bf4606e9fb8894a0
Author: Jakub Chlanda <j.chlanda at gmail.com>
Date: 2022-02-23 (Wed, 23 Feb 2022)
Changed paths:
M llvm/include/llvm/IR/IntrinsicsNVVM.td
M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
M llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp
A llvm/test/CodeGen/NVPTX/math-intrins-sm53-ptx42.ll
M llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70-instcombine.ll
M llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70.ll
M llvm/test/CodeGen/NVPTX/math-intrins-sm86-ptx72.ll
Log Message:
-----------
[NVPTX] Add more FMA intrinsics/builtins
This patch adds builtins/intrinsics for the following variants of FMA:
- f16, f16x2:
  - rn
  - rn_ftz
  - rn_sat
  - rn_ftz_sat
  - rn_relu
  - rn_ftz_relu
- bf16, bf16x2:
  - rn
  - rn_relu
ptxas (Cuda compilation tools, release 11.0, V11.0.194) is happy with the generated assembly.
Differential Revision: https://reviews.llvm.org/D118977
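As a rough illustration of what the `sat` and `relu` suffixes in the FMA commit above mean, here is a host-side sketch that uses `float` as a stand-in for the half-precision types. It is an assumption-laden model, not code from the patch: `FmaClamp` and `ref_fma` are hypothetical names, the NaN handling follows my reading of the PTX ISA description, and `ftz` is omitted.

```cpp
#include <cmath>
#include <limits>

// Hypothetical selector for the post-processing step; not an LLVM or CUDA API.
enum class FmaClamp { None, Sat, Relu };

// Reference model of fma.rn{.sat|.relu}: a fused multiply-add with a single
// rounding, followed by an optional clamp. float stands in for f16/bf16.
inline float ref_fma(float a, float b, float c, FmaClamp clamp) {
  float r = std::fma(a, b, c);
  switch (clamp) {
    case FmaClamp::Sat:
      // Assumed: clamp to [0.0, 1.0]; a NaN result becomes +0.0.
      if (std::isnan(r)) return 0.0f;
      return std::fmin(std::fmax(r, 0.0f), 1.0f);
    case FmaClamp::Relu:
      // Assumed: negative results become 0; a NaN result stays NaN.
      if (std::isnan(r)) return std::numeric_limits<float>::quiet_NaN();
      return r < 0.0f ? 0.0f : r;
    case FmaClamp::None:
      break;
  }
  return r;
}
```

For example, `ref_fma(-2.0f, 3.0f, 1.0f, FmaClamp::Relu)` computes -5 and clamps it to 0, while the same inputs with `FmaClamp::None` return -5 unchanged.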
Commit: 69a8350c232af17e5c006a0be8fcf7d749a9728e
https://github.com/llvm/llvm-project/commit/69a8350c232af17e5c006a0be8fcf7d749a9728e
Author: Nicolas Miller <nicolas.miller at codeplay.com>
Date: 2022-02-23 (Wed, 23 Feb 2022)
Changed paths:
M llvm/include/llvm/IR/IntrinsicsNVVM.td
M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
A llvm/test/CodeGen/NVPTX/f16-ex2.ll
Log Message:
-----------
[NVPTX] Add ex2.approx.f16/f16x2 support
This patch adds builtins and intrinsics for the f16 and f16x2 variants of the ex2
instruction.
These two variants were added in PTX 7.0 and are supported by sm_75 and above.
Note that this is not wired to the LLVM exp2 intrinsic because the ex2
instruction is only available in its approx variant.
Running ptxas on the assembly generated by the test f16-ex2.ll works as
expected.
Differential Revision: https://reviews.llvm.org/D119157
Compare: https://github.com/llvm/llvm-project/compare/0c1fd90fe082...69a8350c232a