[PATCH] NVPTX: support f64 <-> f16 intrinsics
Tim Northover
t.p.northover at gmail.com
Thu Jul 17 05:41:13 PDT 2014
Hi,
I'm in the process of reworking how we handle the __fp16 type slightly. I have larger goals, but the most important immediate one is to perform extensions and truncations in one step so that this C code has IEEE-sensible semantics:
void my_round(double in, __fp16 *out) { *out = in; }
Now, I *think* this is fairly academic as far as OpenCL is concerned (you have to use the vload_half/vstore_half functions to access __fp16 at all times), but I'd like to minimise breakage as far as possible anyway.
As part of this I've made the @llvm.convert.from.fp16 and @llvm.convert.to.fp16 intrinsics polymorphic, and would like to add support for f64 variants in as many places as possible.
NVPTX already seemed to have the instructions there, waiting to be used so I added a couple of patterns and a test.
Are you happy for me to commit the change?
Cheers.
Tim.
http://reviews.llvm.org/D4558
Files:
lib/Target/NVPTX/NVPTXIntrinsics.td
test/CodeGen/NVPTX/fp16.ll
Index: lib/Target/NVPTX/NVPTXIntrinsics.td
===================================================================
--- lib/Target/NVPTX/NVPTXIntrinsics.td
+++ lib/Target/NVPTX/NVPTXIntrinsics.td
@@ -799,6 +799,11 @@
def : Pat<(i16 (fp_to_f16 Float32Regs:$a)),
(CVT_f16_f32 Float32Regs:$a, CvtRN)>;
+def : Pat<(f64 (f16_to_fp Int16Regs:$a)),
+ (CVT_f64_f16 Int16Regs:$a, CvtNONE)>;
+def : Pat<(i16 (fp_to_f16 Float64Regs:$a)),
+ (CVT_f16_f64 Float64Regs:$a, CvtRN)>;
+
//
// Bitcast
//
Index: test/CodeGen/NVPTX/fp16.ll
===================================================================
--- /dev/null
+++ test/CodeGen/NVPTX/fp16.ll
@@ -0,0 +1,45 @@
+; RUN: llc -march=nvptx -verify-machineinstrs < %s | FileCheck %s
+
+declare float @llvm.convert.from.fp16.f32(i16) nounwind readnone
+declare double @llvm.convert.from.fp16.f64(i16) nounwind readnone
+declare i16 @llvm.convert.to.fp16.f32(float) nounwind readnone
+declare i16 @llvm.convert.to.fp16.f64(double) nounwind readnone
+
+; CHECK-LABEL: @test_convert_fp16_to_fp32
+; CHECK: cvt.f32.f16
+define void @test_convert_fp16_to_fp32(float addrspace(1)* noalias %out, i16 addrspace(1)* noalias %in) nounwind {
+ %val = load i16 addrspace(1)* %in, align 2
+ %cvt = call float @llvm.convert.from.fp16.f32(i16 %val) nounwind readnone
+ store float %cvt, float addrspace(1)* %out, align 4
+ ret void
+}
+
+
+; CHECK-LABEL: @test_convert_fp16_to_fp64
+; CHECK: cvt.f64.f16
+define void @test_convert_fp16_to_fp64(double addrspace(1)* noalias %out, i16 addrspace(1)* noalias %in) nounwind {
+ %val = load i16 addrspace(1)* %in, align 2
+ %cvt = call double @llvm.convert.from.fp16.f64(i16 %val) nounwind readnone
+ store double %cvt, double addrspace(1)* %out, align 4
+ ret void
+}
+
+
+; CHECK-LABEL: @test_convert_fp32_to_fp16
+; CHECK: cvt.rn.f16.f32
+define void @test_convert_fp32_to_fp16(i16 addrspace(1)* noalias %out, float addrspace(1)* noalias %in) nounwind {
+ %val = load float addrspace(1)* %in, align 2
+ %cvt = call i16 @llvm.convert.to.fp16.f32(float %val) nounwind readnone
+ store i16 %cvt, i16 addrspace(1)* %out, align 4
+ ret void
+}
+
+
+; CHECK-LABEL: @test_convert_fp64_to_fp16
+; CHECK: cvt.rn.f16.f64
+define void @test_convert_fp64_to_fp16(i16 addrspace(1)* noalias %out, double addrspace(1)* noalias %in) nounwind {
+ %val = load double addrspace(1)* %in, align 2
+ %cvt = call i16 @llvm.convert.to.fp16.f64(double %val) nounwind readnone
+ store i16 %cvt, i16 addrspace(1)* %out, align 4
+ ret void
+}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D4558.11573.patch
Type: text/x-patch
Size: 2526 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140717/9ddd6359/attachment.bin>
More information about the llvm-commits
mailing list