[PATCH] D114765: [X86][FP16] Only generate approximate rsqrt when Reciprocal is true for half type
Phoebe Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 29 18:37:38 PST 2021
pengfei created this revision.
pengfei added reviewers: craig.topper, LuoYuanke, RKSimon, spatel, LiuChen3.
Herald added a subscriber: hiraditya.
pengfei requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
For the half type, sqrt is reasonably fast and rsqrt is already accurate,
because half has so few fraction bits. So we need neither multi-step
refinement for rsqrt nor the replacement of sqrt by rsqrt.
This fixes a correctness issue when `RefinementSteps` = 0.
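
A minimal sketch of the gating condition described above, written as a standalone model rather than the actual lowering code. The names `decideHalfEstimate`, `EstimateDecision`, and `kUnspecified` are hypothetical, and the boolean parameters stand in for the `VT`/`Subtarget` queries in the patch. The fall-through case is assumed to mean "no estimate, keep the plain sqrt".

```cpp
// Hypothetical stand-in for ReciprocalEstimate::Unspecified.
constexpr int kUnspecified = -1;

struct EstimateDecision {
  bool UseRsqrtEstimate; // true: emit the approximate rsqrt
  int RefinementSteps;   // refinement steps requested after the estimate
};

// Models only the f16 branch touched by this patch (assumption: every
// other path is treated as "no estimate").
constexpr EstimateDecision decideHalfEstimate(bool IsLegalF16, bool HasFP16,
                                              bool Reciprocal,
                                              int RefinementSteps) {
  if (IsLegalF16 && HasFP16 && Reciprocal) {
    // rsqrt is accurate enough for half, so default to no refinement.
    if (RefinementSteps == kUnspecified)
      RefinementSteps = 0;
    return {true, RefinementSteps};
  }
  // Plain sqrt (Reciprocal == false) no longer takes the estimate path.
  return {false, RefinementSteps};
}

// A non-reciprocal sqrt request is not turned into an rsqrt estimate.
static_assert(!decideHalfEstimate(true, true, false, kUnspecified)
                   .UseRsqrtEstimate,
              "sqrt must not be lowered through the rsqrt estimate");
```

In this model, the pre-patch behaviour corresponds to dropping the `Reciprocal` term from the condition, which would select the rsqrt estimate with zero refinement steps even for a plain sqrt.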
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D114765
Files:
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/avx512fp16vl-intrinsics.ll
Index: llvm/test/CodeGen/X86/avx512fp16vl-intrinsics.ll
===================================================================
--- llvm/test/CodeGen/X86/avx512fp16vl-intrinsics.ll
+++ llvm/test/CodeGen/X86/avx512fp16vl-intrinsics.ll
@@ -969,6 +969,15 @@
ret <8 x half> %2
}
+define <8 x half> @test_sqrt_ph_128_fast2(<8 x half> %a0, <8 x half> %a1) {
+; CHECK-LABEL: test_sqrt_ph_128_fast2:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsqrtph %xmm0, %xmm0
+; CHECK-NEXT: retq
+ %1 = call fast <8 x half> @llvm.sqrt.v8f16(<8 x half> %a0)
+ ret <8 x half> %1
+}
+
define <8 x half> @test_mask_sqrt_ph_128(<8 x half> %a0, <8 x half> %passthru, i8 %mask) {
; CHECK-LABEL: test_mask_sqrt_ph_128:
; CHECK: # %bb.0:
Index: llvm/lib/Target/X86/X86ISelLowering.cpp
===================================================================
--- llvm/lib/Target/X86/X86ISelLowering.cpp
+++ llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -23232,7 +23232,7 @@
}
if (VT.getScalarType() == MVT::f16 && isTypeLegal(VT) &&
- Subtarget.hasFP16()) {
+ Subtarget.hasFP16() && Reciprocal) {
if (RefinementSteps == ReciprocalEstimate::Unspecified)
RefinementSteps = 0;
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D114765.390543.patch
Type: text/x-patch
Size: 1184 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211130/cf295b88/attachment.bin>