[PATCH] D110076: [AMDGPU][GlobalISel] Code quality: Combine V_RSQ
Mateja Marjanovic via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 20 08:16:41 PDT 2021
matejam created this revision.
matejam added reviewers: foad, arsenm, Petar.Avramovic, mbrkusanin.
matejam added a project: LLVM.
Herald added subscribers: kerbowa, hiraditya, t-tye, tpr, dstuttard, rovka, yaxunl, nhaehnle, jvesely, kzhuravl.
matejam requested review of this revision.
Herald added a subscriber: wdng.
Combine V_RCP and V_SQRT into V_RSQ in GlobalISel.
The pattern already existed, but it was applied only when unsafe-fp-math is enabled, unlike in SelectionDAG.
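As an illustration only (not part of the patch), the rewrite that RsqPat expresses, rcp(sqrt(x)) -> rsq(x), can be sketched on a toy expression tree. The names Op, Node, leaf, unary, and combineRsq are hypothetical and are not LLVM APIs; the actual selection is done by the TableGen pattern shown in the diff.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical toy IR (not LLVM's): just enough to show the rewrite.
enum class Op { Leaf, Sqrt, Rcp, Rsq };

struct Node {
  Op op;
  std::string name;           // only meaningful for leaves
  std::shared_ptr<Node> src;  // single operand for unary ops
};

using NodePtr = std::shared_ptr<Node>;

// Build a named leaf value.
NodePtr leaf(std::string name) {
  return std::make_shared<Node>(Node{Op::Leaf, std::move(name), nullptr});
}

// Build a unary operation over an existing node.
NodePtr unary(Op op, NodePtr src) {
  assert(src && "unary op needs an operand");
  return std::make_shared<Node>(Node{op, "", std::move(src)});
}

// Fold rcp(sqrt(x)) into rsq(x); return any other node unchanged.
NodePtr combineRsq(const NodePtr &n) {
  if (n->op == Op::Rcp && n->src && n->src->op == Op::Sqrt)
    return unary(Op::Rsq, n->src->src);
  return n;
}
```

The sketch matches only the exact rcp-of-sqrt shape and does not model fast-math flags such as afn, which the MIR test in this patch carries on both instructions.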
https://reviews.llvm.org/D110076
Files:
llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
llvm/lib/Target/AMDGPU/SIInstructions.td
llvm/test/CodeGen/AMDGPU/GlobalISel/combine-rsq.mir
Index: llvm/test/CodeGen/AMDGPU/GlobalISel/combine-rsq.mir
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/GlobalISel/combine-rsq.mir
@@ -0,0 +1,29 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -global-isel -march=amdgcn -mcpu=gfx1010 -run-pass=instruction-select -verify-machineinstrs %s -o - | FileCheck %s
+
+---
+name: test
+alignment: 1
+legalized: true
+regBankSelected: true
+selected: false
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $sgpr0
+
+ ; CHECK-LABEL: name: test
+ ; CHECK: liveins: $sgpr0
+ ; CHECK: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
+ ; CHECK: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]]
+ ; CHECK: %3:vgpr_32 = afn nofpexcept V_RSQ_F32_e32 [[COPY1]], implicit $mode, implicit $exec
+ ; CHECK: $vgpr0 = COPY %3
+ ; CHECK: SI_RETURN_TO_EPILOG implicit $vgpr0
+ %0:sgpr(s32) = COPY $sgpr0
+ %4:vgpr(s32) = COPY %0(s32)
+ %2:vgpr(s32) = afn G_FSQRT %4
+ %3:vgpr(s32) = afn G_INTRINSIC intrinsic(@llvm.amdgcn.rcp), %2(s32)
+ $vgpr0 = COPY %3(s32)
+ SI_RETURN_TO_EPILOG implicit $vgpr0
+
+...
Index: llvm/lib/Target/AMDGPU/SIInstructions.td
===================================================================
--- llvm/lib/Target/AMDGPU/SIInstructions.td
+++ llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -825,12 +825,10 @@
// VOP1 Patterns
//===----------------------------------------------------------------------===//
-let OtherPredicates = [UnsafeFPMath] in {
-
-//defm : RsqPat<V_RSQ_F32_e32, f32>;
-
def : RsqPat<V_RSQ_F32_e32, f32>;
+let OtherPredicates = [UnsafeFPMath] in {
+
// Convert (x - floor(x)) to fract(x)
def : GCNPat <
(f32 (fsub (f32 (VOP3Mods f32:$x, i32:$mods)),
Index: llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
+++ llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
@@ -658,7 +658,7 @@
class RsqPat<Instruction RsqInst, ValueType vt> : AMDGPUPat <
(AMDGPUrcp (fsqrt vt:$src)),
- (RsqInst $src)
+ (RsqInst vt:$src)
>;
// Instructions which select to the same v_min_f*