[llvm] GlobalISel needs fdiv 1 / sqrt(x) to rsq combine (PR #78673)
Pierre van Houtryve via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 30 00:08:23 PST 2024
================
@@ -0,0 +1,186 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -run-pass=amdgpu-postlegalizer-combiner -verify-machineinstrs %s -o - | FileCheck -check-prefix=GCN %s
+
+---
+name: rsq_f16
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0
+
+ ; GCN-LABEL: name: rsq_f16
+ ; GCN: liveins: $vgpr0
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
+ ; GCN-NEXT: %x:_(s16) = G_TRUNC [[COPY]](s32)
+ ; GCN-NEXT: %three:_(s16) = G_FCONSTANT half 0xH4200
+ ; GCN-NEXT: [[INT:%[0-9]+]]:_(s16) = contract G_INTRINSIC intrinsic(@llvm.amdgcn.rsq), %x(s16)
+ ; GCN-NEXT: %rsq:_(s16) = contract G_FMUL [[INT]], %three
+ ; GCN-NEXT: %ext:_(s32) = G_ANYEXT %rsq(s16)
+ ; GCN-NEXT: $vgpr0 = COPY %ext(s32)
+ %0:_(s32) = COPY $vgpr0
+ %x:_(s16) = G_TRUNC %0:_(s32)
+ %sqrt:_(s16) = contract G_FSQRT %x
+ %three:_(s16) = G_FCONSTANT half 3.0
+ %rsq:_(s16) = contract G_FDIV %three, %sqrt
----------------
Pierre-vh wrote:
Almost done, it just needs a bit more testing.
It needs tests for 1/sqrt and -1/sqrt (ideally for every type too) so we can see what happens in those cases. It would also be nice to have more variety in the numerators: try values like ±0.5, or bigger numbers like 10.
I'm wondering whether a numerator of 0.5 gets converted to rsq/2.
You can just change all these tests to use +1 or -1, and then add a few extra tests with different numerators at the bottom. I don't think we need those for every type.
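
For reference, here is a small numerical sketch (not part of the PR) of the identity the combine relies on, c / sqrt(x) = rsq(x) * c, evaluated for the numerators mentioned above. It assumes `rsq` can be modelled as a plain `1.0 / sqrt(x)` in double precision; the helper names are hypothetical, and it says nothing about what the combiner actually emits for each constant, which only the regenerated CHECK lines can show.

```python
import math

# Numerators from the review comment, plus the 3.0 used in the existing test.
NUMERATORS = [1.0, -1.0, 0.5, -0.5, 3.0, 10.0]


def rsq(x: float) -> float:
    """Reference reciprocal square root; a stand-in for llvm.amdgcn.rsq (assumption)."""
    return 1.0 / math.sqrt(x)


def fdiv_by_sqrt(c: float, x: float) -> float:
    """Original pattern: c / sqrt(x)."""
    return c / math.sqrt(x)


def combined(c: float, x: float) -> float:
    """Rewritten pattern, shaped like the CHECK lines for 3.0: rsq(x) * c."""
    return rsq(x) * c


def max_relative_error(xs):
    """Largest relative difference between the two forms over all numerators."""
    worst = 0.0
    for c in NUMERATORS:
        for x in xs:
            ref = fdiv_by_sqrt(c, x)
            got = combined(c, x)
            worst = max(worst, abs(got - ref) / abs(ref))
    return worst


if __name__ == "__main__":
    print(max_relative_error([0.25, 1.0, 2.0, 9.0, 1e6]))
```

The two forms agree only up to rounding, not bit for bit, which is why the comparison uses a relative error rather than equality.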
https://github.com/llvm/llvm-project/pull/78673