[llvm] GlobalISel needs fdiv 1 / sqrt(x) to rsq combine (PR #78673)

Pierre van Houtryve via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 30 00:08:23 PST 2024


================
@@ -0,0 +1,186 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -run-pass=amdgpu-postlegalizer-combiner -verify-machineinstrs %s -o - | FileCheck -check-prefix=GCN %s
+
+---
+name:            rsq_f16
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $vgpr0
+
+    ; GCN-LABEL: name: rsq_f16
+    ; GCN: liveins: $vgpr0
+    ; GCN-NEXT: {{  $}}
+    ; GCN-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
+    ; GCN-NEXT: %x:_(s16) = G_TRUNC [[COPY]](s32)
+    ; GCN-NEXT: %three:_(s16) = G_FCONSTANT half 0xH4200
+    ; GCN-NEXT: [[INT:%[0-9]+]]:_(s16) = contract G_INTRINSIC intrinsic(@llvm.amdgcn.rsq), %x(s16)
+    ; GCN-NEXT: %rsq:_(s16) = contract G_FMUL [[INT]], %three
+    ; GCN-NEXT: %ext:_(s32) = G_ANYEXT %rsq(s16)
+    ; GCN-NEXT: $vgpr0 = COPY %ext(s32)
+    %0:_(s32) = COPY $vgpr0
+    %x:_(s16) = G_TRUNC %0:_(s32)
+    %sqrt:_(s16) = contract G_FSQRT %x
+    %three:_(s16) = G_FCONSTANT half 3.0
+    %rsq:_(s16) = contract G_FDIV %three, %sqrt
----------------
Pierre-vh wrote:

Almost done, it just needs a bit more testing.

It needs tests for 1/sqrt and -1/sqrt (ideally for every type too) so we can see what happens in those cases. It would also be nice to have more variety in the numerators: try values like +-0.5, or bigger numbers like 10.
I'm wondering whether 0.5 gets converted to rsq/2.

You can change all these tests to use +1 or -1, and then add a few extra tests with different numerators at the bottom. I don't think those are needed for every type.
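
For reference, the rewrite that the check lines above expect (`G_FDIV %three, %sqrt` becoming `G_FMUL` of the `llvm.amdgcn.rsq` result and `%three`) relies on the identity c / sqrt(x) = c * rsq(x). The sketch below checks that identity numerically for the numerators suggested above. It is only an illustration in double precision: the function names are hypothetical, `rsq` here is a plain `1 / sqrt(x)` stand-in rather than the hardware instruction, and it says nothing about what the combiner actually emits for 0.5 or about f16 accuracy.

```python
import math

# Hypothetical helper names; none of these come from the patch.

def rsq(x: float) -> float:
    """Stand-in for llvm.amdgcn.rsq: a plain double-precision 1 / sqrt(x)."""
    return 1.0 / math.sqrt(x)

def fdiv_by_sqrt(numerator: float, x: float) -> float:
    """Original pattern: numerator / sqrt(x)."""
    return numerator / math.sqrt(x)

def fmul_by_rsq(numerator: float, x: float) -> float:
    """Rewritten pattern: rsq(x) * numerator."""
    return rsq(x) * numerator

# Numerators mentioned in the review: +-1, +-0.5, the existing 3.0, and 10.
NUMERATORS = [1.0, -1.0, 0.5, -0.5, 3.0, 10.0]

def max_relative_error(xs):
    """Largest relative difference between the two patterns over xs."""
    worst = 0.0
    for x in xs:
        for c in NUMERATORS:
            before = fdiv_by_sqrt(c, x)
            after = fmul_by_rsq(c, x)
            worst = max(worst, abs(before - after) / abs(before))
    return worst

if __name__ == "__main__":
    print(max_relative_error([0.25, 2.0, 9.0, 1.0e6]))
```

The two forms can differ by a rounding step or two, which is presumably why the tests carry the `contract` flag on both the `G_FSQRT` and the `G_FDIV`.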

https://github.com/llvm/llvm-project/pull/78673

