[llvm] AMDGPU/GlobalISel: Regbanklegalize rules for G_FSQRT (PR #179817)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 5 06:15:49 PST 2026
================
@@ -1,40 +1,57 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=amdgcn -run-pass=regbankselect %s -verify-machineinstrs -o - -regbankselect-fast | FileCheck %s
-# RUN: llc -mtriple=amdgcn -run-pass=regbankselect %s -verify-machineinstrs -o - -regbankselect-greedy | FileCheck %s
+# RUN: llc -mtriple=amdgcn -mcpu=gfx900 -run-pass='amdgpu-regbankselect,amdgpu-regbanklegalize' %s -o - | FileCheck -check-prefixes=GCN,GFX9 %s
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1200 -run-pass='amdgpu-regbankselect,amdgpu-regbanklegalize' %s -o - | FileCheck -check-prefixes=GCN,GFX12 %s
---
-name: fsqrt_s
+name: fsqrt_s16_uniform
legalized: true
body: |
bb.0:
- liveins: $sgpr0_sgpr1
- ; CHECK-LABEL: name: fsqrt_s
- ; CHECK: liveins: $sgpr0_sgpr1
- ; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
- ; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY [[COPY]](s32)
- ; CHECK-NEXT: [[FSQRT:%[0-9]+]]:vgpr(s32) = G_FSQRT [[COPY1]]
- ; CHECK-NEXT: $vgpr0 = COPY [[FSQRT]](s32)
+ liveins: $sgpr0
+ ; GFX9-LABEL: name: fsqrt_s16_uniform
+ ; GFX9: liveins: $sgpr0
+ ; GFX9-NEXT: {{ $}}
+ ; GFX9-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
+ ; GFX9-NEXT: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr(s16) = COPY [[TRUNC]](s16)
+ ; GFX9-NEXT: [[FSQRT:%[0-9]+]]:vgpr(s16) = G_FSQRT [[COPY1]]
+ ; GFX9-NEXT: [[ANYEXT:%[0-9]+]]:vgpr(s32) = G_ANYEXT [[FSQRT]](s16)
+ ; GFX9-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
+ ;
+ ; GFX12-LABEL: name: fsqrt_s16_uniform
+ ; GFX12: liveins: $sgpr0
+ ; GFX12-NEXT: {{ $}}
+ ; GFX12-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
+ ; GFX12-NEXT: [[TRUNC:%[0-9]+]]:sgpr(s16) = G_TRUNC [[COPY]](s32)
+ ; GFX12-NEXT: [[FSQRT:%[0-9]+]]:sgpr(s16) = G_FSQRT [[TRUNC]]
+ ; GFX12-NEXT: [[ANYEXT:%[0-9]+]]:sgpr(s32) = G_ANYEXT [[FSQRT]](s16)
+ ; GFX12-NEXT: $vgpr0 = COPY [[ANYEXT]](s32)
%0:_(s32) = COPY $sgpr0
- %1:_(s32) = G_FSQRT %0
- $vgpr0 = COPY %1
+ %1:_(s16) = G_TRUNC %0
+ %2:_(s16) = G_FSQRT %1
+ %3:_(s32) = G_ANYEXT %2
+ $vgpr0 = COPY %3
...
---
-name: fsqrt_v
+name: fsqrt_s16_divergent
----------------
arsenm wrote:
v / VGPR and s / SGPR are a more accurate naming convention. The uniformity is a proxy for the register bank.
https://github.com/llvm/llvm-project/pull/179817