[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for extends and trunc (PR #132383)
Petar Avramovic via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Mon Apr 14 08:21:22 PDT 2025
================
@@ -215,8 +207,8 @@ body: |
; CHECK: liveins: $sgpr0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
- ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:sgpr(s1) = G_TRUNC [[COPY]](s32)
- ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:sgpr(s64) = G_ANYEXT [[TRUNC]](s1)
+ ; CHECK-NEXT: [[DEF:%[0-9]+]]:sgpr(s32) = G_IMPLICIT_DEF
+ ; CHECK-NEXT: [[MV:%[0-9]+]]:sgpr(s64) = G_MERGE_VALUES [[COPY]](s32), [[DEF]](s32)
----------------
petar-avramovic wrote:
Not sure if this is the correct spot to reply to the original comment, but:
> Isn't this a correctness regression? I'm not entirely certain because I remember there was some weirdness around what G_TRUNC means semantically. Can you explain why there is no need for a trunc or bitwise and or something like that in the output?
G_TRUNC and G_ANYEXT are no-ops, except when one of the operands is vcc. Here we have a uniform S1, so trunc + anyext is a no-op.
Trunc to vcc clears the high bits and then does a compare.
Anyext from vcc is a select.
> Note that anyext_s1_to_s32_vgpr does leave a G_AND, so either that test shows a code quality issue or this test is incorrect.
In anyext_s1_to_s32_vgpr we need to lower the vgpr trunc to vcc. The G_AND comes from clearing the high bits for the icmp.
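
To make the argument above concrete, here is a minimal C++ sketch that models the two cases on plain integers. It is not LLVM code: the function names (`anyextViaMerge`, `truncToVcc`, `anyextFromVcc`) are hypothetical, and the select constants 1 and 0 are an assumption made for illustration. It only shows that, for a uniform S1, merging the source with an undefined high half keeps bit 0 intact (all that an anyext has to guarantee), while the vcc path needs the mask before the compare.

```cpp
#include <cassert>
#include <cstdint>

// Uniform (sgpr) case, modeled on integers: the s1 value lives in bit 0 of
// the 32-bit source and the other bits are unspecified. Any-extending to 64
// bits only has to preserve bit 0, so gluing on an arbitrary ("undef") high
// half is enough; no mask is required.
constexpr uint64_t anyextViaMerge(uint32_t lo, uint32_t undefHi) {
  return (static_cast<uint64_t>(undefHi) << 32) | lo;
}

// Divergent case, trunc to vcc: a lane mask bit must be a real boolean, so
// the high bits are cleared first (the G_AND) and the result is compared
// against zero (the icmp).
constexpr bool truncToVcc(uint32_t laneValue) {
  return (laneValue & 1u) != 0u;
}

// Divergent case, anyext from vcc: modeled as a select between two
// constants. The values 1 and 0 are an assumption for this sketch.
constexpr uint32_t anyextFromVcc(bool cond) {
  return cond ? 1u : 0u;
}

// Bit 0 survives the merge no matter what the undefined high half holds.
static_assert((anyextViaMerge(0x5u, 0xDEADBEEFu) & 1u) == 1u, "bit 0 kept");
static_assert((anyextViaMerge(0x4u, 0u) & 1u) == 0u, "bit 0 kept");
// Without the mask, 0xFFFFFFFE would compare as non-zero; with it, false.
static_assert(!truncToVcc(0xFFFFFFFEu), "high bits must not leak into vcc");
static_assert(anyextFromVcc(truncToVcc(0x3u)) == 1u, "round trip via vcc");
```

The asymmetry is the point: in the uniform case no instruction ever inspects the bits above bit 0, whereas the compare in the vcc case looks at the whole register, so the mask is needed there.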
https://github.com/llvm/llvm-project/pull/132383