[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for extends and trunc (PR #132383)

Mon Apr 14 08:21:22 PDT 2025

================
@@ -215,8 +207,8 @@ body: |
     ; CHECK: liveins: $sgpr0
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
-    ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:sgpr(s1) = G_TRUNC [[COPY]](s32)
-    ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:sgpr(s64) = G_ANYEXT [[TRUNC]](s1)
+    ; CHECK-NEXT: [[DEF:%[0-9]+]]:sgpr(s32) = G_IMPLICIT_DEF
+    ; CHECK-NEXT: [[MV:%[0-9]+]]:sgpr(s64) = G_MERGE_VALUES [[COPY]](s32), [[DEF]](s32)
----------------
petar-avramovic wrote:

Not sure if this is the correct spot for the original comment, but:

> Isn't this a correctness regression? I'm not entirely certain because I remember there was some weirdness around what G_TRUNC means semantically. Can you explain why there is no need for a trunc or bitwise and or something like that in the output?

G_TRUNC and G_ANYEXT are no-ops, except when one operand is vcc. Here we have a uniform S1, so trunc + anyext is a no-op.
Trunc to vcc clears the high bits, then compares.
Anyext from vcc is a select.

> Note that anyext_s1_to_s32_vgpr does leave a G_AND, so either that test shows a code quality issue or this test is incorrect.

In anyext_s1_to_s32_vgpr we need to lower the vgpr trunc to vcc. The G_AND comes from clearing the high bits for the icmp.
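
To make the cases above concrete, here is a rough model in plain Python rather than MIR. It is only a sketch: the function names are hypothetical, the 1/0 select constants are an assumption, and none of this is the actual RegBankLegalize code.

```python
MASK32 = 0xFFFFFFFF  # width of the 32-bit register being modeled


def trunc_to_vcc(value32: int) -> bool:
    """Model of trunc to vcc: clear the high bits, then compare."""
    low_bit = value32 & 1   # the AND that clears the high bits
    return low_bit != 0     # the compare that yields the boolean


def anyext_from_vcc(cond: bool) -> int:
    """Model of anyext from vcc: a select (constants 1 and 0 assumed here)."""
    return 1 if cond else 0


def uniform_trunc_anyext(value32: int) -> int:
    """Model of the uniform S1 case: trunc + anyext leaves the register as is.

    Only bit 0 is meaningful for an S1; the high bits are unspecified after
    an anyext, so nothing has to be masked.
    """
    return value32 & MASK32
```

In this model the divergent path goes through `trunc_to_vcc` and therefore needs the AND, while the uniform path never inspects the high bits and so needs no masking.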

https://github.com/llvm/llvm-project/pull/132383