[llvm] [WIP][AMDGPU] Fix emitting illegal COPY (PR #131752)

Tue Mar 18 01:10:04 PDT 2025

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Pankaj Dwivedi (PankajDwivedi-25)

<details>
<summary>Changes</summary>

Shader kernel calling convention expects the return value to be in SGPR, instruction selection introduced `COPY` which is illegal/wrong here. Instead of a copy, it should insert `readfirstlane`?

> bb.0 (%ir-block.0):
> liveins: $sgpr0, $sgpr1
> %1:sgpr_32 = COPY $sgpr1
> %0:sgpr_32 = COPY $sgpr0
> %3:sreg_32 = S_MOV_B32 2147483647
> %5:vgpr_32 = COPY %0:sgpr_32
> %6:vgpr_32 = COPY %1:sgpr_32
> %4:vgpr_32 = V_BFI_B32_e64 killed %3:sreg_32, %5:vgpr_32, %6:vgpr_32, implicit $exec
> $sgpr0 = COPY %4:vgpr_32
> SI_RETURN_TO_EPILOG $sgpr0


---
Full diff: https://github.com/llvm/llvm-project/pull/131752.diff


1 Files Affected:

- (added) llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll (+25) 


``````````diff

diff --git a/llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll b/llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll
new file mode 100644
index 0000000000000..6b14a660f580d
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll
@@ -0,0 +1,25 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; XFAIL: *
+; RUN: llc < %s -mtriple=amdgcn -mcpu=gfx1100 -verify-machineinstrs | FileCheck %s -check-prefixes=GCN
+
+define amdgpu_ps i32 @s_copysign_f32_bf16(float inreg %mag, bfloat inreg %sign.bf16) {
+  %sign = fpext bfloat %sign.bf16 to float
+  %op = call float @llvm.copysign.f32(float %mag, float %sign)
+  %cast = bitcast float %op to i32
+  ret i32 %cast
+}
+
+; define i32 @s_copysign_f32_bf16(float %mag, bfloat %sign.bf16) {
+;   %sign = fpext bfloat %sign.bf16 to float
+;   %op = call float @llvm.copysign.f32(float %mag, float %sign)
+;   %cast = bitcast float %op to i32
+;   ret i32 %cast
+; }
+
+; define i32 @s_copysign_f32_bf16(float inreg %mag, bfloat inreg %sign.bf16) {
+;   %sign = fpext bfloat %sign.bf16 to float
+;   %op = call float @llvm.copysign.f32(float %mag, float %sign)
+;   %cast = bitcast float %op to i32
+;   %readlane = call i32 @llvm.amdgcn.readfirstlane(i32 %cast)
+;   ret i32 %readlane
+; }
\ No newline at end of file

``````````

</details>


https://github.com/llvm/llvm-project/pull/131752