[llvm] [WIP][AMDGPU] Fix emitting illegal COPY (PR #131752)
via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 18 01:10:04 PDT 2025
llvmbot wrote:
@llvm/pr-subscribers-backend-amdgpu
Author: Pankaj Dwivedi (PankajDwivedi-25)
<details>
<summary>Changes</summary>
The shader calling convention (`amdgpu_ps`) expects the return value to be in an SGPR, but instruction selection introduces a `COPY` from a VGPR into `$sgpr0`, which is illegal here. Should it insert a `readfirstlane` instead of the copy? The MIR in question:
> bb.0 (%ir-block.0):
> liveins: $sgpr0, $sgpr1
> %1:sgpr_32 = COPY $sgpr1
> %0:sgpr_32 = COPY $sgpr0
> %3:sreg_32 = S_MOV_B32 2147483647
> %5:vgpr_32 = COPY %0:sgpr_32
> %6:vgpr_32 = COPY %1:sgpr_32
> %4:vgpr_32 = V_BFI_B32_e64 killed %3:sreg_32, %5:vgpr_32, %6:vgpr_32, implicit $exec
> $sgpr0 = COPY %4:vgpr_32
> SI_RETURN_TO_EPILOG $sgpr0
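
For comparison, here is a hedged sketch of how the tail of the block might look if a `V_READFIRSTLANE_B32` were used instead of the direct VGPR-to-SGPR copy. This is an assumption about the intended outcome, not output from this patch; the virtual register `%7` and its register class are hypothetical.

```mir
; Hypothetical replacement for `$sgpr0 = COPY %4:vgpr_32` (assumed, not produced by this PR).
%7:sreg_32 = V_READFIRSTLANE_B32 %4:vgpr_32, implicit $exec
$sgpr0 = COPY %7:sreg_32
SI_RETURN_TO_EPILOG $sgpr0
```

This corresponds to the commented-out IR variant in the test file, which calls `llvm.amdgcn.readfirstlane` explicitly before returning.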
---
Full diff: https://github.com/llvm/llvm-project/pull/131752.diff
1 Files Affected:
- (added) llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll (+25)
``````````diff
diff --git a/llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll b/llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll
new file mode 100644
index 0000000000000..6b14a660f580d
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll
@@ -0,0 +1,25 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; XFAIL: *
+; RUN: llc < %s -mtriple=amdgcn -mcpu=gfx1100 -verify-machineinstrs | FileCheck %s -check-prefixes=GCN
+
+define amdgpu_ps i32 @s_copysign_f32_bf16(float inreg %mag, bfloat inreg %sign.bf16) {
+ %sign = fpext bfloat %sign.bf16 to float
+ %op = call float @llvm.copysign.f32(float %mag, float %sign)
+ %cast = bitcast float %op to i32
+ ret i32 %cast
+}
+
+; define i32 @s_copysign_f32_bf16(float %mag, bfloat %sign.bf16) {
+; %sign = fpext bfloat %sign.bf16 to float
+; %op = call float @llvm.copysign.f32(float %mag, float %sign)
+; %cast = bitcast float %op to i32
+; ret i32 %cast
+; }
+
+; define i32 @s_copysign_f32_bf16(float inreg %mag, bfloat inreg %sign.bf16) {
+; %sign = fpext bfloat %sign.bf16 to float
+; %op = call float @llvm.copysign.f32(float %mag, float %sign)
+; %cast = bitcast float %op to i32
+; %readlane = call i32 @llvm.amdgcn.readfirstlane(i32 %cast)
+; ret i32 %readlane
+; }
\ No newline at end of file
``````````
</details>
https://github.com/llvm/llvm-project/pull/131752