[llvm] [AMDGPU][NFC] Precommit tests representing spills. (PR #115270)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 6 22:15:55 PST 2024
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-backend-amdgpu
Author: Pravin Jagtap (pravinjagtap)
<details>
<summary>Changes</summary>
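Presently, when an AGPR tuple spill is expanded through the
stack, the first spill instruction marks the tuple only as
implicit-def and not as implicit. When VGPRs are available
for the spill, the tuple is marked as both.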
---
Full diff: https://github.com/llvm/llvm-project/pull/115270.diff
2 Files Affected:
- (added) llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir (+69)
- (added) llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir (+70)
``````````diff
diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir b/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
new file mode 100644
index 00000000000000..f7680c418b8fa9
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
@@ -0,0 +1,69 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog,machine-cp -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
+
+# When VGPRs are available for spilling, prologepilog marks the tuple implicit-def as well as implicit in the first spill instruction.
+# As a consequence, machine-cp would NOT delete agpr2 copy here.
+
+---
+name: agpr-spill-to-vgpr-machine-cp
+tracksRegLiveness: true
+stack:
+ - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+ stackPtrOffsetReg: '$sgpr32'
+ hasSpilledVGPRs: true
+body: |
+ bb.0:
+ successors:
+ liveins: $vgpr0, $vgpr1
+ ; GFX908-LABEL: name: agpr-spill-to-vgpr-machine-cp
+ ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15, $vgpr16, $vgpr17, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33
+ ; GFX908-NEXT: {{ $}}
+ ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ ; GFX908-NEXT: $vgpr33 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: $vgpr32 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+ ; GFX908-NEXT: $vgpr31 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: S_ENDPGM 0
+ renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+ S_ENDPGM 0
+...
+
+# When VGPRs are NOT available for spilling (stack is used), prologepilog marks the tuple implicit-def only and NOT implicit.
+# As a consequence, machine-cp would delete agpr2 copy here.
+
+---
+name: agpr-spill-to-vgpr-to-stack-machine-cp
+tracksRegLiveness: true
+stack:
+ - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+ stackPtrOffsetReg: '$sgpr32'
+ hasSpilledVGPRs: true
+body: |
+ bb.0:
+ successors:
+ liveins: $vgpr0, $vgpr1
+ ; GFX908-LABEL: name: agpr-spill-to-vgpr-to-stack-machine-cp
+ ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33, $vgpr34, $vgpr35, $vgpr36, $vgpr37, $vgpr38, $vgpr39, $vgpr48, $vgpr49, $vgpr50, $vgpr51, $vgpr52, $vgpr53, $vgpr54, $vgpr55
+ ; GFX908-NEXT: {{ $}}
+ ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ ; GFX908-NEXT: $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+ ; GFX908-NEXT: $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+ ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, implicit $exec, implicit $agpr0_agpr1_agpr2 :: (store (s32) into %stack.0, addrspace 5)
+ ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+ ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec :: (store (s32) into %stack.0 + 4, addrspace 5)
+ ; GFX908-NEXT: $vgpr55 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: S_ENDPGM 0
+ renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+ $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+ SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+ S_ENDPGM 0
+...
diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir b/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
new file mode 100644
index 00000000000000..50bba1baed85f4
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
@@ -0,0 +1,70 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
+
+# During spill expansion, when VGPRs are available for spilling (stack is unused), tuple is being marked as
+# implicit-def as well as implicit in the first spill instruction.
+
+---
+name: agpr-spill-to-vgpr
+tracksRegLiveness: true
+stack:
+ - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+ stackPtrOffsetReg: '$sgpr32'
+ hasSpilledVGPRs: true
+body: |
+ bb.0:
+ successors:
+ liveins: $vgpr0, $vgpr1
+ ; GFX908-LABEL: name: agpr-spill-to-vgpr
+ ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15, $vgpr16, $vgpr17, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33
+ ; GFX908-NEXT: {{ $}}
+ ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ ; GFX908-NEXT: $vgpr33 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: $vgpr32 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+ ; GFX908-NEXT: $vgpr31 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: S_ENDPGM 0
+ renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+ S_ENDPGM 0
+...
+
+# During spill expansion, when VGPRs are NOT available for spilling (stack is used), tuple is being marked as
+# implicit-def ONLY and NOT implicit in the first spill instruction.
+
+---
+name: agpr-spill-to-vgpr-to-stack
+tracksRegLiveness: true
+stack:
+ - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+ scratchRSrcReg: $sgpr0_sgpr1_sgpr2_sgpr3
+ stackPtrOffsetReg: '$sgpr32'
+ hasSpilledVGPRs: true
+body: |
+ bb.0:
+ successors:
+ liveins: $vgpr0, $vgpr1
+ ; GFX908-LABEL: name: agpr-spill-to-vgpr-to-stack
+ ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33, $vgpr34, $vgpr35, $vgpr36, $vgpr37, $vgpr38, $vgpr39, $vgpr48, $vgpr49, $vgpr50, $vgpr51, $vgpr52, $vgpr53, $vgpr54, $vgpr55
+ ; GFX908-NEXT: {{ $}}
+ ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ ; GFX908-NEXT: $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+ ; GFX908-NEXT: $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+ ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, implicit $exec, implicit $agpr0_agpr1_agpr2 :: (store (s32) into %stack.0, addrspace 5)
+ ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+ ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec :: (store (s32) into %stack.0 + 4, addrspace 5)
+ ; GFX908-NEXT: $vgpr55 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+ ; GFX908-NEXT: S_ENDPGM 0
+ renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+ renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+ $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+ $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+ SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+ S_ENDPGM 0
+...
``````````
</details>
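To illustrate why the `$agpr2` copy survives in one test and disappears in the other, here is a minimal sketch of a backward liveness scan over the expanded spill sequences. It is a toy model under stated assumptions, not the MachineCopyPropagation implementation: registers are plain strings, the tuple is expanded into its three lanes, SGPR and `$exec` operands are ignored, and `dead_copies`, `to_vgpr`, and `to_stack` are hypothetical names introduced only for this example.

```python
# Toy model only: a tuple operand is treated as the set of its lanes.
TUPLE = {"agpr0", "agpr1", "agpr2"}


def dead_copies(instrs):
    """Return the names of COPY instructions whose result is never read.

    Each instruction is (name, defs, uses, is_copy). The list is walked
    backwards while tracking the set of live registers.
    """
    live = set()
    dead = []
    for name, defs, uses, is_copy in reversed(instrs):
        if is_copy and not (defs & live):
            dead.append(name)
            continue  # a removed copy contributes no uses
        live = (live - defs) | uses
    return dead[::-1]


copies = [
    ("copy agpr0", {"agpr0"}, {"vgpr0"}, True),
    ("copy agpr2", {"agpr2"}, {"vgpr1"}, True),
]

# Spill to VGPRs: the first read carries implicit-def AND implicit of the tuple.
to_vgpr = copies + [
    ("read agpr0", {"vgpr33"} | TUPLE, {"agpr0"} | TUPLE, False),
    ("read agpr1", {"vgpr32"}, {"agpr1"}, False),
    ("read agpr2", {"vgpr31"}, {"agpr2"} | TUPLE, False),
]

# Spill through the stack: the first read carries implicit-def only,
# so the tuple is redefined before anything reads agpr2.
to_stack = copies + [
    ("read agpr0", {"vgpr40"} | TUPLE, {"agpr0"}, False),
    ("store +0", set(), {"vgpr40"} | TUPLE, False),
    ("read agpr1", {"vgpr40"}, {"agpr1"}, False),
    ("store +4", set(), {"vgpr40"}, False),
    ("read agpr2", {"vgpr55"}, {"agpr2"} | TUPLE, False),
]

print(dead_copies(to_vgpr))
print(dead_copies(to_stack))
```

In this model the missing implicit use on the first read is the only difference that makes the `$agpr2` copy look dead in the stack case, which matches the behavior the test comments describe.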
https://github.com/llvm/llvm-project/pull/115270