[llvm] [AMDGPU] Marking super-reg as implicit-def in first spill instruction (PR #114773)

Pravin Jagtap via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 6 08:16:42 PST 2024


https://github.com/pravinjagtap updated https://github.com/llvm/llvm-project/pull/114773

>From 8f451842cbd687fa746922848fd792d96970455e Mon Sep 17 00:00:00 2001
From: Pravin Jagtap <Pravin.Jagtap at amd.com>
Date: Mon, 4 Nov 2024 16:28:36 +0530
Subject: [PATCH 1/4] [AMDGPU] Machine-CP is deleting incorrect copy instr.

During AGPR tuple spilling, the AMDGPU backend marks
the tuple as implicit-def in the first spill
instruction. This clobbers the register, so
machine-cp treats the copy that defines it as
deletable, which I think is incorrect. Seeking help
to understand whether the following copy is
really deletable here, given that the compiler
marked the tuple implicit-def to preserve
liveness.

renamable $agpr1 = COPY renamable $agpr3, implicit $exec
---
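A toy model of the question above, for discussion only. It is a simplified
sketch in Python, not the actual MachineCopyPropagation code: the names
`Inst`, `lanes`, `make_block` and `run_toy_copy_elimination` are hypothetical,
and the rule it assumes (reads are seen before defs, any def stops tracking a
lane, untracked copies in a block without successors are removed) is only an
approximation of what the pass does.

```python
from dataclasses import dataclass, field


def lanes(reg):
    """Split a tuple name such as 'agpr0_agpr1' into its 32-bit lanes."""
    return set(reg.split("_"))


@dataclass
class Inst:
    text: str
    defs: list = field(default_factory=list)  # explicit and implicit defs
    uses: list = field(default_factory=list)  # explicit and implicit uses
    is_copy: bool = False


def run_toy_copy_elimination(block):
    """Return the instruction texts that survive in a block with no successors."""
    tracked = {}        # lane -> index of the COPY whose value it still holds
    maybe_dead = set()  # indices of COPYs not yet proven live
    for i, inst in enumerate(block):
        # Reads are processed first: a read of a tracked lane keeps its COPY.
        for reg in inst.uses:
            for lane in lanes(reg):
                if lane in tracked:
                    maybe_dead.discard(tracked[lane])
        # Any def, including an implicit-def of a super-register, drops tracking.
        for reg in inst.defs:
            for lane in lanes(reg):
                tracked.pop(lane, None)
        if inst.is_copy:
            for lane in lanes(inst.defs[0]):
                tracked[lane] = i
            maybe_dead.add(i)
    # No successors: COPYs never proven live are removed.
    return [inst.text for i, inst in enumerate(block) if i not in maybe_dead]


def make_block(first_read_uses):
    """Build the four-instruction body of the test, varying the first read's uses."""
    return [
        Inst("$agpr0 = COPY $vgpr0", ["agpr0"], ["vgpr0"], is_copy=True),
        Inst("$agpr1 = COPY $agpr3", ["agpr1"], ["agpr3"], is_copy=True),
        Inst("$vgpr254 = V_ACCVGPR_READ_B32_e64 $agpr0",
             ["vgpr254", "agpr0_agpr1"], first_read_uses),
        Inst("$vgpr255 = V_ACCVGPR_READ_B32_e64 $agpr1", ["vgpr255"], ["agpr1"]),
    ]


# First read carries implicit-def $agpr0_agpr1 only.
implicit_def_only = run_toy_copy_elimination(make_block(["agpr0"]))
# First read carries implicit-def and implicit $agpr0_agpr1.
implicit_def_and_use = run_toy_copy_elimination(make_block(["agpr0", "agpr0_agpr1"]))
```

Under these assumptions `implicit_def_only` loses the `$agpr1` copy, as in the
CHECK lines of the test below, while `implicit_def_and_use` keeps all four
instructions because the tuple read is seen before the tuple def.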
 .../AMDGPU/incorrect-copy-deletion.mir        | 24 +++++++++++++++++++
 1 file changed, 24 insertions(+)
 create mode 100644 llvm/test/CodeGen/AMDGPU/incorrect-copy-deletion.mir

diff --git a/llvm/test/CodeGen/AMDGPU/incorrect-copy-deletion.mir b/llvm/test/CodeGen/AMDGPU/incorrect-copy-deletion.mir
new file mode 100644
index 00000000000000..a4ec8ca825aee4
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/incorrect-copy-deletion.mir
@@ -0,0 +1,24 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass machine-cp -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
+
+---
+name:  foo
+tracksRegLiveness: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $agpr3
+
+    ; GFX908-LABEL: name: foo
+    ; GFX908: liveins: $vgpr0, $agpr3
+    ; GFX908-NEXT: {{  $}}
+    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    ; GFX908-NEXT: $vgpr254 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1
+    ; GFX908-NEXT: $vgpr255 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    ; GFX908-NEXT: S_ENDPGM 0
+    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    renamable $agpr1 = COPY renamable $agpr3, implicit $exec
+    $vgpr254 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1
+    $vgpr255 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    S_ENDPGM 0
+...

>From 59e7bae439fa7327f0082ad6c2703926063a91d5 Mon Sep 17 00:00:00 2001
From: Pravin Jagtap <Pravin.Jagtap at amd.com>
Date: Tue, 5 Nov 2024 18:02:48 +0530
Subject: [PATCH 2/4] Added a test showing that prologepilog marks the src
 tuple as implicit-def while expanding SI_SPILL_AV96_SAVE.

---
 .../AMDGPU/av-spill-implicit-def-tuple.mir    | 31 +++++++++++++++++++
 .../AMDGPU/incorrect-copy-deletion.mir        | 24 --------------
 2 files changed, 31 insertions(+), 24 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/av-spill-implicit-def-tuple.mir
 delete mode 100644 llvm/test/CodeGen/AMDGPU/incorrect-copy-deletion.mir

diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-implicit-def-tuple.mir b/llvm/test/CodeGen/AMDGPU/av-spill-implicit-def-tuple.mir
new file mode 100644
index 00000000000000..ec5d3a75c8d356
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-implicit-def-tuple.mir
@@ -0,0 +1,31 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
+
+---
+name:  test_implicit_def
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 12, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: '$sgpr32'
+  hasSpilledVGPRs: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $vgpr1
+
+    ; GFX908-LABEL: name: test_implicit_def
+    ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4
+    ; GFX908-NEXT: {{  $}}
+    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr4 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: $vgpr3 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr2 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: S_ENDPGM 0
+    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+    S_ENDPGM 0
+...
diff --git a/llvm/test/CodeGen/AMDGPU/incorrect-copy-deletion.mir b/llvm/test/CodeGen/AMDGPU/incorrect-copy-deletion.mir
deleted file mode 100644
index a4ec8ca825aee4..00000000000000
--- a/llvm/test/CodeGen/AMDGPU/incorrect-copy-deletion.mir
+++ /dev/null
@@ -1,24 +0,0 @@
-# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass machine-cp -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
-
----
-name:  foo
-tracksRegLiveness: true
-body: |
-  bb.0:
-    successors:
-    liveins: $vgpr0, $agpr3
-
-    ; GFX908-LABEL: name: foo
-    ; GFX908: liveins: $vgpr0, $agpr3
-    ; GFX908-NEXT: {{  $}}
-    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
-    ; GFX908-NEXT: $vgpr254 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1
-    ; GFX908-NEXT: $vgpr255 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
-    ; GFX908-NEXT: S_ENDPGM 0
-    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
-    renamable $agpr1 = COPY renamable $agpr3, implicit $exec
-    $vgpr254 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1
-    $vgpr255 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
-    S_ENDPGM 0
-...

>From 7d02a1b753ca8c89b53b1affdd70127da7192172 Mon Sep 17 00:00:00 2001
From: Pravin Jagtap <Pravin.Jagtap at amd.com>
Date: Wed, 6 Nov 2024 14:34:45 +0530
Subject: [PATCH 3/4] Modified tests to show how implicit and implicit-def
 operands added during spill expansion affect machine-cp

---
 .../av-spill-expansion-with-machine-cp.mir    | 66 ++++++++++++++++++
 .../AMDGPU/av-spill-implicit-def-tuple.mir    | 31 ---------
 .../AMDGPU/av-spill-to-vgpr-and-stack.mir     | 67 +++++++++++++++++++
 test.mir                                      | 16 +++++
 4 files changed, 149 insertions(+), 31 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
 delete mode 100644 llvm/test/CodeGen/AMDGPU/av-spill-implicit-def-tuple.mir
 create mode 100644 llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
 create mode 100644 test.mir

diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir b/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
new file mode 100644
index 00000000000000..7551d48e713f42
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
@@ -0,0 +1,66 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog,machine-cp -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
+
+---
+name:  agpr-spill-to-vgpr-machine-cp
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: '$sgpr32'
+  hasSpilledVGPRs: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $vgpr1
+
+    ; GFX908-LABEL: name: agpr-spill-to-vgpr-machine-cp
+    ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15, $vgpr16, $vgpr17, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33
+    ; GFX908-NEXT: {{  $}}
+    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr33 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: $vgpr32 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr31 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: S_ENDPGM 0
+    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+    S_ENDPGM 0
+...
+
+---
+name:  agpr-spill-to-vgpr-to-stack-machine-cp
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: '$sgpr32'
+  hasSpilledVGPRs: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $vgpr1
+    ; GFX908-LABEL: name: agpr-spill-to-vgpr-to-stack-machine-cp
+    ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33, $vgpr34, $vgpr35, $vgpr36, $vgpr37, $vgpr38, $vgpr39, $vgpr48, $vgpr49, $vgpr50, $vgpr51, $vgpr52, $vgpr53, $vgpr54, $vgpr55
+    ; GFX908-NEXT: {{  $}}
+    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    ; GFX908-NEXT: $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+    ; GFX908-NEXT: $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+    ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, implicit $exec, implicit $agpr0_agpr1_agpr2 :: (store (s32) into %stack.0, addrspace 5)
+    ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec :: (store (s32) into %stack.0 + 4, addrspace 5)
+    ; GFX908-NEXT: $vgpr55 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: S_ENDPGM 0
+    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+
+    $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+    $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+
+    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+    S_ENDPGM 0
+...
diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-implicit-def-tuple.mir b/llvm/test/CodeGen/AMDGPU/av-spill-implicit-def-tuple.mir
deleted file mode 100644
index ec5d3a75c8d356..00000000000000
--- a/llvm/test/CodeGen/AMDGPU/av-spill-implicit-def-tuple.mir
+++ /dev/null
@@ -1,31 +0,0 @@
-# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
-
----
-name:  test_implicit_def
-tracksRegLiveness: true
-stack:
-  - { id: 0, name: '', type: spill-slot, offset: 0, size: 12, alignment: 4 }
-machineFunctionInfo:
-  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
-  stackPtrOffsetReg: '$sgpr32'
-  hasSpilledVGPRs: true
-body: |
-  bb.0:
-    successors:
-    liveins: $vgpr0, $vgpr1
-
-    ; GFX908-LABEL: name: test_implicit_def
-    ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4
-    ; GFX908-NEXT: {{  $}}
-    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
-    ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
-    ; GFX908-NEXT: $vgpr4 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2
-    ; GFX908-NEXT: $vgpr3 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
-    ; GFX908-NEXT: $vgpr2 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
-    ; GFX908-NEXT: S_ENDPGM 0
-    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
-    renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
-    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
-    S_ENDPGM 0
-...
diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir b/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
new file mode 100644
index 00000000000000..82f501456b5c6d
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
@@ -0,0 +1,67 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
+
+---
+name:  agpr-spill-to-vgpr
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: '$sgpr32'
+  hasSpilledVGPRs: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $vgpr1
+
+    ; GFX908-LABEL: name: agpr-spill-to-vgpr
+    ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5, $vgpr6, $vgpr7, $vgpr8, $vgpr9, $vgpr10, $vgpr11, $vgpr12, $vgpr13, $vgpr14, $vgpr15, $vgpr16, $vgpr17, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33
+    ; GFX908-NEXT: {{  $}}
+    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr33 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: $vgpr32 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr31 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: S_ENDPGM 0
+    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+    S_ENDPGM 0
+...
+
+---
+name:  agpr-spill-to-vgpr-to-stack
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 128, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: '$sgpr32'
+  hasSpilledVGPRs: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $vgpr1
+    ; GFX908-LABEL: name: agpr-spill-to-vgpr-to-stack
+    ; GFX908: liveins: $vgpr0, $vgpr1, $vgpr18, $vgpr19, $vgpr20, $vgpr21, $vgpr22, $vgpr23, $vgpr24, $vgpr25, $vgpr26, $vgpr27, $vgpr28, $vgpr29, $vgpr30, $vgpr31, $vgpr32, $vgpr33, $vgpr34, $vgpr35, $vgpr36, $vgpr37, $vgpr38, $vgpr39, $vgpr48, $vgpr49, $vgpr50, $vgpr51, $vgpr52, $vgpr53, $vgpr54, $vgpr55
+    ; GFX908-NEXT: {{  $}}
+    ; GFX908-NEXT: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    ; GFX908-NEXT: renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+    ; GFX908-NEXT: $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+    ; GFX908-NEXT: $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+    ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit-def $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, implicit $exec, implicit $agpr0_agpr1_agpr2 :: (store (s32) into %stack.0, addrspace 5)
+    ; GFX908-NEXT: $vgpr40 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec
+    ; GFX908-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec :: (store (s32) into %stack.0 + 4, addrspace 5)
+    ; GFX908-NEXT: $vgpr55 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr0_agpr1_agpr2
+    ; GFX908-NEXT: S_ENDPGM 0
+    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
+    renamable $agpr2 = COPY renamable $vgpr1, implicit $exec
+
+    $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9 = IMPLICIT_DEF
+    $vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = IMPLICIT_DEF
+
+    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+    S_ENDPGM 0
+...
diff --git a/test.mir b/test.mir
new file mode 100644
index 00000000000000..a8dd48b6d23688
--- /dev/null
+++ b/test.mir
@@ -0,0 +1,16 @@
+---
+name:  test_implicit_def
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: '', type: spill-slot, offset: 0, size: 12, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  $sgpr0_sgpr1_sgpr2_sgpr3
+  stackPtrOffsetReg: '$sgpr32'
+  hasSpilledVGPRs: true
+body: |
+  bb.0:
+    successors:
+    liveins: $vgpr0, $vgpr1
+    SI_SPILL_AV96_SAVE $agpr0_agpr1_agpr2, %stack.0, $sgpr32, 0, implicit $exec :: (store (s96) into %stack.0, align 4, addrspace 5)
+    S_ENDPGM 0
+...

>From f8dced0a28d3f3975f90d791a4c0bcf6ee23c427 Mon Sep 17 00:00:00 2001
From: Pravin Jagtap <Pravin.Jagtap at amd.com>
Date: Wed, 6 Nov 2024 21:43:51 +0530
Subject: [PATCH 4/4] Added observations for tests.

---
 .../CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir   | 6 ++++++
 llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir     | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir b/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
index 7551d48e713f42..3378ae71987ae1 100644
--- a/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-expansion-with-machine-cp.mir
@@ -1,6 +1,9 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
 # RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog,machine-cp -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
 
+# When VGPRs are available for spilling, prologepilog marks the tuple as implicit-def as well as implicit in the first spill instruction.
+# As a consequence, machine-cp does NOT delete the agpr2 copy here.
+
 ---
 name:  agpr-spill-to-vgpr-machine-cp
 tracksRegLiveness: true
@@ -30,6 +33,9 @@ body: |
     S_ENDPGM 0
 ...
 
+# When VGPRs are NOT available for spilling (stack is used), prologepilog marks the tuple as implicit-def only and NOT implicit.
+# As a consequence, machine-cp deletes the agpr2 copy here.
+
 ---
 name:  agpr-spill-to-vgpr-to-stack-machine-cp
 tracksRegLiveness: true
diff --git a/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir b/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
index 82f501456b5c6d..501683c49b0008 100644
--- a/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
+++ b/llvm/test/CodeGen/AMDGPU/av-spill-to-vgpr-and-stack.mir
@@ -1,6 +1,9 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
 # RUN: llc -mtriple=amdgcn -mcpu=gfx908 %s -o - -run-pass prologepilog -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
 
+# During spill expansion, when VGPRs are available for spilling (stack is unused), the tuple is marked as
+# implicit-def as well as implicit in the first spill instruction.
+
 ---
 name:  agpr-spill-to-vgpr
 tracksRegLiveness: true
@@ -30,6 +33,9 @@ body: |
     S_ENDPGM 0
 ...
 
+# During spill expansion, when VGPRs are NOT available for spilling (stack is used), the tuple is marked as
+# implicit-def ONLY and NOT implicit in the first spill instruction.
+
 ---
 name:  agpr-spill-to-vgpr-to-stack
 tracksRegLiveness: true


