[PATCH] D25183: AMDGPU: Assume spilling will occur at -O0

Mon Oct 3 05:46:13 PDT 2016

arsenm created this revision.
arsenm added a subscriber: llvm-commits.
Herald added subscribers: tony-tye, yaxunl, nhaehnle, wdng, kzhuravl, qcolombet.
Herald added a reviewer: tstellarAMD.

Fast regalloc spills every value that is live at the end of a
block, so at -O0 assume that spilling will occur and use the
input scratch resource descriptor directly, avoiding copies of it.
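
The decision this patch makes can be sketched in isolation. The following is a minimal standalone model, not the code in SIISelLowering.cpp: the OptLevel enum is a hypothetical stand-in for CodeGenOpt::Level, and needsScratchResource is an illustrative name that does not exist in LLVM.

```cpp
// Hypothetical stand-in for llvm::CodeGenOpt::Level (assumption for
// illustration; the real enum lives in LLVM's headers).
enum class OptLevel { None, Less, Default, Aggressive };

// Illustrative helper (not an LLVM function): decide whether lowering
// should treat the function as if it had stack objects, and therefore
// set up the private buffer resource up front.
bool needsScratchResource(bool HasStackObjects, OptLevel Level) {
  // At -O0 the fast register allocator spills everything live out of a
  // block, so spilling is assumed even when no stack objects exist yet.
  if (Level == OptLevel::None)
    return true;
  // Otherwise only functions that already have stack objects qualify.
  return HasStackObjects;
}
```

With this model, a function with no stack objects compiled at -O0 is treated the same as one that already has stack objects, while the same function at a higher optimization level is left alone.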


https://reviews.llvm.org/D25183

Files:
  lib/Target/AMDGPU/SIISelLowering.cpp
  test/CodeGen/AMDGPU/private-access-no-objects.ll
  test/CodeGen/AMDGPU/spill-m0.ll


Index: test/CodeGen/AMDGPU/spill-m0.ll
===================================================================
--- test/CodeGen/AMDGPU/spill-m0.ll
+++ test/CodeGen/AMDGPU/spill-m0.ll
@@ -18,7 +18,7 @@
 
 ; TOSMEM: s_mov_b32 vcc_hi, m0
 ; TOSMEM-NOT: vcc_hi
-; TOSMEM: s_buffer_store_dword vcc_hi, s[84:87], s89 ; 4-byte Folded Spill
+; TOSMEM: s_buffer_store_dword vcc_hi, s[84:87], s3 ; 4-byte Folded Spill
 ; TOSMEM: s_waitcnt lgkmcnt(0)
 
 ; GCN: s_cbranch_scc1 [[ENDIF:BB[0-9]+_[0-9]+]]
@@ -32,7 +32,7 @@
 ; TOVMEM: v_readfirstlane_b32 vcc_hi, [[RELOAD_VREG]]
 ; TOVMEM: s_mov_b32 m0, vcc_hi
 
-; TOSMEM: s_buffer_load_dword vcc_hi, s[84:87], s89 ; 4-byte Folded Reload
+; TOSMEM: s_buffer_load_dword vcc_hi, s[84:87], s3 ; 4-byte Folded Reload
 ; TOSMEM-NOT: vcc_hi
 ; TOSMEM: s_mov_b32 m0, vcc_hi
 
Index: test/CodeGen/AMDGPU/private-access-no-objects.ll
===================================================================
--- /dev/null
+++ test/CodeGen/AMDGPU/private-access-no-objects.ll
@@ -0,0 +1,33 @@
+; RUN: llc -O0 -mtriple=amdgcn--amdhsa -mcpu=fiji -verify-machineinstrs < %s | FileCheck -check-prefix=GCN -check-prefix=VI -check-prefix=OPTNONE %s
+
+; GCN-LABEL: {{^}}store_to_undef:
+
+; -O0 should assume spilling, so the input scratch resource descriptor
+; should be used directly without any copies.
+
+; OPTNONE-NOT: s_mov_b32
+; OPTNONE: buffer_store_dword v{{[0-9]+}}, v{{[0-9]+}}, s[0:3], s7 offen{{$}}
+define void @store_to_undef() #0 {
+  store volatile i32 0, i32* undef
+  ret void
+}
+
+; GCN-LABEL: {{^}}store_to_inttoptr:
+define void @store_to_inttoptr() #0 {
+  store volatile i32 0, i32* inttoptr (i32 123 to i32*)
+  ret void
+}
+
+; GCN-LABEL: {{^}}load_from_undef:
+define void @load_from_undef() #0 {
+  %ld = load volatile i32, i32* undef
+  ret void
+}
+
+; GCN-LABEL: {{^}}load_from_inttoptr:
+define void @load_from_inttoptr() #0 {
+  %ld = load volatile i32, i32* inttoptr (i32 123 to i32*)
+  ret void
+}
+
+attributes #0 = { nounwind }
Index: lib/Target/AMDGPU/SIISelLowering.cpp
===================================================================
--- lib/Target/AMDGPU/SIISelLowering.cpp
+++ lib/Target/AMDGPU/SIISelLowering.cpp
@@ -875,8 +875,12 @@
   if (HasStackObjects)
     Info->setHasNonSpillStackObjects(true);
 
+  // Everything live out of a block is spilled with fast regalloc, so it's
+  // almost certain that spilling will be required.
+  if (getTargetMachine().getOptLevel() == CodeGenOpt::None)
+    HasStackObjects = true;
+
   if (ST.isAmdCodeObjectV2()) {
-    // TODO: Assume we will spill without optimizations.
     if (HasStackObjects) {
       // If we have stack objects, we unquestionably need the private buffer
       // resource. For the Code Object V2 ABI, this will be the first 4 user

