[PATCH] D65236: [AMDGPU] Increase kernel padding

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 24 11:29:07 PDT 2019


rampitec created this revision.
rampitec added reviewers: kzhuravl, msearles.
Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, arsenm.

To support prefetch mode 3 we need to pad the current
cacheline and fill 3 more cachelines after it. The current
padding (32 dwords, i.e. 2 cachelines) is only sufficient for mode 2.
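
The arithmetic behind the new constant can be sketched as follows. This is
only an illustration, not code from the patch: the 64-byte cacheline is
inferred from the `.p2alignl 6` directive, the 4-byte word size from the
`.fill` element size, and the helper names fillDwords and alignPadBytes are
hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 64-byte instruction cacheline (2^6, matching `.p2alignl 6`).
constexpr uint32_t CachelineBytes = 64;
// Assumption: each s_code_end word is 4 bytes (the `.fill` element size).
constexpr uint32_t InstrBytes = 4;

// Hypothetical helper: number of s_code_end dwords needed to fill
// `Cachelines` whole cachelines after the alignment padding.
constexpr uint32_t fillDwords(uint32_t Cachelines) {
  return Cachelines * CachelineBytes / InstrBytes;
}

// Hypothetical helper: bytes of padding needed to advance `Offset`
// to the next cacheline boundary (0 if it is already aligned).
constexpr uint32_t alignPadBytes(uint64_t Offset) {
  return static_cast<uint32_t>(
      (CachelineBytes - Offset % CachelineBytes) % CachelineBytes);
}

// The old fill covered 2 cachelines, the new one covers 3.
static_assert(fillDwords(2) == 32, "old .fill count");
static_assert(fillDwords(3) == 48, "new .fill count");
```

Under these assumptions, 3 cachelines * 64 bytes / 4 bytes per word gives the
48 words emitted after the alignment, which matches the new `.fill 48, 4, ...`
directive and the loop bound in the object streamer.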


https://reviews.llvm.org/D65236

Files:
  lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
  test/CodeGen/AMDGPU/s_code_end.ll


Index: test/CodeGen/AMDGPU/s_code_end.ll
===================================================================
--- test/CodeGen/AMDGPU/s_code_end.ll
+++ test/CodeGen/AMDGPU/s_code_end.ll
@@ -35,47 +35,14 @@
 ; GCN-ASM-NEXT:   [[END_LABEL3:\.Lfunc_end.*]]:
 ; GCN-ASM-NEXT:           .size   a_function, [[END_LABEL3]]-a_function
 ; GFX10END-ASM:           .p2alignl 6, 3214868480
-; GFX10END-ASM-NEXT:      .fill 32, 4, 3214868480
+; GFX10END-ASM-NEXT:      .fill 48, 4, 3214868480
 ; GFX10NOEND-NOT:         .fill
 
 ; GFX10NOEND-OBJ-NOT:     s_code_end
 ; GFX10END-OBJ-NEXT:      s_code_end
 
 ; GFX10END-OBJ:           s_code_end // 000000000140:
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
-; GFX10END-OBJ-NEXT:      s_code_end
+; GFX10END-OBJ-COUNT-47: s_code_end
 
 define void @a_function() {
   ret void
Index: lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
===================================================================
--- lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+++ lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
@@ -250,7 +250,7 @@
 bool AMDGPUTargetAsmStreamer::EmitCodeEnd() {
   const uint32_t Encoded_s_code_end = 0xbf9f0000;
   OS << "\t.p2alignl 6, " << Encoded_s_code_end << '\n';
-  OS << "\t.fill 32, 4, " << Encoded_s_code_end << '\n';
+  OS << "\t.fill 48, 4, " << Encoded_s_code_end << '\n';
   return true;
 }
 
@@ -602,7 +602,7 @@
   MCStreamer &OS = getStreamer();
   OS.PushSection();
   OS.EmitValueToAlignment(64, Encoded_s_code_end, 4);
-  for (unsigned I = 0; I < 32; ++I)
+  for (unsigned I = 0; I < 48; ++I)
     OS.EmitIntValue(Encoded_s_code_end, 4);
   OS.PopSection();
   return true;

