[llvm] r357289 - [AMDGPU] Add an additional Code Object V3 assembler example

Scott Linder via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 29 10:49:51 PDT 2019


Author: scott.linder
Date: Fri Mar 29 10:49:51 2019
New Revision: 357289

URL: http://llvm.org/viewvc/llvm-project?rev=357289&view=rev
Log:
[AMDGPU] Add an additional Code Object V3 assembler example

Document the intended use of the `.amdgcn.next_free_{s,v}gpr` symbols in the
context of multiple kernels and functions.

Differential Revision: https://reviews.llvm.org/D59949

Modified:
    llvm/trunk/docs/AMDGPUUsage.rst

Modified: llvm/trunk/docs/AMDGPUUsage.rst
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/AMDGPUUsage.rst?rev=357289&r1=357288&r2=357289&view=diff
==============================================================================
--- llvm/trunk/docs/AMDGPUUsage.rst (original)
+++ llvm/trunk/docs/AMDGPUUsage.rst Fri Mar 29 10:49:51 2019
@@ -5019,6 +5019,8 @@ For example, when assembling for a "GFX7
 integer value "4". The possible GFX stepping generation numbers are presented
 in :ref:`amdgpu-processors`.
 
+.. _amdgpu-amdhsa-assembler-symbol-next_free_vgpr:
+
 .amdgcn.next_free_vgpr
 ++++++++++++++++++++++
 
@@ -5032,6 +5034,8 @@ May be used to set the `.amdhsa_next_fre
 
 May be set at any time, e.g. manually set to zero at the start of each kernel.
 
+.. _amdgpu-amdhsa-assembler-symbol-next_free_sgpr:
+
 .amdgcn.next_free_sgpr
 ++++++++++++++++++++++
 
@@ -5241,6 +5245,80 @@ Here is an example of a minimal assembly
   ...
   .end_amdgpu_metadata
 
+If an assembly source file contains multiple kernels and/or functions, the
+:ref:`amdgpu-amdhsa-assembler-symbol-next_free_vgpr` and
+:ref:`amdgpu-amdhsa-assembler-symbol-next_free_sgpr` symbols may be reset using
+the ``.set <symbol>, <expression>`` directive. For example, in the case of two
+kernels, where ``func1`` is only called from ``kern1``, it is sufficient
+to group the function with the kernel that calls it and reset the symbols
+between the two connected components:
+
+.. code-block:: none
+
+  .amdgcn_target "amdgcn-amd-amdhsa--gfx900+xnack" // optional
+
+  // gpr tracking symbols are implicitly set to zero
+
+  .text
+  .globl kern0
+  .p2align 8
+  .type kern0,@function
+  kern0:
+    // ...
+    s_endpgm
+  .Lkern0_end:
+    .size   kern0, .Lkern0_end-kern0
+
+  .rodata
+  .p2align 6
+  .amdhsa_kernel kern0
+    // ...
+    .amdhsa_next_free_vgpr .amdgcn.next_free_vgpr
+    .amdhsa_next_free_sgpr .amdgcn.next_free_sgpr
+  .end_amdhsa_kernel
+
+  // reset symbols to begin tracking usage in func1 and kern1
+  .set .amdgcn.next_free_vgpr, 0
+  .set .amdgcn.next_free_sgpr, 0
+
+  .text
+  .hidden func1
+  .global func1
+  .p2align 2
+  .type func1,@function
+  func1:
+    // ...
+    s_setpc_b64 s[30:31]
+  .Lfunc1_end:
+  .size func1, .Lfunc1_end-func1
+
+  .globl kern1
+  .p2align 8
+  .type kern1,@function
+  kern1:
+    // ...
+    s_getpc_b64 s[4:5]
+    s_add_u32 s4, s4, func1@rel32@lo+4
+    s_addc_u32 s5, s5, func1@rel32@lo+4
+    s_swappc_b64 s[30:31], s[4:5]
+    // ...
+    s_endpgm
+  .Lkern1_end:
+    .size   kern1, .Lkern1_end-kern1
+
+  .rodata
+  .p2align 6
+  .amdhsa_kernel kern1
+    // ...
+    .amdhsa_next_free_vgpr .amdgcn.next_free_vgpr
+    .amdhsa_next_free_sgpr .amdgcn.next_free_sgpr
+  .end_amdhsa_kernel
+
+These symbols cannot identify connected components in order to automatically
+track the usage for each kernel. However, in some cases careful organization of
+the kernels and functions in the source file means there is minimal additional
+effort required to accurately calculate GPR usage.
+
 Additional Documentation
 ========================
 
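The grouping-and-reset pattern in the example above can be illustrated with a
small Python sketch. This is a toy model, not the assembler's implementation:
it assumes only the documented behavior that each tracking symbol rises to the
highest explicitly referenced register number plus one, and that ``.set`` can
return it to zero. The ``GprTracker`` class, its method names, and the register
ranges given for the elided ``// ...`` bodies are hypothetical.

```python
class GprTracker:
    """Toy model of the .amdgcn.next_free_vgpr / .amdgcn.next_free_sgpr symbols."""

    def __init__(self):
        # Both symbols are implicitly zero before assembly begins.
        self.next_free = {"v": 0, "s": 0}

    def use(self, kind, first, last=None):
        """Record an instruction that references registers first..last."""
        last = first if last is None else last
        # The symbol only ever grows: highest referenced register plus one.
        self.next_free[kind] = max(self.next_free[kind], last + 1)

    def reset(self):
        """Model `.set .amdgcn.next_free_vgpr, 0` and the SGPR equivalent."""
        self.next_free = {"v": 0, "s": 0}

    def snapshot(self):
        """Values a kernel descriptor would read at this point in the file."""
        return dict(self.next_free)


tracker = GprTracker()

# kern0: hypothetical body using v[0:3] and s[0:7].
tracker.use("v", 0, 3)
tracker.use("s", 0, 7)
kern0_usage = tracker.snapshot()

# Reset between the two connected components.
tracker.reset()

# func1: hypothetical body using v[0:9], then s_setpc_b64 s[30:31].
tracker.use("v", 0, 9)
tracker.use("s", 30, 31)

# kern1: s_getpc_b64 s[4:5] and s_swappc_b64 s[30:31], s[4:5].
tracker.use("s", 4, 5)
tracker.use("s", 30, 31)
kern1_usage = tracker.snapshot()

print(kern0_usage)
print(kern1_usage)
```

Because ``func1`` sits between the reset and the ``kern1`` descriptor, its
registers are counted toward ``kern1`` and not toward ``kern0``. Without the
reset, the ``kern1`` values could never be lower than those recorded for
``kern0``, even if ``kern1`` and ``func1`` used fewer registers.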



