[llvm] r357289 - [AMDGPU] Add an additional Code Object V3 assembler example
Scott Linder via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 29 10:49:51 PDT 2019
Author: scott.linder
Date: Fri Mar 29 10:49:51 2019
New Revision: 357289
URL: http://llvm.org/viewvc/llvm-project?rev=357289&view=rev
Log:
[AMDGPU] Add an additional Code Object V3 assembler example
Document the intended use of the `.amdgcn.next_free_{s,v}gpr` symbols in
the context of multiple kernels and functions.
Differential Revision: https://reviews.llvm.org/D59949
Modified:
llvm/trunk/docs/AMDGPUUsage.rst
Modified: llvm/trunk/docs/AMDGPUUsage.rst
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/AMDGPUUsage.rst?rev=357289&r1=357288&r2=357289&view=diff
==============================================================================
--- llvm/trunk/docs/AMDGPUUsage.rst (original)
+++ llvm/trunk/docs/AMDGPUUsage.rst Fri Mar 29 10:49:51 2019
@@ -5019,6 +5019,8 @@ For example, when assembling for a "GFX7
integer value "4". The possible GFX stepping generation numbers are presented
in :ref:`amdgpu-processors`.
+.. _amdgpu-amdhsa-assembler-symbol-next_free_vgpr:
+
.amdgcn.next_free_vgpr
++++++++++++++++++++++
@@ -5032,6 +5034,8 @@ May be used to set the `.amdhsa_next_fre
May be set at any time, e.g. manually set to zero at the start of each kernel.
+.. _amdgpu-amdhsa-assembler-symbol-next_free_sgpr:
+
.amdgcn.next_free_sgpr
++++++++++++++++++++++
@@ -5241,6 +5245,80 @@ Here is an example of a minimal assembly
...
.end_amdgpu_metadata
+If an assembly source file contains multiple kernels and/or functions, the
+:ref:`amdgpu-amdhsa-assembler-symbol-next_free_vgpr` and
+:ref:`amdgpu-amdhsa-assembler-symbol-next_free_sgpr` symbols may be reset using
+the ``.set <symbol>, <expression>`` directive. For example, in the case of two
+kernels, where ``func1`` is only called from ``kern1``, it is sufficient
+to group the function with the kernel that calls it and reset the symbols
+between the two connected components:
+
+.. code-block:: none
+
+ .amdgcn_target "amdgcn-amd-amdhsa--gfx900+xnack" // optional
+
+ // gpr tracking symbols are implicitly set to zero
+
+ .text
+ .globl kern0
+ .p2align 8
+ .type kern0,@function
+ kern0:
+ // ...
+ s_endpgm
+ .Lkern0_end:
+ .size kern0, .Lkern0_end-kern0
+
+ .rodata
+ .p2align 6
+ .amdhsa_kernel kern0
+ // ...
+ .amdhsa_next_free_vgpr .amdgcn.next_free_vgpr
+ .amdhsa_next_free_sgpr .amdgcn.next_free_sgpr
+ .end_amdhsa_kernel
+
+ // reset symbols to begin tracking usage in func1 and kern1
+ .set .amdgcn.next_free_vgpr, 0
+ .set .amdgcn.next_free_sgpr, 0
+
+ .text
+ .hidden func1
+ .global func1
+ .p2align 2
+ .type func1,@function
+ func1:
+ // ...
+ s_setpc_b64 s[30:31]
+ .Lfunc1_end:
+ .size func1, .Lfunc1_end-func1
+
+ .globl kern1
+ .p2align 8
+ .type kern1,@function
+ kern1:
+ // ...
+ s_getpc_b64 s[4:5]
+ s_add_u32 s4, s4, func1@rel32@lo+4
+ s_addc_u32 s5, s5, func1@rel32@hi+4
+ s_swappc_b64 s[30:31], s[4:5]
+ // ...
+ s_endpgm
+ .Lkern1_end:
+ .size kern1, .Lkern1_end-kern1
+
+ .rodata
+ .p2align 6
+ .amdhsa_kernel kern1
+ // ...
+ .amdhsa_next_free_vgpr .amdgcn.next_free_vgpr
+ .amdhsa_next_free_sgpr .amdgcn.next_free_sgpr
+ .end_amdhsa_kernel
+
+These symbols cannot identify connected components in order to automatically
+track the usage for each kernel. However, in some cases careful organization of
+the kernels and functions in the source file means there is minimal additional
+effort required to accurately calculate GPR usage.
+
Additional Documentation
========================
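
The effect of resetting the symbols between connected components, as the
added documentation describes, can be illustrated with a small Python model.
This is a rough sketch and not LLVM's implementation: it assumes each symbol
is raised to the highest explicitly referenced register index plus one, and it
ignores implicitly used registers. The names `GprTracker`, `observe`, and
`usage_per_component`, and the instruction bodies, are hypothetical and only
stand in for the elided `// ...` code.

```python
import re

# Matches explicit register operands such as v7, s2, v[0:3] or s[30:31].
_REG = re.compile(r"\b([vs])(?:(\d+)|\[\d+:(\d+)\])")


class GprTracker:
    """Toy model of the .amdgcn.next_free_{v,s}gpr symbols (assumed semantics)."""

    def __init__(self):
        # The symbols start at zero.
        self.next_free = {"v": 0, "s": 0}

    def reset(self):
        # Models `.set .amdgcn.next_free_vgpr, 0` and the SGPR counterpart.
        self.next_free = {"v": 0, "s": 0}

    def observe(self, instruction):
        # Raise each symbol to (highest referenced register index + 1).
        for kind, single, range_hi in _REG.findall(instruction):
            index = int(single or range_hi)
            self.next_free[kind] = max(self.next_free[kind], index + 1)


def usage_per_component(components):
    """Return the symbol values seen at the end of each connected component,
    resetting the symbols before each one."""
    tracker = GprTracker()
    results = []
    for instructions in components:
        tracker.reset()
        for instruction in instructions:
            tracker.observe(instruction)
        results.append(dict(tracker.next_free))
    return results


# Hypothetical bodies for the two components: kern0, then func1 + kern1.
kern0 = ["v_mov_b32 v7, s2", "s_endpgm"]
func1_and_kern1 = [
    "v_add_f32 v1, v0, v0",
    "s_setpc_b64 s[30:31]",
    "s_getpc_b64 s[4:5]",
    "s_swappc_b64 s[30:31], s[4:5]",
    "s_endpgm",
]

usage = usage_per_component([kern0, func1_and_kern1])
```

In this model, omitting the reset would carry the VGPR count from `kern0`
into the descriptor for `kern1`, which is the over-reporting the `.set`
directives in the example are meant to avoid.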