From: Scott Linder
Date: Fri, 29 Mar 2019 17:49:51 +0000 (+0000)
Subject: [AMDGPU] Add an additional Code Object V3 assembler example
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=6209ea9ff71d9d151e55520e9c6bbf7b3766d88f;p=llvm

[AMDGPU] Add an additional Code Object V3 assembler example

Document the intended use of the `.amdgcn.next_free_{s,v}gpr` in the
context of multiple kernels and functions.

Differential Revision: https://reviews.llvm.org/D59949

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357289 91177308-0d34-0410-b5e6-96231b3b80d8
---

diff --git a/docs/AMDGPUUsage.rst b/docs/AMDGPUUsage.rst
index 598eea10b53..72897f03cee 100644
--- a/docs/AMDGPUUsage.rst
+++ b/docs/AMDGPUUsage.rst
@@ -5019,6 +5019,8 @@ For example, when assembling for a "GFX704" target this will be set to the integer
 value "4". The possible GFX stepping generation numbers are presented in
 :ref:`amdgpu-processors`.
 
+.. _amdgpu-amdhsa-assembler-symbol-next_free_vgpr:
+
 .amdgcn.next_free_vgpr
 ++++++++++++++++++++++
 
@@ -5032,6 +5034,8 @@ May be used to set the `.amdhsa_next_free_vpgr` directive in
 
 May be set at any time, e.g. manually set to zero at the start of each kernel.
 
+.. _amdgpu-amdhsa-assembler-symbol-next_free_sgpr:
+
 .amdgcn.next_free_sgpr
 ++++++++++++++++++++++
 
@@ -5241,6 +5245,80 @@ Here is an example of a minimal assembly source file, defining one HSA kernel:
        ...
    .end_amdgpu_metadata
 
+If an assembly source file contains multiple kernels and/or functions, the
+:ref:`amdgpu-amdhsa-assembler-symbol-next_free_vgpr` and
+:ref:`amdgpu-amdhsa-assembler-symbol-next_free_sgpr` symbols may be reset using
+the ``.set <symbol>, <expression>`` directive. For example, in the case of two
+kernels, where ``function1`` is only called from ``kernel1`` it is sufficient
+to group the function with the kernel that calls it and reset the symbols
+between the two connected components:
+
+.. code-block:: none
+
+   .amdgcn_target "amdgcn-amd-amdhsa--gfx900+xnack" // optional
+
+   // gpr tracking symbols are implicitly set to zero
+
+   .text
+   .globl kern0
+   .p2align 8
+   .type kern0,@function
+   kern0:
+     // ...
+     s_endpgm
+   .Lkern0_end:
+   .size kern0, .Lkern0_end-kern0
+
+   .rodata
+   .p2align 6
+   .amdhsa_kernel kern0
+     // ...
+     .amdhsa_next_free_vgpr .amdgcn.next_free_vgpr
+     .amdhsa_next_free_sgpr .amdgcn.next_free_sgpr
+   .end_amdhsa_kernel
+
+   // reset symbols to begin tracking usage in func1 and kern1
+   .set .amdgcn.next_free_vgpr, 0
+   .set .amdgcn.next_free_sgpr, 0
+
+   .text
+   .hidden func1
+   .global func1
+   .p2align 2
+   .type func1,@function
+   func1:
+     // ...
+     s_setpc_b64 s[30:31]
+   .Lfunc1_end:
+   .size func1, .Lfunc1_end-func1
+
+   .globl kern1
+   .p2align 8
+   .type kern1,@function
+   kern1:
+     // ...
+     s_getpc_b64 s[4:5]
+     s_add_u32 s4, s4, func1@rel32@lo+4
+     s_addc_u32 s5, s5, func1@rel32@lo+4
+     s_swappc_b64 s[30:31], s[4:5]
+     // ...
+     s_endpgm
+   .Lkern1_end:
+   .size kern1, .Lkern1_end-kern1
+
+   .rodata
+   .p2align 6
+   .amdhsa_kernel kern1
+     // ...
+     .amdhsa_next_free_vgpr .amdgcn.next_free_vgpr
+     .amdhsa_next_free_sgpr .amdgcn.next_free_sgpr
+   .end_amdhsa_kernel
+
+These symbols cannot identify connected components in order to automatically
+track the usage for each kernel. However, in some cases careful organization of
+the kernels and functions in the source file means there is minimal additional
+effort required to accurately calculate GPR usage.
+
 Additional Documentation
 ========================
 
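
Note on the patch above: the reset between connected components can be illustrated with a small model. The following Python sketch is not part of the commit and is not how the assembler is implemented. It assumes the symbols behave as "highest explicitly referenced register number plus one, never decreasing unless reset", and the helper name `track_next_free`, the regular expression, and the sample instruction bodies are all hypothetical.

```python
import re


def track_next_free(instructions, start=0, prefix="v"):
    """Model a next_free symbol: highest referenced register index plus one.

    `prefix` is "v" for VGPRs or "s" for SGPRs. Both single registers (v3)
    and ranges (s[4:5]) are recognized. `start` is the symbol's value before
    the first instruction, so start=0 models a `.set <symbol>, 0` reset.
    """
    pattern = re.compile(rf"\b{prefix}(?:(\d+)|\[(\d+):(\d+)\])")
    next_free = start
    for inst in instructions:
        for single, _lo, hi in pattern.findall(inst):
            highest = int(single) if single else int(hi)
            # The symbol only ever grows while tracking.
            next_free = max(next_free, highest + 1)
    return next_free


# Hypothetical instruction bodies standing in for the elided "// ..." parts.
kern0 = ["v_mov_b32 v7, 0", "s_mov_b32 s3, 0", "s_endpgm"]
func1 = ["v_add_u32 v2, v0, v1", "s_setpc_b64 s[30:31]"]
kern1 = ["s_getpc_b64 s[4:5]", "s_swappc_b64 s[30:31], s[4:5]", "s_endpgm"]

# First connected component: kern0 on its own.
kern0_vgpr = track_next_free(kern0, prefix="v")
kern0_sgpr = track_next_free(kern0, prefix="s")

# Second connected component: func1 grouped with its only caller, kern1.
component = func1 + kern1
kern1_vgpr_reset = track_next_free(component, start=0, prefix="v")
kern1_sgpr_reset = track_next_free(component, start=0, prefix="s")

# Without the reset, kern1 inherits kern0's VGPR high-water mark.
kern1_vgpr_no_reset = track_next_free(component, start=kern0_vgpr, prefix="v")
```

In this model, skipping the reset leaves `kern1` with the VGPR count of `kern0`, while resetting and grouping `func1` with `kern1` yields a count that covers the callee's registers as well.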