[PATCH] D151997: [AMDGPU] Document amdgpu_cs_chain[_preserve] CCs. NFC
Diana Picus via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 8 00:56:00 PDT 2023
rovka updated this revision to Diff 529522.
rovka added a comment.
Address some of the previous comments and alphabetise.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D151997/new/
https://reviews.llvm.org/D151997
Files:
llvm/docs/AMDGPUUsage.rst
Index: llvm/docs/AMDGPUUsage.rst
===================================================================
--- llvm/docs/AMDGPUUsage.rst
+++ llvm/docs/AMDGPUUsage.rst
@@ -1070,6 +1070,49 @@
..TODO::
Describe.
+ ``amdgpu_cs_chain`` Similar to ``amdgpu_cs``, with differences described below.
+
+ Functions with this calling convention cannot be called directly. They must
+ instead be launched via the ``llvm.amdgcn.cs.chain`` intrinsic.
+
+ Arguments are passed in SGPRs, starting at s0, if they have the ``inreg``
+ attribute, and in VGPRs otherwise, starting at v8. Using more SGPRs or VGPRs
+ than available in the subtarget is not allowed. On subtargets that use
+ a scratch buffer descriptor (as opposed to ``scratch_{load,store}_*`` instructions),
+ the scratch buffer descriptor is passed in s[48:51]. This limits the
+ SGPR / ``inreg`` arguments to the equivalent of 48 dwords; using more
+ than that is not allowed.
+
+ The return type must be void.
+ Varargs and the ``sret``, ``byval``, ``byref``, ``inalloca`` and ``preallocated`` attributes are not supported.
+
+ Values in scalar registers as well as v0-v7 are not preserved. Values in
+ VGPRs starting at v8 are not preserved for the active lanes, but must be
+ saved by the callee for inactive lanes when using WWM.
+
+ Wave scratch is "empty" at function boundaries. There is no stack pointer input
+ or output value, but functions are free to use scratch starting from an initial
+ stack pointer. Calls to ``amdgpu_gfx`` functions are allowed and behave like they
+ do in ``amdgpu_cs`` functions.
+
+ All counters (``lgkmcnt``, ``vmcnt``, ``storecnt``, etc.) are presumed to be in an
+ unknown state at function entry. Waits for regular memory counters are not
+ inserted as part of an ``llvm.amdgcn.cs.chain`` sequence in the function epilog.
+ However, waits for errata / hardware workarounds are added in the epilog:
+
+ * On gfx11+, the function epilog waits for any scratch stores to be confirmed. This
+ works around the issue that we must wait for scratch stores before sending a
+ ``MSG_DEALLOC_VGPRS`` message.
+ * Additional waits may be required (e.g. ``s_waitcnt_depctr``).
+
+ A function may have multiple exits (e.g. one chain exit and one plain ``ret void``
+ for when the wave ends), but all ``llvm.amdgcn.cs.chain`` exits must be in
+ uniform control flow.
+
+ Functions must be aligned to at least 64 bytes.
+
+ ``amdgpu_cs_chain_preserve`` Same as ``amdgpu_cs_chain``, but VGPRs starting at v8 are preserved for active lanes.
+
``amdgpu_es`` Used for AMDPAL shader stage before geometry shader if geometry is in
use. So either the domain (= tessellation evaluation) shader if
tessellation is in use, or otherwise the vertex shader.
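
For readers of the patch, the argument-passing rule for ``amdgpu_cs_chain`` (``inreg`` arguments in SGPRs from s0, everything else in VGPRs from v8, and a 48-dword SGPR cap when the scratch buffer descriptor occupies s[48:51]) can be modelled with a small Python sketch. This is not how the backend implements it. The names `Arg` and `assign_registers`, the contiguous packing with no alignment padding, and the register counts passed in by the caller are all assumptions made purely for illustration.

```python
from dataclasses import dataclass

SGPR_START = 0                 # inreg arguments start at s0 (from the text)
VGPR_START = 8                 # other arguments start at v8 (from the text)
SCRATCH_DESC_FIRST_SGPR = 48   # descriptor lives in s[48:51] (from the text)


@dataclass
class Arg:
    """Hypothetical argument description: a name, a size in dwords, inreg or not."""
    name: str
    dwords: int
    inreg: bool = False


def assign_registers(args, num_sgprs, num_vgprs, uses_scratch_descriptor):
    """Map each argument to (bank, first_reg, last_reg).

    Assumption: arguments are packed contiguously in declaration order,
    which is a simplification of what a real calling-convention lowering does.
    num_sgprs / num_vgprs stand in for the subtarget's register counts.
    """
    sgpr_limit = num_sgprs
    if uses_scratch_descriptor:
        # s[48:51] is taken, so inreg arguments may only use s0..s47.
        sgpr_limit = min(num_sgprs, SCRATCH_DESC_FIRST_SGPR)

    next_s, next_v = SGPR_START, VGPR_START
    assignment = {}
    for arg in args:
        if arg.inreg:
            first, next_s = next_s, next_s + arg.dwords
            if next_s > sgpr_limit:
                raise ValueError(f"{arg.name}: too many SGPR arguments")
            assignment[arg.name] = ("s", first, next_s - 1)
        else:
            first, next_v = next_v, next_v + arg.dwords
            if next_v > num_vgprs:
                raise ValueError(f"{arg.name}: too many VGPR arguments")
            assignment[arg.name] = ("v", first, next_v - 1)
    return assignment
```

With made-up register counts of 104 SGPRs and 256 VGPRs, an ``inreg`` argument of 48 dwords is accepted on a subtarget that uses the scratch buffer descriptor, while one of 49 dwords is rejected. The same 49-dword argument is accepted when `uses_scratch_descriptor` is false.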