[llvm] AMDGPU: Add description for amdgpu.no.access.location.types metadata (PR #85052)
Pierre-Andre Saulais via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 14 09:50:44 PDT 2024
================
@@ -1312,24 +1312,73 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
List AMDGPU intrinsics.
+.. _amdgpu_metadata:
+
LLVM IR Metadata
-------------------
+================
+
+The AMDGPU backend implements the following target custom LLVM IR
+metadata.
+
+.. _amdgpu_last_use:
+
+'``amdgpu.last.use``' Metadata
+------------------------------
+
+Sets TH_LOAD_LU temporal hint on load instructions that support it.
+Takes priority over nontemporal hint (TH_LOAD_NT). This takes no
+arguments.
+
+.. code-block:: llvm
+
+ %val = load i32, ptr %in, align 4, !amdgpu.last.use !{}
-The AMDGPU backend implements the following LLVM IR metadata.
+.. _amdgpu_no_access_location_types:
-.. list-table:: AMDGPU LLVM IR Metatdata
- :name: amdgpu-llvm-ir-metadata-table
+'``amdgpu.no.access.location.types``' Metadata
+----------------------------------------------
- * - Metadata Name
+Asserts a memory access does not access bytes residing in certain
+allocation kinds. This is intended for use with :ref:`atomicrmw
+<i_atomicrmw>` and other atomic instructions. This is required to emit
+a native hardware instruction for some :ref:`system scope
+<amdgpu-memory-scopes>` atomic operations on some subtargets. An
+:ref:`atomicrmw <i_atomicrmw>` without metadata will be treated
+conservatively as required to preserve the operation behavior in all
+cases.
+
+If the memory operation does access an address in an indicated region,
+any stored values and any returned results are :ref:`poison
+<poisonvalues>`. This has a single integer argument, interpreted as a
+bitfield. A 0 value is equivalent to removing the metadata.
+
+.. list-table::
+
+ * - Bit
- Description
- - Values
- * - !amdgpu.last.use
- - Sets TH_LOAD_LU temporal hint on load instructions that support it.
- Takes priority over nontemporal hint (TH_LOAD_NT).
- - {}
+ * - 0
+ - Not in fine-grained host memory.
+ * - 1
+ - Not in a remote connected peer device (address must be device local)
+
+.. code-block:: llvm
+
+ ; Indicates the access does not access fine-grained memory, or
+ ; remote device memory.
+ %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.access.location.types !0
+
+ ; Indicates the access does not access fine-grained memory.
+ %old1 = atomicrmw sub ptr %ptr1, i32 1 acquire, !amdgpu.no.access.location.types !1
+
+ ; Indicates the access does not access peer device memory.
+ %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.access.location.types !2
+
+ !0 = !{i32 3}
----------------
pasaulais wrote:
For clarity, I think it would help to add comments that spell out which bits each value sets, both in the documentation examples and in the lit tests. For example:
`!0 = !{i32 3} ; no_fine_grained_access | no_remote_access`
In general, I think it would be clearer to use strings in the MD nodes: someone reading the IR produced by the compiler would otherwise not know what the numbers mean without looking them up. I am aware, though, that strings would take away the simplicity of the current design.
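As a rough sketch of the comment-annotation suggestion applied to the whole documentation example: the quoted hunk stops at `!0`, so the `!1` and `!2` values below are inferred from the bit table and the example's own comments rather than taken from the patch. The flag names are illustrative only and are not defined anywhere in the patch, and the `@example` function wrapper is hypothetical, added just so the snippet parses on its own.

```llvm
; Illustrative names only (not defined by the patch):
;   bit 0 (value 1) = no_fine_grained_access
;   bit 1 (value 2) = no_remote_access
define i32 @example(ptr %ptr0, ptr %ptr1, ptr %ptr2) {
  ; Does not access fine-grained memory or remote device memory.
  %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.access.location.types !0

  ; Does not access fine-grained memory.
  %old1 = atomicrmw sub ptr %ptr1, i32 1 acquire, !amdgpu.no.access.location.types !1

  ; Does not access peer device memory.
  %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.access.location.types !2

  ; Combine the results so every value is used.
  %sum0 = add i32 %old0, %old1
  %sum1 = add i32 %sum0, %old2
  ret i32 %sum1
}

!0 = !{i32 3} ; no_fine_grained_access | no_remote_access
!1 = !{i32 1} ; no_fine_grained_access (assumed value: bit 0 only)
!2 = !{i32 2} ; no_remote_access (assumed value: bit 1 only)
```

With trailing comments like these, a reader can match each metadata node to the rows of the bit table without decoding the integer by hand.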
https://github.com/llvm/llvm-project/pull/85052