[llvm] AMDGPU: Add description for amdgpu.no.access.location.types metadata (PR #85052)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 15 08:05:09 PDT 2024


https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/85052

>From 2cc2dd648782dd43fe21969f887f28751a5591b3 Mon Sep 17 00:00:00 2001
From: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: Wed, 13 Mar 2024 14:19:33 +0530
Subject: [PATCH 1/4] AMDGPU: Don't use table for metadata docs, and fix
 section headers

I couldn't figure out how to nicely embed a table within a table column.
Copy the formatting that LangRef uses for metadata, and introduce a
metadata section with subsections for each item. Also fix headers that used
subsection markers in place of section markers, to avoid Sphinx errors.
---
 llvm/docs/AMDGPUUsage.rst | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index fd9ad7fac19a95..fe37e85c2a40a6 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1312,24 +1312,30 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
 
    List AMDGPU intrinsics.
 
+.. _amdgpu_metadata:
+
 LLVM IR Metadata
-------------------
+================
+
+The AMDGPU backend implements the following target custom LLVM IR
+metadata.
+
+.. _amdgpu_last_use:
 
-The AMDGPU backend implements the following LLVM IR metadata.
+'``amdgpu.last.use``' Metadata
+------------------------------
+
+Sets TH_LOAD_LU temporal hint on load instructions that support it.
+Takes priority over nontemporal hint (TH_LOAD_NT). This takes no
+arguments.
+
+.. code-block:: llvm
 
-.. list-table:: AMDGPU LLVM IR Metatdata
-  :name: amdgpu-llvm-ir-metadata-table
+  %val = load i32, ptr %in, align 4, !amdgpu.last.use !{}
 
-  * - Metadata Name
-    - Description
-    - Values
-  * - !amdgpu.last.use
-    - Sets TH_LOAD_LU temporal hint on load instructions that support it.
-      Takes priority over nontemporal hint (TH_LOAD_NT).
-    - {}
 
 LLVM IR Attributes
-------------------
+==================
 
 The AMDGPU backend supports the following LLVM IR attributes.
 
@@ -1451,7 +1457,7 @@ The AMDGPU backend supports the following LLVM IR attributes.
      ======================================= ==========================================================
 
 Calling Conventions
--------------------
+===================
 
 The AMDGPU backend supports the following calling conventions:
 

>From 29794bc1bdc50a7d06ce3a62ad95b4800f631650 Mon Sep 17 00:00:00 2001
From: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: Wed, 13 Mar 2024 13:08:48 +0530
Subject: [PATCH 2/4] AMDGPU: Add description for
 amdgpu.no.access.location.types metadata

Add a spec for yet-to-be-implemented metadata to allow the backend to
fully handle atomicrmw lowering. This is the base of an alternative
to #69229, which inverts the direction to be correct by default, and
extends to cover the peer device case.

Could use a better name
---
 llvm/docs/AMDGPUUsage.rst  | 43 ++++++++++++++++++++++++++++++++++++++
 llvm/docs/ReleaseNotes.rst |  2 ++
 2 files changed, 45 insertions(+)

diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index fe37e85c2a40a6..a6556bbd1752b9 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1333,6 +1333,49 @@ arguments.
 
   %val = load i32, ptr %in, align 4, !amdgpu.last.use !{}
 
+.. _amdgpu_no_access_location_types:
+
+'``amdgpu.no.access.location.types``' Metadata
+----------------------------------------------
+
+Asserts a memory access does not access bytes residing in certain
+allocation kinds. This is intended for use with :ref:`atomicrmw
+<i_atomicrmw>` and other atomic instructions. This is required to emit
+a native hardware instruction for some :ref:`system scope
+<amdgpu-memory-scopes>` atomic operations on some subtargets. An
+:ref:`atomicrmw <i_atomicrmw>` without metadata will be treated
+conservatively as required to preserve the operation behavior in all
+cases.
+
+If the memory operation does access an address in an indicated region,
+any stored values and any returned results are :ref:`poison
+<poisonvalues>`. This has a single integer argument, interpreted as a
+bitfield. A 0 value is equivalent to removing the metadata.
+
+.. list-table::
+
+  * - Bit
+    - Description
+  * - 0
+    - Not in fine-grained host memory.
+  * - 1
+    - Not in a remote connected peer device (address must be device local)
+
+.. code-block:: llvm
+
+  ; Indicates the access does not access fine-grained memory, or
+  ; remote device memory.
+  %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.access.location.types !0
+
+  ; Indicates the access does not access fine-grained memory.
+  %old1 = atomicrmw sub ptr %ptr1, i32 1 acquire, !amdgpu.no.access.location.types !1
+
+  ; Indicates the access does not access peer device memory.
+  %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.access.location.types !2
+
+  !0 = !{i32 3}
+  !1 = !{i32 1}
+  !2 = !{i32 2}
 
 LLVM IR Attributes
 ==================
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index b34a5f31c5eb0a..95ebbb74fbbd7f 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -71,6 +71,8 @@ Changes to the AMDGPU Backend
 -----------------------------
 
 * Implemented the ``llvm.get.fpenv`` and ``llvm.set.fpenv`` intrinsics.
+* Added ``!amdgpu.no.access.location.types`` metadata to control
+  atomic behavior.
 
 Changes to the ARM Backend
 --------------------------

>From 61553035b313eeb37681aa16d93eb008269f5734 Mon Sep 17 00:00:00 2001
From: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: Mon, 15 Apr 2024 12:36:19 +0200
Subject: [PATCH 3/4] Add comments to metadata examples

---
 llvm/docs/AMDGPUUsage.rst | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 997d9d71e0ce82..0375812ec63ca1 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1373,9 +1373,10 @@ bitfield. A 0 value is equivalent to removing the metadata.
   ; Indicates the access does not access peer device memory.
   %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.access.location.types !2
 
-  !0 = !{i32 3}
-  !1 = !{i32 1}
-  !2 = !{i32 2}
+  !0 = !{i32 3} ; no_fine_grained_memory_access | no_remote_memory_access
+  !1 = !{i32 1} ; no_fine_grained_memory_access
+  !2 = !{i32 2} ; no_remote_memory_access
+
 
 LLVM IR Attributes
 ==================

>From 4c5c29faf0f5320e0df4b96c374b3d9f706678ec Mon Sep 17 00:00:00 2001
From: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: Mon, 15 Apr 2024 16:53:00 +0200
Subject: [PATCH 4/4] Split into separate metadata components

---
 llvm/docs/AMDGPUUsage.rst | 66 ++++++++++++++++++++++-----------------
 1 file changed, 37 insertions(+), 29 deletions(-)

diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 0375812ec63ca1..191948a2ce6b66 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1333,50 +1333,58 @@ arguments.
 
   %val = load i32, ptr %in, align 4, !amdgpu.last.use !{}
 
-.. _amdgpu_no_access_location_types:
+.. _amdgpu_no_fine_grained_host_memory:
 
-'``amdgpu.no.access.location.types``' Metadata
-----------------------------------------------
+'``amdgpu.no.fine.grained.host.memory``' Metadata
+-------------------------------------------------
 
-Asserts a memory access does not access bytes residing in certain
-allocation kinds. This is intended for use with :ref:`atomicrmw
-<i_atomicrmw>` and other atomic instructions. This is required to emit
-a native hardware instruction for some :ref:`system scope
-<amdgpu-memory-scopes>` atomic operations on some subtargets. An
+Asserts a memory access does not access bytes allocated in
+fine-grained host memory. This is intended for use with
+:ref:`atomicrmw <i_atomicrmw>` and other atomic instructions. This is
+required to emit a native hardware instruction for some :ref:`system
+scope <amdgpu-memory-scopes>` atomic operations on some subtargets. An
 :ref:`atomicrmw <i_atomicrmw>` without metadata will be treated
 conservatively as required to preserve the operation behavior in all
-cases.
+cases. This will typically be used in conjunction with
+:ref:`!amdgpu.no.remote.memory.access<amdgpu_no_remote_memory_access>`.
 
-If the memory operation does access an address in an indicated region,
-any stored values and any returned results are :ref:`poison
-<poisonvalues>`. This has a single integer argument, interpreted as a
-bitfield. A 0 value is equivalent to removing the metadata.
+.. code-block:: llvm
+
+  ; Indicates the access does not access fine-grained memory, or
+  ; remote device memory.
+  %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.fine.grained.host.memory !0, !amdgpu.no.remote.memory.access !0
+
+  ; Indicates the access does not access fine-grained host memory.
+  %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.fine.grained.host.memory !0
 
-.. list-table::
+  !0 = !{}
+
+.. _amdgpu_no_remote_memory_access:
+
+'``amdgpu.no.remote.memory.access``' Metadata
+---------------------------------------------
+
+Asserts a memory access does not access bytes in remotely connected peer
+device memory (the address must be device local). This is
+intended for use with :ref:`atomicrmw <i_atomicrmw>` and other atomic
+instructions. This is required to emit a native hardware instruction
+for some :ref:`system scope <amdgpu-memory-scopes>` atomic operations
+on some subtargets. An :ref:`atomicrmw <i_atomicrmw>` without metadata
+will be treated conservatively as required to preserve the operation
+behavior in all cases. This will typically be used in conjunction with
+:ref:`!amdgpu.no.fine.grained.host.memory<amdgpu_no_fine_grained_host_memory>`.
 
-  * - Bit
-    - Description
-  * - 0
-    - Not in fine-grained host memory.
-  * - 1
-    - Not in a remote connected peer device (address must be device local)
 
 .. code-block:: llvm
 
   ; Indicates the access does not access fine-grained memory, or
   ; remote device memory.
-  %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.access.location.types !0
-
-  ; Indicates the access does not access fine-grained memory.
-  %old1 = atomicrmw sub ptr %ptr1, i32 1 acquire, !amdgpu.no.access.location.types !1
+  %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.fine.grained.host.memory !0, !amdgpu.no.remote.memory.access !0
 
   ; Indicates the access does not access peer device memory.
-  %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.access.location.types !2
-
-  !0 = !{i32 3} ; no_fine_grained_memory_access | no_remote_memory_access
-  !1 = !{i32 1} ; no_fine_grained_memory_access
-  !2 = !{i32 2} ; no_remote_memory_access
+  %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.remote.memory.access !0
 
+  !0 = !{}
 
 LLVM IR Attributes
 ==================
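
For readers comparing the revisions: PATCH 2/4 encodes both assertions in
one integer bitfield, while PATCH 4/4 splits them into two separate
metadata kinds that take an empty node. The sketch below shows how the
old bitfield values map onto the new names, using the bit assignments
from the PATCH 2/4 table. It is illustration only: `split_location_types`
and `BIT_TO_METADATA` are hypothetical names, not part of LLVM or of this
patch series.

```python
from typing import List

# Bit assignments taken from the PATCH 2/4 table; metadata names taken
# from PATCH 4/4. This mapping is an illustration, not an LLVM API.
BIT_TO_METADATA = {
    0: "amdgpu.no.fine.grained.host.memory",  # bit 0: not in fine-grained host memory
    1: "amdgpu.no.remote.memory.access",      # bit 1: not in remote peer device memory
}


def split_location_types(bitfield: int) -> List[str]:
    """Return the PATCH 4/4 metadata names implied by a PATCH 2/4 bitfield."""
    if bitfield < 0:
        raise ValueError("bitfield must be non-negative")
    # Reject bits that the PATCH 2/4 table does not define.
    if bitfield >> len(BIT_TO_METADATA):
        raise ValueError("bitfield sets undefined bits: %d" % bitfield)
    return [
        name
        for bit, name in sorted(BIT_TO_METADATA.items())
        if bitfield & (1 << bit)
    ]


if __name__ == "__main__":
    for value in (0, 1, 2, 3):
        print(value, split_location_types(value))
```

A value of 0 yields an empty list, matching the PATCH 2/4 statement that
a 0 value is equivalent to removing the metadata.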


