[llvm-branch-commits] [llvm] [AMDGPU] Add amdgcn.av.global.(load|store).b128 intrinsics (PR #191390)
Shilei Tian via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Sun Apr 12 11:49:43 PDT 2026
================
@@ -1775,6 +1775,111 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
* :ref:`Synchronization Scope<amdgpu-intrinsics-syncscope-metadata-operand>`.
Note that the scope used must ensure that the L2 cache will be hit.
+ llvm.amdgcn.av.global.load.b128 This intrinsic is supported on gfx9, gfx10, gfx11, and gfx12 targets.
+
+ Signature:
+
+ .. code-block:: llvm
+
+ <4 x i32> @llvm.amdgcn.av.global.load.b128(
+ ptr addrspace(1), ; source
+ metadata) ; scope - e.g. '!0' where '!0 = !{!"wavegroup"}'
+
+ Reads the value from the source address with cache behavior specified by the scope.
+
+ The following table shows the mapping between valid scope values and target
+ instruction flags or field values.
+
+ ============== ========================== ========================== ========================== ========================== ==========================
+ targets instruction ``"wavefront"`` ``"workgroup"`` ``"agent"`` ``""`` (empty string)
+ ============== ========================== ========================== ========================== ========================== ==========================
+ gfx90* ``global_load_dwordx4`` ``glc`` ``glc``
+
+ gfx942, gfx950 ``global_load_dwordx4`` (wave) ``sc0`` (group) ``sc1`` (device) ``sc0 sc1`` (system)
+
+ gfx10* ``global_load_dwordx4`` ``glc`` ``glc dlc`` ``glc dlc``
+
+ gfx11* ``global_load_dwordx4`` ``glc`` ``glc`` ``glc``
+
+ gfx120* ``av_global_load_b128`` (CU) ``scope:SCOPE_SE`` (SE) ``scope:SCOPE_DEV`` (DEV) ``scope:SCOPE_SYS`` (SYS)
----------------
shiltian wrote:
The alignment here and below for `(CU)` might be off.
https://github.com/llvm/llvm-project/pull/191390