[llvm] [AMDGPU] Update AMDGPUUsage.rst to document two intrinsics (PR #123816)

Fri Jan 24 12:00:49 PST 2025

https://github.com/jwanggit86 updated https://github.com/llvm/llvm-project/pull/123816

From 77b26395989269ef10807ec5d85ad97e9f79f75d Mon Sep 17 00:00:00 2001
From: Jun Wang <jwang86 at yahoo.com>
Date: Tue, 21 Jan 2025 12:48:49 -0800
Subject: [PATCH 1/3] [AMDGPU] Update AMDGPUUsage.rst to document two
 intrinsics

The AMDGPUUsage.rst file is updated to document two intrinsics:
llvm.amdgcn.mov.dpp and llvm.amdgcn.update.dpp.
---
 llvm/docs/AMDGPUUsage.rst | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 40b393224f15dd..132a7444805620 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1422,6 +1422,18 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
                                                    Returns a pair for the swapped registers. The first element of the return
                                                    corresponds to the swapped element of the first argument.
 
+  llvm.amdgcn.mov.dpp                              The llvm.amdgcn.mov.dpp.i32 intrinsic represents the mov.dpp operation in AMDGPU.
+                                                   This operation is being deprecated and can be replaced with llvm.amdgcn.update.dpp.
+
+  llvm.amdgcn.update.dpp                           The llvm.amdgcn.update.dpp intrinsic represents the update.dpp operation in AMDGPU.
+                                                   It takes an old value, a source operand, a DPP control operand, a row mask, a bank mask, and a bound control.
+                                                   This operation is equivalent to a sequence of v_mov_b32 operations.
+                                                   It is preferred over llvm.amdgcn.mov.dpp.i32 for future use.
+                                                   `llvm.amdgcn.update.dpp.i32 <old> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>`
+                                                   Should be equivalent to:
+                                                   - `v_mov_b32 <dest> <old>`
+                                                   - `v_mov_b32 <dest> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>`
+
   ==============================================   ==========================================================
 
 .. TODO::

From 507e04b3bf0f27a11c964c8eaadc9bca92c92f5f Mon Sep 17 00:00:00 2001
From: Jun Wang <jwang86 at yahoo.com>
Date: Thu, 23 Jan 2025 09:40:08 -0800
Subject: [PATCH 2/3] Mention supported data types.

---
 llvm/docs/AMDGPUUsage.rst | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 132a7444805620..8f09df2406f107 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1422,14 +1422,15 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
                                                    Returns a pair for the swapped registers. The first element of the return
                                                    corresponds to the swapped element of the first argument.
 
-  llvm.amdgcn.mov.dpp                              The llvm.amdgcn.mov.dpp.i32 intrinsic represents the mov.dpp operation in AMDGPU.
+  llvm.amdgcn.mov.dpp                              The llvm.amdgcn.mov.dpp.`<type>` intrinsic represents the mov.dpp operation in AMDGPU.
                                                    This operation is being deprecated and can be replaced with llvm.amdgcn.update.dpp.
 
-  llvm.amdgcn.update.dpp                           The llvm.amdgcn.update.dpp intrinsic represents the update.dpp operation in AMDGPU.
+  llvm.amdgcn.update.dpp                           The llvm.amdgcn.update.dpp.`<type>` intrinsic represents the update.dpp operation in AMDGPU.
                                                    It takes an old value, a source operand, a DPP control operand, a row mask, a bank mask, and a bound control.
+                                                   Various data types are supported, including bf16, f16, f32, f64, i16, i32, i64, p0, p3, p5, v2f16, v2f32, v2i16, v2i32, v2p0, v3i32, v4i32, v8f16.
                                                    This operation is equivalent to a sequence of v_mov_b32 operations.
-                                                   It is preferred over llvm.amdgcn.mov.dpp.i32 for future use.
-                                                   `llvm.amdgcn.update.dpp.i32 <old> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>`
+                                                   It is preferred over llvm.amdgcn.mov.dpp.`<type>` for future use.
+                                                   `llvm.amdgcn.update.dpp.<type> <old> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>`
                                                    Should be equivalent to:
                                                    - `v_mov_b32 <dest> <old>`
                                                    - `v_mov_b32 <dest> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>`

From 60ec50171f24c4fef7dedb9d19d03f9d8d5c5fd6 Mon Sep 17 00:00:00 2001
From: Jun Wang <jwang86 at yahoo.com>
Date: Fri, 24 Jan 2025 11:59:49 -0800
Subject: [PATCH 3/3] Update comments in IntrinsicsAMDGPU.td as well.

---
 llvm/include/llvm/IR/IntrinsicsAMDGPU.td | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
index b529642a558710..8ef5d78ca2d97f 100644
--- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
@@ -2563,9 +2563,9 @@ def int_amdgcn_buffer_wbinvl1_vol :
 // VI Intrinsics
 //===----------------------------------------------------------------------===//
 
-// The llvm.amdgcn.mov.dpp.i32 intrinsic represents the mov.dpp operation in AMDGPU.
-// This operation is being deprecated and can be replaced with llvm.amdgcn.update.dpp.i32.
-// llvm.amdgcn.mov.dpp.i32 <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>
+// The llvm.amdgcn.mov.dpp.<type> intrinsic represents the mov.dpp operation in AMDGPU.
+// This operation is being deprecated and can be replaced with llvm.amdgcn.update.dpp.<type>.
+// llvm.amdgcn.mov.dpp.<type> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>
 def int_amdgcn_mov_dpp :
   Intrinsic<[llvm_anyint_ty],
             [LLVMMatchType<0>, llvm_i32_ty, llvm_i32_ty, llvm_i32_ty,
@@ -2574,11 +2574,13 @@ def int_amdgcn_mov_dpp :
              ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>,
              ImmArg<ArgIndex<3>>, ImmArg<ArgIndex<4>>, IntrNoCallback, IntrNoFree]>;
 
-// The llvm.amdgcn.update.dpp.i32 intrinsic represents the update.dpp operation in AMDGPU.
+// The llvm.amdgcn.update.dpp.<type> intrinsic represents the update.dpp operation in AMDGPU.
 // It takes an old value, a source operand, a DPP control operand, a row mask, a bank mask, and a bound control.
+// Various data types are supported, including bf16, f16, f32, f64, i16, i32, i64, p0, p3, p5, v2f16, v2f32,
+// v2i16, v2i32, v2p0, v3i32, v4i32, v8f16.
 // This operation is equivalent to a sequence of v_mov_b32 operations.
-// It is preferred over llvm.amdgcn.mov.dpp.i32 for future use.
-// llvm.amdgcn.update.dpp.i32 <old> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>
+// It is preferred over llvm.amdgcn.mov.dpp.<type> for future use.
+// llvm.amdgcn.update.dpp.<type> <old> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>
 // Should be equivalent to:
 // v_mov_b32 <dest> <old>
 // v_mov_b32 <dest> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>
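
The two-move equivalence described in the comment above can be illustrated with a small lane-level model. This is only a sketch under stated assumptions, not the backend's definition: it assumes 16-lane rows split into four 4-lane banks, handles only a row_shr-style shift (passed as a plain integer rather than an encoded dpp_ctrl value), treats a set bit in row_mask or bank_mask as "lane may be written", and ignores the EXEC mask. The function name `update_dpp_row_shr` is hypothetical.

```python
def update_dpp_row_shr(old, src, shift, row_mask, bank_mask, bound_ctrl):
    """Simplified model of update.dpp for a row shift-right by `shift` lanes.

    old, src: per-lane values (lists of equal length).
    Assumptions (not from the patch): rows are 16 lanes, banks are 4 lanes,
    a set mask bit enables writes, and all lanes are active.
    """
    # First move: v_mov_b32 <dest> <old>
    dest = list(old)
    # Second move: v_mov_b32 <dest> <src> with DPP controls applied per lane
    for lane in range(len(src)):
        row, pos = divmod(lane, 16)
        bank = pos // 4
        row_enabled = (row_mask >> row) & 1
        bank_enabled = (bank_mask >> bank) & 1
        if not (row_enabled and bank_enabled):
            continue  # masked-off lane keeps the old value
        if pos >= shift:
            dest[lane] = src[lane - shift]  # source lane is inside the row
        elif bound_ctrl:
            dest[lane] = 0  # out-of-row source reads as zero
        # otherwise the lane keeps the old value
    return dest


old = [100 + i for i in range(32)]
src = list(range(32))
result = update_dpp_row_shr(old, src, shift=1, row_mask=0xF,
                            bank_mask=0xF, bound_ctrl=False)
print(result[:3], result[16:18])
```

In this model the first lane of each row has no in-row source, so it keeps `old` unless `bound_ctrl` is set, which is the case where passing an explicit old value matters compared with llvm.amdgcn.mov.dpp.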