[Mlir-commits] [mlir] [mlir][amdgpu] Revise AMDGPU dialect DPP documentation (PR #182639)
Eric Feng
llvmlistbot at llvm.org
Fri Feb 20 18:04:53 PST 2026
================
@@ -663,20 +663,93 @@ def AMDGPU_DPPOp : AMDGPU_Op<"dpp",
DefaultValuedAttr<BoolAttr, "false">:$bound_ctrl)> {
let summary = "AMDGPU DPP operation";
let description = [{
- This operation represents DPP functionality in a GPU program.
- DPP provides the following operations:
- - Full crossbar in a group of four (`quad_perm`)
- - Wavefront shift left by one lane (`wave_shl`)
- - Wavefront shift right by one lane (`wave_shr`)
- - Wavefront rotate right by one lane (`wave_ror`)
- - Wavefront rotate left by one lane (`wave_rol`)
- - Row shift left by 1–15 lanes (`row_shl`)
- - Row shift right by 1–15 lanes (`row_shr`)
- - Row rotate right by 1–15 lanes (`row_ror`)
- - Reverse within a row (`row_mirror`)
- - Reverse within a half-row (`row_half_mirror`)
- - Broadcast the 15th lane of each row to the next row (`row_bcast`)
- - Broadcast lane 31 to rows 2 and 3 (`row_bcast`)
+ The `amdgpu.dpp` op performs a Data Parallel Primitives (DPP) lane
+ permutation on a source value within a wavefront. Each lane reads its
+ source data from another lane according to the permutation mode specified
+ by `kind`. DPP operates at dword (32-bit) granularity: sub-32-bit types
+ (e.g., f16, i16) are packed into an i32 during lowering, permuted, and
+ extracted back.
+
+ A Wave64 wavefront has 64 lanes (0-63) organized hierarchically:
----------------
efric wrote:
Thanks, should be addressed.
https://github.com/llvm/llvm-project/pull/182639
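
For readers of this thread, the lane-permutation idea in the quoted description ("each lane reads its source data from another lane") can be sketched as a small host-side simulation. This is only an illustrative model, not the dialect's lowering or the hardware behaviour: it covers just `quad_perm`, `row_mirror`, and `row_half_mirror`, ignores the row/bank masks, `bound_ctrl`, and the `old` operand, and assumes 16-lane rows and 8-lane half-rows as implied by the list in the diff. All function names here are hypothetical.

```python
# Illustrative model of a few DPP permutation kinds on a Wave64 wavefront.
# Assumptions (not taken from the PR): a row is 16 lanes, a half-row is
# 8 lanes, and masks / bound_ctrl / old-value handling are ignored.

WAVE_SIZE = 64
ROW_SIZE = 16
QUAD_SIZE = 4


def quad_perm(src, perm):
    """Lane i reads from lane perm[i % 4] within its own group of four."""
    assert len(src) == WAVE_SIZE and len(perm) == QUAD_SIZE
    assert all(0 <= p < QUAD_SIZE for p in perm)
    return [
        src[(i // QUAD_SIZE) * QUAD_SIZE + perm[i % QUAD_SIZE]]
        for i in range(WAVE_SIZE)
    ]


def _mirror(src, group):
    """Reverse the lane order inside each consecutive group of lanes."""
    assert len(src) == WAVE_SIZE and WAVE_SIZE % group == 0
    return [
        src[(i // group) * group + (group - 1 - i % group)]
        for i in range(WAVE_SIZE)
    ]


def row_mirror(src):
    """Reverse within each 16-lane row."""
    return _mirror(src, ROW_SIZE)


def row_half_mirror(src):
    """Reverse within each 8-lane half-row."""
    return _mirror(src, ROW_SIZE // 2)


# Use the lane index as the lane's value so the data movement is visible.
lanes = list(range(WAVE_SIZE))
swapped_pairs = quad_perm(lanes, [1, 0, 3, 2])
print(swapped_pairs[:8])
print(row_mirror(lanes)[:16])
print(row_half_mirror(lanes)[:16])
```

Seeding each lane with its own index makes the result of each call read directly as "which lane did I read from".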