[Mlir-commits] [mlir] [MLIR][XeGPU] Extend SGMapAttr and Add ConvertLayoutOp (PR #132425)
Adam Siemieniuk
llvmlistbot at llvm.org
Thu Apr 3 04:59:57 PDT 2025
================
@@ -154,33 +154,111 @@ def XeGPU_FenceScopeAttr:
let assemblyFormat = "$value";
}
-def XeGPU_SGMapAttr : XeGPUAttr<"SGMap", "sg_map"> {
+def XeGPU_LayoutAttr : XeGPUAttr<"Layout", "layout"> {
let summary = [{
- Describes the mapping between work item (WI) and the 2D tensor specified by the tensor descriptor.
+ Describes the data distribution to subgroups and work-items for a tensor
+ specified by the tensor descriptor.
}];
let description = [{
- To distribute the XeGPU operation to work items, the tensor_desc must be specified with the sg_map
- attribute at the tensor description creation time.
- Within the `sg_map`, `wi_layout` specifies the layout of work items,
- describing the mapping of work items to the tensor.
- wi_layout[0] x wi_layout[1] must be equal to the total number of work items within a subgroup.
- `wi_data` specifies the minimum number of data elements assigned to each work item for a single distribution.
-
- E.g., #xegpu.sg_map<wi_layout = [1, 16], wi_data = [1, 1]>
- In this example, the subgroup has 16 work items in wi_layout=[1, 16],
- each accessing 1 element as specified by wi_data=[1, 1].
-
- `wi_data[0] * wi_data[1]` can be greater than 1, meaning that each work item operates on multiple elements,
- which is eventually lowered to "SIMT-flavor" vector, like SPIR-V vector or llvm vector, or packed to a storage data type.
- The multiple elements indicated by `wi_data` can only be from one dimension and must be contiguous in the memory along either dimension.
+ XeGPU operations use `LayoutAttr` to define how data is distributed across subgroups and work-items.
+ This attribute is specified in tensor descriptors during tensor description creation. `LayoutAttr`
+ includes the following parameters:
+
+ * `sg_layout`: Specifies the total number of subgroups and their layout within a workgroup.
+ It is mandatory for workgroup-level programming and optional for subgroup programming. Its
+ presence implies workgroup-level code.
+ * `sg_data`: Defines the data size accessed per subgroup. It is optionally used with `sg_layout`
+ for workgroup-level programming. When it is left empty, the size accessed per subgroup can be
+ derived from the tensor shape and `sg_layout` using the formula:
+ `sg_data[i] = tensor_shape[i] / sg_layout[i]`.
+ * `inst_data`: Specifies the data size that is processed by an instruction. It is optionally
+ used with lane_layout. When it is left empty, the data size per instruction is equivalent to
+ the sg_data for workgroup-level programming or equivalent to tensor shape for subgroup-level
+ programming.
+ * `lane_layout` : Specifies the total number of work-items and their arrangement within a subgroup.
+ It is mandatory for subgroup-level programming and optional for workgroup-level programming.
+ * `lane_data` : Specifies the shape of the tensor fragment that each lane accesses. It defines a single,
+ minimal distribution unit. Processing the entire tensor may require one or more distribution units per
+ hardware instruction.
+ * `order`: Specifies the dimension order used to linearize n-dimensional sg_layout and lane_layout to
+ 1-dimensional layout. The first dimension in the order list is the fastest-changing dimension. If it
+ is not present, the default value is [1, 0].
+
+ ### Examples:
+ 1. Subgroup level layout:
+ ```mlir
+ #xegpu.layout<lane_layout = [2, 8], lane_data = [1, 1]>
+ ```
+ In this example, there are 16 work-items per subgroup, and is organized as
+ [[0, 1, 2, .., 7],[8, 9, .., 15]]. The distribution unit is 1x1.
+
+ 2. Subgroup level layout with order:
+ ```mlir
+ #xegpu.layout<lane_layout = [2, 8], lane_data = [1, 1], order = [0, 1]>
+ ```
+ In this example, there are 16 work-items per subgroup, and is organized as
+ [[0, 2, 4, ..., 14], [1, 3, 5, ..., 15]]. The distribution unit is 1x1.
+
+ 3. Workgroup level layout:
+ ```mlir
+ #xegpu.layout<sg_layout = [2, 4], sg_data = [16, 16], lane_layout = [2, 8], lane_data = [1, 1]>
+ ```
+ In this example, the layout represents a workgroup distribution. A workgroup consists of 8 subgroups
+ arranged as [[0, 1, 2, 3], [4, 5, 6, 7]]. Each subgroup accesses a 16x16 block per instruction, which
+ is further distributed to 16 work items which is organized as [[0, 1, 2, .., 7],[8, 9, .., 15]].
+
+ 4. Workgroup level layout with order:
+ ```mlir
+ #xegpu.layout<sg_layout = [2, 4], sg_data = [16, 16], lane_layout = [2, 8], lane_data = [1, 1], order = [0, 1]>
+ ```
+ In this example, the layout represents a workgroup distribution. A workgroup consists of 8 subgroups
+ arranged as [[0, 2, 4, 6], [1, 3, 5, 7]]. Each subgroup accesses a 16x16 block per instruction, which
+ is further distributed to 16 work items which is organized as [[0, 2, 4, ..., 14], [1, 3, 5, ..., 15]].
+
}];
+
let parameters = (ins
- ArrayRefParameter<"uint32_t">:$wi_layout,
- ArrayRefParameter<"uint32_t">:$wi_data
+ OptionalParameter<"DenseI32ArrayAttr">: $sg_layout,
+ OptionalParameter<"DenseI32ArrayAttr">: $sg_data,
+ OptionalParameter<"DenseI32ArrayAttr">: $inst_data,
+ OptionalParameter<"DenseI32ArrayAttr">: $lane_layout,
+ OptionalParameter<"DenseI32ArrayAttr">: $lane_data,
+ OptionalParameter<"DenseI32ArrayAttr">: $order
);
+ let builders = [
+ AttrBuilder<(ins "llvm::ArrayRef<int>": $lane_layout,
+ "llvm::ArrayRef<int>": $lane_data),
+ [{
+ auto sg_layout = DenseI32ArrayAttr();
+ auto sg_data = DenseI32ArrayAttr();
+ auto inst_data = DenseI32ArrayAttr();
+ auto order = DenseI32ArrayAttr();
+ return $_get($_ctxt, sg_layout, sg_data, inst_data,
+ DenseI32ArrayAttr::get($_ctxt, lane_layout),
+ DenseI32ArrayAttr::get($_ctxt, lane_data), order);
+ }]>
+ ];
+
+ let extraClassDeclaration = [{
+ bool isWgLayout() {
+ return getSgLayout() != nullptr;
----------------
adam-smnk wrote:
I think it's more common and generic to check that the attribute is not null.
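
As a side note on the `order` semantics documented in the hunk above, here is a minimal standalone sketch of how an n-D `sg_layout` or `lane_layout` could be linearized. It assumes `order` lists dimensions from fastest-changing to slowest-changing, as the description states. `linearId` and `idGrid` are hypothetical helper names used only for illustration. They are not part of the XeGPU dialect or of this patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not an XeGPU API): returns the linear id of the
// subgroup or lane located at `coord` inside an n-D `layout`. `order` lists
// the dimensions from fastest-changing to slowest-changing.
int linearId(const std::vector<int> &coord, const std::vector<int> &layout,
             const std::vector<int> &order) {
  assert(coord.size() == layout.size() && order.size() == layout.size());
  int id = 0;
  int stride = 1;
  for (int dim : order) {
    id += coord[dim] * stride; // contribution of this dimension
    stride *= layout[dim];     // the next dimension changes more slowly
  }
  return id;
}

// Hypothetical helper: builds the grid of linear ids for a 2-D layout,
// indexed as grid[row][column].
std::vector<std::vector<int>> idGrid(const std::vector<int> &layout,
                                     const std::vector<int> &order) {
  std::vector<std::vector<int>> grid(layout[0], std::vector<int>(layout[1]));
  for (int i = 0; i < layout[0]; ++i)
    for (int j = 0; j < layout[1]; ++j)
      grid[i][j] = linearId({i, j}, layout, order);
  return grid;
}
```

Under these assumptions, `idGrid({2, 8}, {1, 0})` reproduces the default arrangement from example 1, and `idGrid({2, 8}, {0, 1})` reproduces the interleaved arrangement from example 2.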
https://github.com/llvm/llvm-project/pull/132425