[all-commits] [llvm/llvm-project] 9617b2: [MLIR][XeGPU] Support partial subgroup lane distri...

Sang Ik Lee via All-commits all-commits at lists.llvm.org
Wed Jun 10 17:33:33 PDT 2026


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 9617b2af712286cdfc01c11d9054cf8656ff42e7
      https://github.com/llvm/llvm-project/commit/9617b2af712286cdfc01c11d9054cf8656ff42e7
  Author: Sang Ik Lee <sang.ik.lee at intel.com>
  Date:   2026-06-10 (Wed, 10 Jun 2026)

  Changed paths:
    M mlir/lib/Dialect/XeGPU/Transforms/XeGPUSgToLaneDistribute.cpp
    M mlir/test/Dialect/XeGPU/sg-to-lane-distribute-unit.mlir

  Log Message:
  -----------
  [MLIR][XeGPU] Support partial subgroup lane distribution for convert_layout (#201667)

Add lowering support in XeGPUSgToLaneDistribute for values that are
distributed across only a fraction of the subgroup.

- SgToLaneConvertLayout now lowers a rank-2 xegpu.convert_layout that
  shrinks the lane layout along the outer (distributed) dimension while
  keeping lane_data unchanged (e.g. lane_layout [16, 1] -> [8, 1]). The
  partial-subgroup case is detected directly in the pattern: equal order,
  rank 2, unit inner lane layout, and a genuinely distributed outer lane
  layout (> 1, which also rules out the degenerate [1, 1] layout). Because
  the data is no longer replicated in every lane, it is gathered across
  lanes, and each lane's share of the distributed outer dimension doubles
  when the lane count is halved.

- The cross-lane gather is factored into a dedicated helper,
  shuffleDataAsLaneLayoutChange(): it bitcasts the source to i32, issues
  gpu.shuffle in up mode to fetch the values from the dropped lanes, and
  concatenates the lane-local and gathered data with vector.shuffle. Only
  three cases are supported: halving the lane count (a factor of two),
  rank-2 vectors, and bit widths that are a multiple of 32. Anything else
  fails the match.

- SgToLaneVectorExtractStridedSlice now adjusts the effective subgroup size
  when the source lane layout along the distributed dimension is smaller than
  the hardware subgroup size, so slice offsets/sizes are scaled correctly
  (e.g. a subgroup-space offset of 8 maps to a distributed offset of 1).
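
The checks and arithmetic described in the list above can be sketched in a
few lines of Python. This is only an illustrative model, not the code in
XeGPUSgToLaneDistribute.cpp: the Layout class and the three function names
are hypothetical, the default order value is a placeholder, and applying
the unit-inner and outer > 1 checks to both the source and the target
layout is an assumption. The cross-lane gather itself and the
multiple-of-32 bit-width restriction are not modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Layout:
    """Hypothetical stand-in for an XeGPU layout attribute."""
    lane_layout: Tuple[int, ...]
    lane_data: Tuple[int, ...]
    order: Tuple[int, ...] = (1, 0)  # placeholder default, an assumption


def is_partial_subgroup_convert(src: Layout, dst: Layout) -> bool:
    """Approximate the match conditions listed in the log message."""
    if src.order != dst.order:
        return False
    if len(src.lane_layout) != 2 or len(dst.lane_layout) != 2:
        return False
    if src.lane_data != dst.lane_data:
        return False
    # Unit inner lane layout (assumed to be required of both layouts).
    if src.lane_layout[1] != 1 or dst.lane_layout[1] != 1:
        return False
    # Outer dimension must be genuinely distributed; this rejects [1, 1].
    if src.lane_layout[0] <= 1 or dst.lane_layout[0] <= 1:
        return False
    # Only halving the lane count is supported.
    return src.lane_layout[0] == 2 * dst.lane_layout[0]


def per_lane_shape(sg_shape: Tuple[int, ...], layout: Layout) -> Tuple[int, ...]:
    """Per-lane shape, assuming every dimension divides evenly."""
    assert all(s % l == 0 for s, l in zip(sg_shape, layout.lane_layout))
    return tuple(s // l for s, l in zip(sg_shape, layout.lane_layout))


def distributed_offset(sg_offset: int, lanes_along_dim: int) -> int:
    """Scale a subgroup-space slice offset by the lanes used along that dim."""
    assert sg_offset % lanes_along_dim == 0
    return sg_offset // lanes_along_dim


src = Layout(lane_layout=(16, 1), lane_data=(1, 1))
dst = Layout(lane_layout=(8, 1), lane_data=(1, 1))
print(is_partial_subgroup_convert(src, dst))
print(per_lane_shape((16, 1), src), per_lane_shape((16, 1), dst))
print(distributed_offset(8, dst.lane_layout[0]))
```

With a [16, 1] subgroup-level shape, the per-lane outer extent goes from 1
to 2 when the lane layout shrinks from [16, 1] to [8, 1], and a
subgroup-space offset of 8 over 8 lanes becomes a distributed offset of 1,
matching the example in the log message.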

Add a unit test exercising the dpas_mx scale operand path.


