[all-commits] [llvm/llvm-project] 905f1d: [mlir][AMDGPU] Implement gpu.subgroup_reduce with ...

Wed Apr 23 17:37:55 PDT 2025

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 905f1d8068a5bc1149732b46afc3f5dd780aa5d9
      https://github.com/llvm/llvm-project/commit/905f1d8068a5bc1149732b46afc3f5dd780aa5d9
  Author: Muzammil <55665739+Muzammiluddin-Syed-ECE at users.noreply.github.com>
  Date:   2025-04-23 (Wed, 23 Apr 2025)

  Changed paths:
    M mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
    M mlir/lib/Dialect/GPU/Transforms/SubgroupReduceLowering.cpp
    M mlir/test/Dialect/GPU/subgroup-reduce-lowering.mlir
    M mlir/test/lib/Dialect/GPU/TestGpuRewrite.cpp

  Log Message:
  -----------
  [mlir][AMDGPU] Implement gpu.subgroup_reduce with DPP intrinsics on AMD GPUs (#133204)

When performing cross-lane reductions using subgroup_reduce ops across
contiguous lanes on AMD GPUs, lower to Data Parallel Primitives (DPP)
ops when possible. This reduces latency on applicable devices.
See the related [issue](https://github.com/iree-org/iree/issues/20007).

To do:
- Improve lowering to subgroup_reduce in compatible matvecs (these get
directly lowered to gpu.shuffles in an earlier pass)
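
As a rough illustration of what a cross-lane reduction over contiguous lanes computes, here is a minimal host-side sketch in C++. It simulates the lanes of a subgroup with a vector and combines them in log2(N) exchange steps. The function name `simulateSubgroupReduceAdd` is hypothetical, the butterfly (XOR) exchange pattern is an assumption chosen for clarity, and nothing here uses DPP intrinsics or mirrors the code in SubgroupReduceLowering.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side simulation only: each element stands in for one lane's value.
// Hypothetical helper for illustration; not part of MLIR or the AMDGPU backend.
std::vector<int> simulateSubgroupReduceAdd(std::vector<int> lanes) {
  const std::size_t n = lanes.size();
  // The butterfly pattern used here assumes a power-of-two subgroup size.
  assert(n != 0 && (n & (n - 1)) == 0 && "subgroup size must be a power of two");

  // One exchange step per power-of-two offset: 1, 2, 4, ...
  for (std::size_t offset = 1; offset < n; offset *= 2) {
    std::vector<int> next(n);
    for (std::size_t lane = 0; lane < n; ++lane) {
      // Each lane adds the value held by the lane whose index differs in one bit.
      next[lane] = lanes[lane] + lanes[lane ^ offset];
    }
    lanes = next;
  }
  // After the final step every lane holds the sum over the whole subgroup.
  return lanes;
}
```

Calling the helper on eight lanes holding 1 through 8 leaves the full sum in every lane after three steps. On real hardware each step would be a cross-lane data movement, which is the part this commit maps to DPP ops when possible.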

---------

Signed-off-by: Muzammiluddin Syed <muzasyed at amd.com>

