[all-commits] [llvm/llvm-project] 905f1d: [mlir][AMDGPU] Implement gpu.subgroup_reduce with ...
Muzammil via All-commits
all-commits at lists.llvm.org
Wed Apr 23 17:37:55 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 905f1d8068a5bc1149732b46afc3f5dd780aa5d9
https://github.com/llvm/llvm-project/commit/905f1d8068a5bc1149732b46afc3f5dd780aa5d9
Author: Muzammil <55665739+Muzammiluddin-Syed-ECE at users.noreply.github.com>
Date: 2025-04-23 (Wed, 23 Apr 2025)
Changed paths:
M mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
M mlir/lib/Dialect/GPU/Transforms/SubgroupReduceLowering.cpp
M mlir/test/Dialect/GPU/subgroup-reduce-lowering.mlir
M mlir/test/lib/Dialect/GPU/TestGpuRewrite.cpp
Log Message:
-----------
[mlir][AMDGPU] Implement gpu.subgroup_reduce with DPP intrinsics on AMD GPUs (#133204)
When performing cross-lane reductions using subgroup_reduce ops across
contiguous lanes on AMD GPUs, lower to Data Parallel Primitives (DPP)
ops when possible. This reduces latency on applicable devices.
See related [Issue](https://github.com/iree-org/iree/issues/20007)
To do:
- Improve lowering to subgroup_reduce in compatible matvecs (these get
directly lowered to gpu.shuffles in an earlier pass)
---------
Signed-off-by: Muzammiluddin Syed <muzasyed at amd.com>
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list