[llvm] [AMDGPU] Don't cluster DS-instrs together (PR #180908)
via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 15 06:31:53 PST 2026
lijinpei-amd wrote:
> Can you add more details about the motivation for the PR and what it does?
Sure.
The motivation:
We were analyzing assembly generated for Triton/Gluon matmul kernels and found that when the ds_read instructions were not scheduled together, we got a 10% performance increase for our case. Further analysis showed that the ds_reads are scheduled together because of the schedule-DAG mutation and SIPostRABundler.
What it does:
The goal of this PR is to remove the cluster edges created by the DAG mutation and the bundles created by SIPostRABundler, so that the scheduler has more freedom to move LDS instructions around.
I have updated the PR description and uploaded the LLVM IR and assembly for my cases here: https://gist.github.com/lijinpei-amd/0d114c50891abe5bebb564e2084d357d
https://github.com/llvm/llvm-project/pull/180908