[llvm] [AMDGPU] Change default loop alignment (PR #155343)
Stanislav Mekhanoshin via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 26 13:19:34 PDT 2025
rampitec wrote:
> [triton-aiter.xlsx](https://github.com/user-attachments/files/21994782/triton-aiter.xlsx) Triton Aiter results, especially for unbatched Llama, see a substantial performance hit
>
> I see three options
>
> * Warn the user when they have sub-optimal alignment and give them a flag. It is not easy for the user to know up front whether the flag will help, though, since alignment can have side effects and a different impact on different architectures. I saw some cases where nops in unexecuted code paths changed the alignment of subsequent executed branch targets, hurting performance. This would also mean introducing yet another backend flag.
>
> * Abandon the PR
>
>
> Would like reviewer suggestions
The hit is really big; I do not think we can afford it.
Also, the current code mentions that pre-gfx10 targets do not benefit from the alignment, yet they are affected by this change.
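
To illustrate the side effect described in the quoted comment, here is a minimal sketch in Python of how padding inserted for one block can move a later block off a boundary it previously sat on. The block names, sizes, and the 32-byte boundary are made up for illustration; this is not the backend's layout logic.

```python
def layout(blocks):
    """Assign byte offsets to (name, size, align) blocks laid out in order."""
    offsets = {}
    pos = 0
    for name, size, align in blocks:
        pos += (-pos) % align  # padding (nops) needed to reach the alignment
        offsets[name] = pos
        pos += size
    return offsets

# Hypothetical block sizes in bytes; align=1 means no alignment requested.
before = layout([("entry", 20, 1), ("cold", 12, 1), ("loop", 40, 1)])
# Same code, but the rarely executed "cold" block is now aligned to 32 bytes.
after = layout([("entry", 20, 1), ("cold", 12, 32), ("loop", 40, 1)])

print(before["loop"] % 32, after["loop"] % 32)
```

In this made-up layout the loop header happens to start on a 32-byte boundary before the change and 12 bytes past one afterwards, even though the loop itself was never touched.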
https://github.com/llvm/llvm-project/pull/155343