[llvm] [AMDGPU] Define constrained multi-dword scalar load instructions. (PR #96161)
Stanislav Mekhanoshin via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 20 17:39:39 PDT 2024
rampitec wrote:
> I just wish to have some optimization here: most of these loads are from kernarg. We know that kernarg is page aligned (I guess?). We also know a minimal page size and kernarg size. So if kernarg size is no greater than page size, skip it. Or if kernarg is not page aligned, make it page aligned.
BTW, with that you will probably end up with exactly zero kernels falling into this category.
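A rough sketch of the check suggested above, in case it helps make the idea concrete. The helper names and parameters are hypothetical and not taken from the PR or from LLVM. It assumes the alignment and page size are powers of two, and that the real page size is a multiple of the assumed minimum.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not an LLVM API.
// True when the whole kernarg segment is guaranteed to sit inside a single
// page, so no load from it can cross a page boundary.
// Assumes KernargAlign and MinPageSize are powers of two.
bool kernargFitsInOnePage(uint64_t KernargAlign, uint64_t KernargSize,
                          uint64_t MinPageSize) {
  return KernargAlign >= MinPageSize && KernargSize <= MinPageSize;
}

// Hypothetical helper for the "make it page aligned" alternative: the
// alignment to request so that the check above can succeed.
uint64_t pageAlignedKernargAlign(uint64_t KernargAlign, uint64_t MinPageSize) {
  return std::max(KernargAlign, MinPageSize);
}
```

With a 4096-byte minimum page, a 256-byte kernarg segment aligned to 4096 passes the check, while the same segment aligned only to 16 does not until its alignment is raised.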
https://github.com/llvm/llvm-project/pull/96161