[llvm-branch-commits] [llvm] [LowerMemIntrinsics][AMDGPU] Optimize memset.pattern lowering (PR #185901)
Fabian Ritter via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Thu Mar 12 05:54:38 PDT 2026
ritter-x2a wrote:
@krzysz00 I've added some tests with memset.pattern on AS7 at the end of `test/CodeGen/AMDGPU/memset-pattern.ll`.
The result looks pretty bad to me, especially in the last case; I'm not sure what's happening there.
The memset.pattern lowering should produce a 64xi32 store in AS7 there, which the backend is then expected to lower into more convenient sizes.
It looks like that extra-wide store is lowered into a lot of loops, each one starting with 4 `v_readfirstlane`s. Is that due to the buffer fat pointer lowering?
If so, we might want to specify a narrower type in `GCNTTIImpl::getMemcpyLoopLoweringType` for AS7.
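To make the suggestion concrete, here is a rough standalone sketch of the idea, not actual LLVM code: it only models capping the per-iteration access width by address space. The helper name `pickLoopAccessBytes` and the 16-byte cap for AS7 are assumptions for illustration; the 256-byte default just mirrors the 64xi32 store mentioned above.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model, not the real GCNTTIImpl hook. Address-space numbers
// follow the AMDGPU convention: 1 is global, 7 is the buffer fat pointer AS.
constexpr unsigned kGlobalAS = 1;
constexpr unsigned kBufferFatPointerAS = 7;

// Assumed caps in bytes, chosen only for illustration.
constexpr uint64_t kDefaultMaxAccessBytes = 256;   // 64 x i32
constexpr uint64_t kFatPointerMaxAccessBytes = 16; // 4 x i32 (assumed)

// Returns the widest access, in bytes, that the loop lowering would emit per
// iteration for the given address space, never exceeding the total size.
uint64_t pickLoopAccessBytes(unsigned addrSpace, uint64_t totalBytes) {
  const uint64_t cap = (addrSpace == kBufferFatPointerAS)
                           ? kFatPointerMaxAccessBytes
                           : kDefaultMaxAccessBytes;
  return std::min(cap, totalBytes);
}
```

With a cap like this, AS7 would never see the extra-wide store in the first place, so the fat pointer lowering would not have to split it afterwards. Whether 16 bytes is the right width is an open question.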
https://github.com/llvm/llvm-project/pull/185901