[llvm-branch-commits] [llvm] [LowerMemIntrinsics][AMDGPU] Optimize memset.pattern lowering (PR #185901)

Fabian Ritter via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Thu Mar 12 05:54:38 PDT 2026


ritter-x2a wrote:

@krzysz00 I've added some tests with memset.pattern on AS7 at the end of `test/CodeGen/AMDGPU/memset-pattern.ll`.
The result looks pretty bad to me, especially in the last case; I'm not sure what's happening there.
The memset.pattern lowering should produce a `<64 x i32>` store in AS7 there, which the backend is then expected to lower into more convenient sizes.
It looks like that extra-wide store is lowered into a lot of loops, each one starting with 4 `v_readfirstlane`s. Is that due to the buffer fat pointer lowering?
If so, we might want to specify a narrower type in `GCNTTIImpl::getMemcpyLoopLoweringType` for AS7.
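
To make the suggestion concrete, here is a rough standalone sketch of the idea, not the actual `GCNTTIImpl` code. The helper names `pickStoreWidthBits` and `storesPerWideChunk` and the 128-bit cap are assumptions made up for illustration; only the 64 x i32 width and the AS7 number come from the discussion above.

```cpp
// Hypothetical constant: 7 is the buffer fat pointer address space on AMDGPU.
constexpr unsigned kBufferFatPointerAS = 7;

// Hypothetical helper: width in bits of the store type picked for a
// destination address space. The wide default mirrors the 64 x i32 store
// mentioned above; the 4 x i32 cap for AS7 is an assumed value, chosen
// only to illustrate "a narrower type".
constexpr unsigned pickStoreWidthBits(unsigned DestAS) {
  constexpr unsigned WideBits = 64 * 32;  // 2048 bits
  constexpr unsigned NarrowBits = 4 * 32; // 128 bits (assumption)
  return DestAS == kBufferFatPointerAS ? NarrowBits : WideBits;
}

// Number of stores of the picked width needed to cover one 64 x i32 chunk.
constexpr unsigned storesPerWideChunk(unsigned DestAS) {
  return (64 * 32) / pickStoreWidthBits(DestAS);
}
```

With such a cap, one wide chunk in AS7 would become 16 narrower stores emitted directly by the lowering instead of a single 2048-bit store that the buffer fat pointer lowering has to split up. Whether 128 bits is the right width is an open question.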

https://github.com/llvm/llvm-project/pull/185901

