[PATCH] D138751: [MemCpyOpt] Expand two memcpy's with clobber inbetween (PR59116)
Nikita Popov via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 7 07:18:23 PST 2022
nikic added a comment.
At a high level, I'd say that this transform would be a better fit for SROA. The profitability is clearer if we can actually eliminate the alloca and spill from the first memcpy, making this a single load and store.
================
Comment at: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:1195
+ const unsigned NeededRegs = divideCeil(8 * NumBytes, RegBitWidth);
+ if (NeededRegs > NumRegs)
+ return false;
----------------
So we want to use up *all* vector registers for the copy? With AVX-512 that is 32 registers of 64 bytes each, i.e. 64 * 32 = 2048 bytes. That seems *way* too aggressive.
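To make the arithmetic concrete, here is a minimal standalone sketch of the quoted check. The names divideCeilSketch, fitsInRegisters and maxAcceptedBytes are hypothetical and not part of the patch; divideCeilSketch is a local stand-in for llvm::divideCeil, and the AVX-512 values (512-bit registers, 32 of them) are plugged in by hand rather than queried from the target.

```cpp
#include <cstdint>

// Local stand-in for llvm::divideCeil; assumes Denominator != 0.
constexpr uint64_t divideCeilSketch(uint64_t Numerator, uint64_t Denominator) {
  return (Numerator + Denominator - 1) / Denominator;
}

// Mirrors the quoted condition: the copy is accepted as long as it needs
// no more than NumRegs registers of RegBitWidth bits each.
constexpr bool fitsInRegisters(uint64_t NumBytes, unsigned RegBitWidth,
                               unsigned NumRegs) {
  return divideCeilSketch(8 * NumBytes, RegBitWidth) <= NumRegs;
}

// Largest copy size in bytes that the condition above accepts
// (assumes RegBitWidth is a multiple of 8).
constexpr uint64_t maxAcceptedBytes(unsigned RegBitWidth, unsigned NumRegs) {
  return static_cast<uint64_t>(RegBitWidth / 8) * NumRegs;
}

// Assumed AVX-512 parameters: 512-bit registers, 32 of them.
static_assert(maxAcceptedBytes(512, 32) == 2048, "64 bytes * 32 registers");
static_assert(fitsInRegisters(2048, 512, 32), "2048 bytes needs exactly 32 registers");
static_assert(!fitsInRegisters(2049, 512, 32), "2049 bytes would need 33 registers");
```

With these inputs the check only starts rejecting copies above 2048 bytes, which is the limit I'm objecting to.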
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138751/new/
https://reviews.llvm.org/D138751