[llvm] [SystemZ] SLP reductions: cost functions of reductions and scalarization (PR #112491)
Jonas Paulsson via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 7 16:13:58 PST 2024
JonPsson1 wrote:
A slight regression (mcf) has kept this patch from being committed. The problem seems to be that when SLP thinks element loads are free, it turns two separate load/store pairs into VLREP; VLEG; VST (two element loads and a vector store). However, the SystemZ backend prefers MVC (memory-to-memory copy) even for 8 bytes, so this three-instruction sequence replaces two MVCs rather than four instructions.
Not sure how to best avoid this:
- Somehow make SLP avoid the cases where a memcpy would be preferred (would that be all load -> store sequences?). Perhaps a new hook like "preferMemCpy()"? There is TLI->getMaxStoresPerMemcpy(), which is somewhat similar.
- SystemZ backend could optimize this VST pattern into MVC.
- Make SLP pass the actual load instruction pointers to getScalarizationOverhead() so that the SystemZ implementation can recognize loads that are used by stores.
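
For illustration, a source pattern of roughly the following shape could lead to the sequence described above. This is only a sketch under assumptions: the `Node` type and the `copyFields` function are hypothetical names, not taken from mcf or from the patch, and whether SLP actually vectorizes such code depends on the cost model in use.

```cpp
#include <cstdint>

// Hypothetical type: two adjacent 8-byte fields, so the two stores below
// write one contiguous 16-byte region.
struct Node {
  int64_t First;
  int64_t Second;
};

// Two independent 8-byte loads from unrelated addresses, each feeding a store.
// Scalar lowering on SystemZ could use one MVC per field (two instructions).
// If SLP vectorizes the adjacent stores and treats the element loads as free,
// the result would be the VLREP; VLEG; VST sequence (three instructions).
void copyFields(Node *Dst, const int64_t *A, const int64_t *B) {
  Dst->First = *A;
  Dst->Second = *B;
}
```

Each load here has a store as its only user, which is the property the third option would let the SystemZ cost function detect.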
https://github.com/llvm/llvm-project/pull/112491