[llvm] [SystemZ] SLP reductions: cost functions of reductions and scalarization (PR #112491)

Jonas Paulsson via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 7 16:13:58 PST 2024


JonPsson1 wrote:

A slight regression (mcf) has gotten in the way of this patch being committed. The problem seems to be that when SLP thinks element loads are free, it turns two separate load/store pairs into VLREP; VLEG; VST (two element loads and a vector store). However, the SystemZ backend prefers MVC (a memory-to-memory copy) even for 8 bytes, so this three-instruction sequence actually replaces two MVCs, not the four instructions SLP assumes.

Not sure how to best avoid this:
- Somehow make SLP avoid these cases where a memcpy would be preferred (would that be all load -> store sequences?). Perhaps a new hook like "preferMemCpy()"? There is already TLI->getMaxStoresPerMemcpy(), which is somewhat similar.
- The SystemZ backend could optimize this VST pattern into MVC.
- Make SLP pass the actual load instruction pointers to getScalarizationOverhead() so that the SystemZ implementation can recognize loads that are only used by stores.
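
To make the cost mismatch concrete, here is a small toy model in plain C++. It does not use any LLVM API: `ToyLoad`, `foldsIntoMemCopy()` and the unit costs are made up for illustration, assuming one cost unit per instruction and that a load whose only user is a store folds into a single MVC.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a scalar load that SLP wants to use as a vector element.
// This is NOT llvm::LoadInst; it only records what the sketch needs.
struct ToyLoad {
  bool HasOneUse;   // the load has exactly one user
  bool UserIsStore; // that user is a plain store of the loaded value
};

// Hypothetical check: would the backend fold this load and its store into
// one memory-to-memory copy (MVC)?
bool foldsIntoMemCopy(const ToyLoad &L) {
  return L.HasOneUse && L.UserIsStore;
}

// Scalar cost: a load/store pair that folds into an MVC counts as one
// instruction, otherwise as two (load + store).
unsigned scalarCost(const std::vector<ToyLoad> &Loads) {
  unsigned Cost = 0;
  for (const ToyLoad &L : Loads)
    Cost += foldsIntoMemCopy(L) ? 1 : 2;
  return Cost;
}

// Vector cost: one element load per lane (VLREP / VLEG) plus one vector
// store (VST).
unsigned vectorCost(const std::vector<ToyLoad> &Loads) {
  return static_cast<unsigned>(Loads.size()) + 1;
}

// Vectorize only if the vector sequence is strictly cheaper.
bool vectorizationIsProfitable(const std::vector<ToyLoad> &Loads) {
  return vectorCost(Loads) < scalarCost(Loads);
}
```

With two loads that each feed only a store, the scalar cost in this model is 2 (two MVCs) against a vector cost of 3, so vectorizing loses. If the pairs are instead counted as four separate instructions, the same vector sequence looks profitable, which is the case that regresses mcf.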


https://github.com/llvm/llvm-project/pull/112491

