[PATCH] D114316: [X86][Costmodel] Now that `getReplicationShuffleCost()` is good, update `getInterleavedMemoryOpCostAVX512()`
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 29 04:04:45 PST 2021
lebedev.ri added inline comments.
================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:5351-5354
// About a half of the loads may be folded in shuffles when we have only
// one result. If we have more than one result, we do not fold loads at all.
unsigned NumOfUnfoldedLoads =
    NumOfResults > 1 ? NumOfMemOps : NumOfMemOps / 2;
----------------
Hmm, but we can't fold a masked load into a shuffle, can we?
The mask on the shuffle applies to the output, not the input.
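
To make the concern concrete, here is a minimal sketch of how the count might look if masked loads were never treated as foldable. The helper name `numUnfoldedLoads` and the `IsMasked` parameter are hypothetical, introduced only for illustration; they are not taken from X86TargetTransformInfo.cpp, and this is not a proposed patch.

```cpp
// Hypothetical helper, NOT part of X86TargetTransformInfo.cpp.
// Assumption: IsMasked says whether the interleaved load is masked.
// If it is, no load is counted as folded into a shuffle, because the
// shuffle's mask governs its output lanes, not the memory operand.
constexpr unsigned numUnfoldedLoads(unsigned NumOfResults,
                                    unsigned NumOfMemOps,
                                    bool IsMasked) {
  return (NumOfResults > 1 || IsMasked) ? NumOfMemOps   // nothing folds
                                        : NumOfMemOps / 2; // about half fold
}

// Unmasked, single result: half of the loads stay unfolded (existing logic).
static_assert(numUnfoldedLoads(1, 4, false) == 2, "unmasked, one result");
// Masked, single result: every load stays unfolded (the case in question).
static_assert(numUnfoldedLoads(1, 4, true) == 4, "masked, one result");
// More than one result: no folding regardless of masking (existing logic).
static_assert(numUnfoldedLoads(2, 4, false) == 4, "multiple results");
```

With `IsMasked` false this reduces to the quoted expression, so only the masked single-result case would change cost.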
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D114316/new/
https://reviews.llvm.org/D114316