[PATCH] D114316: [X86][Costmodel] Now that `getReplicationShuffleCost()` is good, update `getInterleavedMemoryOpCostAVX512()`

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 29 04:04:45 PST 2021


lebedev.ri added inline comments.


================
Comment at: llvm/lib/Target/X86/X86TargetTransformInfo.cpp:5351-5354
     // About a half of the loads may be folded in shuffles when we have only
     // one result. If we have more than one result, we do not fold loads at all.
     unsigned NumOfUnfoldedLoads =
         NumOfResults > 1 ? NumOfMemOps : NumOfMemOps / 2;
----------------
Hmm, but we can't fold a masked load into a shuffle, can we?
The mask on the shuffle applies to its output, not its input.
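
To make the concern concrete, here is a minimal standalone sketch of the quoted computation with the masked case treated as "never folded". It is not the patch's actual code: the helper name `numUnfoldedLoads` and the `IsMasked` parameter are hypothetical, introduced only for illustration, and the real function may track masking under a different name.

```cpp
// Hypothetical helper mirroring the quoted NumOfUnfoldedLoads logic.
// Assumption: IsMasked says whether the interleaved load is a masked load.
constexpr unsigned numUnfoldedLoads(unsigned NumOfMemOps,
                                    unsigned NumOfResults, bool IsMasked) {
  // A masked load cannot be folded into a shuffle, and with more than one
  // result no loads are folded either, so every memory op stays unfolded.
  if (IsMasked || NumOfResults > 1)
    return NumOfMemOps;
  // Otherwise about half of the loads may be folded into shuffles.
  return NumOfMemOps / 2;
}

// Compile-time checks of the three cases.
static_assert(numUnfoldedLoads(8, 1, false) == 4, "half fold, single result");
static_assert(numUnfoldedLoads(8, 2, false) == 8, "no folding, many results");
static_assert(numUnfoldedLoads(8, 1, true) == 8, "no folding when masked");
```

With this shape, a masked single-result load is costed the same as the multi-result case, i.e. all `NumOfMemOps` loads are counted as unfolded.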


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114316/new/

https://reviews.llvm.org/D114316


