[PATCH] D66071: [X86] Teach lowerV4I32Shuffle to only use broadcasts if the mask has more than one undef element. Prioritize shifts over broadcast in lowerV8I16Shuffle.
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 12 05:25:55 PDT 2019
RKSimon added a comment.
Is this going to interfere with folding AVX512 broadcast loads into an instruction at all?
More generally, a broadcast is preferable when the input is a foldable load (immediate shifts can't fold a load), but I think combineX86ShuffleChain should handle this.
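
To make the trade-off concrete, here is a minimal toy model in C++. It is not LLVM code: the names `Lowering`, `Mask4`, `matchesBroadcast`, and `pick` are hypothetical, and the policy is only an assumption drawn from the comment above (prefer a broadcast when the input is a foldable load, otherwise prefer a shift).

```cpp
#include <array>

// Hypothetical toy model, not LLVM code: all names and the policy are
// assumptions made for illustration only.
enum class Lowering { Broadcast, Shift };

// A 4-lane shuffle mask; -1 marks an undef lane.
using Mask4 = std::array<int, 4>;

// True if every defined lane reads element 0, i.e. a broadcast of
// element 0 would satisfy the mask.
bool matchesBroadcast(const Mask4 &mask) {
  for (int m : mask)
    if (m != -1 && m != 0)
      return false;
  return true;
}

// Choose between the two candidates for a mask that either could implement.
// A broadcast can take a foldable load as its operand; an immediate shift
// cannot, so the load would have to stay a separate instruction.
Lowering pick(bool inputIsFoldableLoad) {
  return inputIsFoldableLoad ? Lowering::Broadcast : Lowering::Shift;
}
```

In this sketch, `pick(true)` yields `Lowering::Broadcast` and `pick(false)` yields `Lowering::Shift`. Whether the real lowering needs such a check, or whether combineX86ShuffleChain already recovers the broadcast later, is the open question in the comment.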
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D66071/new/
https://reviews.llvm.org/D66071