[PATCH] D66071: [X86] Teach lowerV4I32Shuffle to only use broadcasts if the mask has more than one undef element. Prioritize shifts over broadcast in lowerV8I16Shuffle.

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 12 05:25:55 PDT 2019


RKSimon added a comment.

Is this going to interfere with folding AVX512 broadcast loads into an instruction at all?

More generally, a broadcast is preferable when the input is a foldable load (immediate shifts can't fold a load), but I think combineX86ShuffleChain should handle this.
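
To make the preference above concrete, here is a minimal sketch of the decision order. It is a toy model under stated assumptions: `ShuffleCandidate`, `Lowering`, and `chooseLowering` are hypothetical names invented for illustration. They are not LLVM APIs and do not reflect how lowerV8I16Shuffle or combineX86ShuffleChain is actually written.

```cpp
#include <optional>

// Toy model only: these names are hypothetical and are not LLVM APIs.
enum class Lowering { Broadcast, ImmediateShift };

struct ShuffleCandidate {
  bool InputIsFoldableLoad; // assumed: the source could fold as a memory operand
  bool MatchesBroadcast;    // assumed: the mask is implementable as a broadcast
  bool MatchesShift;        // assumed: the mask is implementable as an immediate shift
};

// Picks a lowering in the order suggested by the review comment.
std::optional<Lowering> chooseLowering(const ShuffleCandidate &C) {
  // A broadcast of a foldable load comes first, because the load can fold into it.
  if (C.MatchesBroadcast && C.InputIsFoldableLoad)
    return Lowering::Broadcast;
  // Otherwise prefer the shift over the broadcast, as the patch title describes.
  if (C.MatchesShift)
    return Lowering::ImmediateShift;
  // Fall back to a broadcast when no shift matches.
  if (C.MatchesBroadcast)
    return Lowering::Broadcast;
  // Neither pattern applies; a real lowering would try other strategies.
  return std::nullopt;
}
```

In this sketch, a candidate that matches both patterns is lowered as a broadcast only when its input is a foldable load. Whether the real code needs such a check at lowering time, or can leave it to a later combine, is exactly the open question in the comment.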


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66071/new/

https://reviews.llvm.org/D66071




