[PATCH] D33938: [x86] use vperm2f128 rather than vinsertf128 when there's a chance to fold a 32-byte load
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Jun 11 10:17:03 PDT 2017
spatel added a comment.
A couple of notes for reference:
1. There's another potential case for trying harder to recognize a splat mask in PR32007:
2. I looked at adding a VPERM2X128 case to combineTargetShuffle() that would turn this shuffle into X86ISD::SUBV_BROADCAST. It did produce the expected vbroadcastf128 instruction, but I'm not sure how the pattern matched, because I didn't do anything to shrink the loaded value (!).
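
For reference, the splat check that note 2 implies can be sketched in isolation. This is only an illustration of the vperm2f128 immediate encoding, not the code in combineTargetShuffle(); the helper name splatLaneOfVPerm2X128 is hypothetical. It assumes the documented imm8 layout: bits [1:0] select the 128-bit lane for the low half of the result, bits [5:4] select the lane for the high half, and bits 3 and 7 zero the low and high half respectively.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not an LLVM API): decode a vperm2f128/vperm2i128
// immediate and report whether both result halves read the same source lane.
//
// Lane numbering follows the instruction encoding:
//   0 = src1 low, 1 = src1 high, 2 = src2 low, 3 = src2 high.
// Returns the lane index if the immediate is a 128-bit splat, otherwise
// std::nullopt.
std::optional<unsigned> splatLaneOfVPerm2X128(uint8_t Imm) {
  // Bit 3 zeroes the low half and bit 7 zeroes the high half; a result with
  // a zeroed half is not a splat of a source lane.
  if (Imm & 0x88)
    return std::nullopt;

  unsigned LoSel = Imm & 0x3;        // lane feeding result bits [127:0]
  unsigned HiSel = (Imm >> 4) & 0x3; // lane feeding result bits [255:128]
  if (LoSel != HiSel)
    return std::nullopt;

  return LoSel;
}
```

When both source operands are the same value, lanes 0 and 2 alias, as do lanes 1 and 3, so a real combine would also want to normalize the selectors before comparing them. An immediate such as 0x20 (low half from src1, high half from the low lane of src2) is the vinsertf128-style pattern and is correctly rejected here as a non-splat.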