[PATCH] D69222: [X86] NFC: expand inline memcmp test coverage
David Zarzycki via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 21 02:51:52 PDT 2019
davezarzycki added a comment.
And now that I've had a chance to rebase D69044 <https://reviews.llvm.org/D69044> ("up to four load pairs") on top of this updated test file, I can report that:
1. The 48- and 96-byte memcmps do not improve for AVX2 or AVX512.
2. The AVX1 codegen is relatively reasonable for 48 bytes: three XMM compares. It could have been one YMM compare and one zero-extended XMM compare.
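For readers who want the 48-byte case in source form, here is a minimal sketch. It is not code from the patch or from the test file. The function names `equal48` and `equal48_split` are hypothetical, and the 32 + 16 split is only a byte-level picture of "one YMM compare and one XMM compare", not what the backend emits.

```c
#include <stdbool.h>
#include <string.h>

/* Equality-only memcmp with a constant size: the kind of call that
   inline memcmp expansion targets. */
bool equal48(const void *a, const void *b) {
    return memcmp(a, b, 48) == 0;
}

/* Hypothetical hand-written split: one 32-byte block (YMM-sized) and
   one 16-byte tail (XMM-sized). Both halves are always compared and
   the results are combined without a branch. */
bool equal48_split(const void *a, const void *b) {
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    bool head = memcmp(pa, pb, 32) == 0;
    bool tail = memcmp(pa + 32, pb + 32, 16) == 0;
    return head & tail;
}
```

Both functions return the same result for any pair of 48-byte buffers; the second just makes the two differently sized pieces explicit.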
I think I figured out why the 48- and 96-byte cases are awful: lowering a vector that is the result of a zero-extended scalar seems to generate terrible code. Should `combineVectorSizedSetCCEquality` detect the zero extend and create an `ISD::INSERT_SUBVECTOR` node? Or should something more fundamental detect this scenario and create the `ISD::INSERT_SUBVECTOR`?
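To make the zero-extend idea concrete, here is a byte-level model in plain C. It is only an illustration, not LLVM code: `widen16to32` and `tail16_equal_widened` are hypothetical names, and placing the 16 source bytes at the start of the buffer assumes the little-endian layout used on x86.

```c
#include <stdbool.h>
#include <string.h>

/* Model of widening a 16-byte value to 32 bytes: the first 16 bytes
   come from src and the remaining 16 bytes are zero. This is roughly
   what inserting a 16-byte subvector into an all-zero 32-byte vector
   would produce. */
void widen16to32(unsigned char out[32], const unsigned char *src) {
    memset(out, 0, 32);
    memcpy(out, src, 16);
}

/* Compare two 16-byte tails by widening both sides and doing a single
   32-byte equality compare. The zero padding is identical on both
   sides, so it never changes the result. */
bool tail16_equal_widened(const unsigned char *a, const unsigned char *b) {
    unsigned char wa[32];
    unsigned char wb[32];
    widen16to32(wa, a);
    widen16to32(wb, b);
    return memcmp(wa, wb, 32) == 0;
}
```

The point of the model is that the widened compare is equivalent to the narrow one, so the open question is only how cheaply the widening itself gets lowered.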
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D69222/new/
https://reviews.llvm.org/D69222