[PATCH] D69044: [X86] Allow up to 4 loads per inline memcmp()
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 18 05:19:30 PDT 2019
spatel added a subscriber: courbet.
spatel added a comment.
Herald added a subscriber: wuzish.
Can we make the x86 change to combineVectorSizedSetCCEquality() independently and before the change to TargetLowering?
Given that the previous attempt was reverted only because of perf, it would be good to show some perf data in this proposal, whether from a micro-benchmark or something more substantial. cc'ing @courbet in case there's already a test harness in place for that.
I haven't looked at this in a while, so I wonder if we now have the infrastructure within memcmp expansion to create the partial vector code with 'ptest' shown here:
https://bugs.llvm.org/show_bug.cgi?id=33914
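
For readers unfamiliar with the expansion under discussion, here is a rough scalar sketch of what an equality-only memcmp() of 32 bytes could look like when split into four load pairs. It is not taken from the patch or from LLVM: the names load64 and equal32 are hypothetical, and the actual x86 lowering may use vector loads and 'ptest' or 'pmovmskb' rather than 64-bit scalars. The point is only that the per-chunk differences are combined with xor/or so that a single compare decides the result.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an LLVM API): unaligned 8-byte load.
static inline std::uint64_t load64(const unsigned char *p) {
  std::uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Hypothetical illustration of "memcmp(a, b, 32) == 0" done with
// four load pairs and no early-exit branches.
bool equal32(const void *a, const void *b) {
  const auto *pa = static_cast<const unsigned char *>(a);
  const auto *pb = static_cast<const unsigned char *>(b);
  std::uint64_t diff = 0;
  for (int i = 0; i < 4; ++i) {
    // xor is nonzero where the chunks differ; or accumulates any difference.
    diff |= load64(pa + 8 * i) ^ load64(pb + 8 * i);
  }
  return diff == 0; // one final compare for the whole 32 bytes
}
```

This form only answers equal/not-equal; a memcmp() whose result is used for ordering needs a different expansion.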
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D69044/new/
https://reviews.llvm.org/D69044