[PATCH] D69044: [X86] Allow up to 4 loads per inline memcmp()

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 18 05:19:30 PDT 2019


spatel added a subscriber: courbet.
spatel added a comment.
Herald added a subscriber: wuzish.

Can we make the x86 change to combineVectorSizedSetCCEquality() independently and before the change to TargetLowering?

Given that the previous attempt was reverted only because of perf, it would be good to show some perf data here in the proposal, either a micro-benchmark or something more substantial. cc'ing @courbet in case there's already a test harness in place for that.

I haven't looked at this in a while, so I wonder if we now have the infrastructure within memcmp expansion to create the partial vector code with 'ptest' shown here:
https://bugs.llvm.org/show_bug.cgi?id=33914
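
For reference, here is a rough scalar sketch of the equality-only pattern I have in mind. It is not what the expansion pass emits today and not taken from the bug report; the function name equal32 and the fixed 32-byte size are just assumptions for illustration. The idea is to XOR corresponding chunks, OR the differences together, and test the result once. A vector version would presumably do the same with pxor/por and use ptest for the final all-zero check.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (illustration only): equality-only compare of exactly
 * 32 bytes, i.e. what "memcmp(a, b, 32) == 0" could look like with
 * 4 loads per operand and a single final test. */
static int equal32(const void *a, const void *b) {
    uint64_t x[4], y[4];

    /* memcpy with a constant size is the portable way to express
     * unaligned 8-byte loads; compilers typically turn it into plain loads. */
    memcpy(x, a, sizeof x);
    memcpy(y, b, sizeof y);

    /* XOR is nonzero wherever the chunks differ; OR accumulates the
     * differences so only one comparison against zero is needed. */
    uint64_t diff = (x[0] ^ y[0]) | (x[1] ^ y[1]) |
                    (x[2] ^ y[2]) | (x[3] ^ y[3]);

    return diff == 0;
}
```

This only answers equal/not-equal, so it applies to memcmp() calls whose result is compared against zero, not to the three-way ordering case.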


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69044/new/

https://reviews.llvm.org/D69044




