[PATCH] D69044: [X86] Allow up to 4 loads per inline memcmp()
    Sanjay Patel via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Fri Oct 18 05:19:30 PDT 2019
    
    
  
spatel added a subscriber: courbet.
spatel added a comment.
Herald added a subscriber: wuzish.
Can we make the x86 change to combineVectorSizedSetCCEquality() independently and before the change to TargetLowering?
Given that the previous attempt was reverted only because of perf, it would be good to show some perf data here in the proposal, either a micro-benchmark or something more substantial. cc'ing @courbet in case there's already a test harness in place for that.
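To make the request concrete, a micro-benchmark could look roughly like the sketch below. This is only an illustration, not an existing LLVM harness: the name nanosPerCall and the whole structure are hypothetical. It times equality-only memcmp() calls whose size is a compile-time constant, which is the case the inline expansion targets.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of LLVM): average wall-clock nanoseconds per
// equality-only memcmp() of N bytes. N is a template parameter so the size is
// a compile-time constant at the call site.
template <std::size_t N>
double nanosPerCall(const unsigned char *a, const unsigned char *b,
                    std::size_t iters, std::uint64_t &equalCount) {
  assert(iters > 0);
  // The pointers are re-read through volatile objects on every iteration.
  // The intent is to keep the compiler from hoisting the comparison out of
  // the loop, since it cannot assume the pointer values are unchanged.
  const unsigned char *volatile va = a;
  const unsigned char *volatile vb = b;
  std::uint64_t count = 0;
  auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iters; ++i) {
    const unsigned char *pa = va;
    const unsigned char *pb = vb;
    count += (std::memcmp(pa, pb, N) == 0);
  }
  auto stop = std::chrono::steady_clock::now();
  equalCount = count;  // number of iterations that compared equal
  std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iters);
}
```

Running something like nanosPerCall<16>, nanosPerCall<32> and nanosPerCall<64> on builds with and without the patch would give a first comparison. Numbers from a loop this small are noisy, so they would need to be backed up by something more substantial.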
I haven't looked at this in a while, so I wonder if we now have the infrastructure within memcmp expansion to create the partial vector code with 'ptest' shown here:
https://bugs.llvm.org/show_bug.cgi?id=33914
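For readers unfamiliar with the idea, here is a scalar sketch of the general shape of such an expansion. It is not the code from the bug above and not what the memcmp expansion pass emits: the helper names load64 and equal24to32 are made up for illustration. It compares 24 to 32 bytes for equality with four 8-byte loads from each buffer, letting the last load overlap the previous one, and ORs the XORed differences together so there is a single test at the end. A vector version would presumably use wider loads and finish with 'ptest' instead of a scalar compare.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: unaligned 8-byte load via memcpy, which avoids
// alignment and aliasing problems.
static inline std::uint64_t load64(const unsigned char *p) {
  std::uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Hypothetical helper: equality-only compare of n bytes, 24 <= n <= 32.
// Three loads cover bytes [0, 24); the fourth covers [n - 8, n) and may
// overlap the third. Re-comparing overlapped bytes is harmless for equality.
bool equal24to32(const void *pa, const void *pb, std::size_t n) {
  assert(n >= 24 && n <= 32);
  const unsigned char *a = static_cast<const unsigned char *>(pa);
  const unsigned char *b = static_cast<const unsigned char *>(pb);
  // XOR yields zero for matching chunks; OR accumulates any difference.
  std::uint64_t diff = (load64(a) ^ load64(b)) |
                       (load64(a + 8) ^ load64(b + 8)) |
                       (load64(a + 16) ^ load64(b + 16)) |
                       (load64(a + n - 8) ^ load64(b + n - 8));
  return diff == 0;  // one final test instead of a branch per load
}
```

The point of this shape is that there is no early-exit branch between loads, which is what makes allowing more loads per memcmp() a perf trade-off worth measuring.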
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69044/new/
https://reviews.llvm.org/D69044
    
    