[PATCH] D69044: [X86] Allow up to 4 loads per inline memcmp()

David Zarzycki via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 5 07:32:59 PST 2019


davezarzycki added a comment.

In D69044#1733886 <https://reviews.llvm.org/D69044#1733886>, @courbet wrote:

> In D69044#1728571 <https://reviews.llvm.org/D69044#1728571>, @courbet wrote:
>
> > I don't remember cases where we had very large constant compares (though we do have quite a lot of small ones). I'll run our internal benchmarks with this change.
>
>
> I've run our benchmarks, and I see no improvement from the change.


Thanks. I think we're almost ready to close this. Do your benchmarks test pre-SSE2 CPUs? In particular, 32-bit CPUs? Otherwise, as long as SSE vector registers are available, two load pairs cover the majority of inline memcmp scenarios.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69044/new/

https://reviews.llvm.org/D69044




