[PATCH] D69044: [X86] Allow up to 4 loads per inline memcmp()
David Zarzycki via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 5 07:32:59 PST 2019
davezarzycki added a comment.
In D69044#1733886 <https://reviews.llvm.org/D69044#1733886>, @courbet wrote:
> In D69044#1728571 <https://reviews.llvm.org/D69044#1728571>, @courbet wrote:
>
> > I don't remember cases where we had very large constant compares (though we do have quite a lot of small ones). I'll run our internal benchmarks with this change.
>
>
> I've run our benchmarks, and I see no improvement from the change.
Thanks. I think we're almost ready to close this. Do your benchmarks test pre-SSE2 CPUs, in particular 32-bit CPUs? Otherwise, as long as SSE vector registers are available, two load pairs cover the majority of inline memcmp scenarios.
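For readers following along, here is a rough sketch of what "two load pairs" means for an equality-only memcmp. It is illustrative C, not the code the backend emits; the names bytes_covered and equal16_two_load_pairs are hypothetical, and the 8-byte load width is an assumption chosen to keep the example portable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bytes compared by `pairs` non-overlapping load
   pairs, where each load is `width` bytes wide. */
static size_t bytes_covered(size_t width, size_t pairs) {
    return width * pairs;
}

/* Illustrative hand expansion of `memcmp(a, b, 16) == 0` using two load
   pairs of 8 bytes each. A "load pair" is one load from each buffer. */
static int equal16_two_load_pairs(const void *a, const void *b) {
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;
    uint64_t a0, a1, b0, b1;

    memcpy(&a0, pa, 8);      /* load pair 1: bytes 0..7  */
    memcpy(&b0, pb, 8);
    memcpy(&a1, pa + 8, 8);  /* load pair 2: bytes 8..15 */
    memcpy(&b1, pb + 8, 8);

    /* XOR is zero where the inputs match; OR folds both pairs into a
       single test against zero. */
    return ((a0 ^ b0) | (a1 ^ b1)) == 0;
}
```

By the same arithmetic, two load pairs of 16-byte SSE registers cover 32 bytes, while on a 32-bit CPU limited to 4-byte general-purpose loads two pairs cover only 8 bytes and four pairs cover 16.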
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D69044/new/
https://reviews.llvm.org/D69044