[PATCH] D55263: [CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads.
David Zarzycki via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 10 05:19:03 PST 2018
davezarzycki added a comment.
In D55263#1320043 <https://reviews.llvm.org/D55263#1320043>, @spatel wrote:
> I just looked over the codegen changes so far, but I want to add some more knowledgeable x86 hackers to have a look too. There are 2 concerns:
>
> 1. Are there any known uarch problems with overlapping loads?
> 2. Are there any known uarch problems with unaligned accesses (either scalar or SSE)?
>
> If there's any perf data (either nano-benchmarks or full apps) to support the changes, that would be nice to see. This reminds me of PR33329: https://bugs.llvm.org/show_bug.cgi?id=33329 (can we close that now?)
Let's not lose sight of the big picture here. If uarch problems exist, are they *worse* than the cost of calling memcmp()? In other words, are the likely register spills, function call overhead, and dynamic algorithm selection (given that the constantness of the size parameter is lost) worth it?
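
For concreteness, here is a rough C sketch of the kind of expansion being weighed against the library call. It is an illustration only, not the code this patch emits: the names `load32` and `equal7_overlapping` are hypothetical, the 7-byte size is an arbitrary example, and it handles only the equality case (not the ordered result memcmp() returns).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: an unaligned 4-byte load. Compilers typically lower
   this fixed-size memcpy to a single load instruction on x86. */
static uint32_t load32(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Equality-only comparison of exactly 7 bytes using two overlapping
   4-byte loads per buffer: bytes [0,4) and bytes [3,7). Byte 3 is
   covered twice, which is harmless for an equality test. */
static int equal7_overlapping(const void *a, const void *b) {
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    uint32_t lo = load32(pa) ^ load32(pb);         /* bytes 0..3 */
    uint32_t hi = load32(pa + 3) ^ load32(pb + 3); /* bytes 3..6 */
    return (lo | hi) == 0;                         /* 1 if all 7 bytes match */
}
```

Because the size is a compile-time constant here, there is no call, no argument setup, and no size-based dispatch; the open question is only whether the overlapping, possibly unaligned loads cost more than that.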
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D55263/new/
https://reviews.llvm.org/D55263