[llvm-bugs] [Bug 38103] New: Bad for loop copy optimization when using -fno-builtin-memcpy -fno-builtin-memmove
via llvm-bugs
llvm-bugs at lists.llvm.org
Mon Jul 9 06:44:48 PDT 2018
https://bugs.llvm.org/show_bug.cgi?id=38103
Bug ID: 38103
Summary: Bad for loop copy optimization when using
-fno-builtin-memcpy -fno-builtin-memmove
Product: clang
Version: 6.0
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: C++
Assignee: unassignedclangbugs at nondot.org
Reporter: gchatelet at google.com
CC: dgregor at apple.com, llvm-bugs at lists.llvm.org
A straightforward copy using a for loop with a size known at compile time leads to
very poor assembly when compiled with -fno-builtin-memcpy -fno-builtin-memmove.
> #include <cstddef>
>
> template <size_t kBlockSize>
> void Copy(char* __restrict dst, const char* __restrict src) {
>   for (size_t i = 0; i < kBlockSize; ++i) dst[i] = src[i];
> }
>
> template void Copy<15>(char* __restrict dst, const char* __restrict src);
https://godbolt.org/g/YFq3o6
This can be mitigated by the introduction of a temporary buffer like so:
> template <size_t kBlockSize>
> void Copy(char* __restrict dst, const char* __restrict src) {
>   char tmp[kBlockSize];
>   for (size_t i = 0; i < kBlockSize; ++i) tmp[i] = src[i];
>   for (size_t i = 0; i < kBlockSize; ++i) dst[i] = tmp[i];
> }
https://godbolt.org/g/48Dghk
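The two variants are meant to be behaviorally identical, so only the generated code should differ. A minimal sketch that checks this is given below. The names CopyDirect, CopyViaTmp and VariantsAgree are hypothetical and were introduced here so that both versions can live in one file; they are not part of the report. The check says nothing about code quality, which still has to be inspected in the assembly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using std::size_t;

// Direct loop, as in the first snippet of the report (renamed for this sketch).
template <size_t kBlockSize>
void CopyDirect(char* __restrict dst, const char* __restrict src) {
  for (size_t i = 0; i < kBlockSize; ++i) dst[i] = src[i];
}

// Temporary-buffer variant from the report (renamed for this sketch).
template <size_t kBlockSize>
void CopyViaTmp(char* __restrict dst, const char* __restrict src) {
  char tmp[kBlockSize];
  for (size_t i = 0; i < kBlockSize; ++i) tmp[i] = src[i];
  for (size_t i = 0; i < kBlockSize; ++i) dst[i] = tmp[i];
}

// Hypothetical helper: returns true when both variants copy exactly
// kBlockSize bytes and leave the sentinel byte after the block untouched.
template <size_t kBlockSize>
bool VariantsAgree() {
  char src[kBlockSize + 1];
  char a[kBlockSize + 1];
  char b[kBlockSize + 1];
  for (size_t i = 0; i <= kBlockSize; ++i) {
    src[i] = static_cast<char>('A' + i % 26);
    a[i] = b[i] = '#';  // sentinel, must survive past the copied block
  }
  CopyDirect<kBlockSize>(a, src);
  CopyViaTmp<kBlockSize>(b, src);
  return std::memcmp(a, src, kBlockSize) == 0 &&
         std::memcmp(b, src, kBlockSize) == 0 &&
         a[kBlockSize] == '#' && b[kBlockSize] == '#';
}
```

Calling VariantsAgree<15>() or VariantsAgree<32>() from a test driver should return true regardless of which code the compiler emits for each variant.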
This workaround produces good code for sizes up to 25 bytes but produces bad code from 26 bytes onwards.
For instance, see the resulting code for 32 bytes: https://godbolt.org/g/jZwcrv
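To compare the sizes on either side of the threshold locally, one option is to explicitly instantiate the temporary-buffer variant at each size so that every instantiation appears as its own symbol in the assembly output. The sketch below assumes a file name (copy.cc) and an optimization level (-O3) that are not stated in the report; only the two -fno-builtin flags come from the bug title.

```cpp
// Possible invocation (copy.cc and -O3 are assumptions, not from the report):
//   clang++ -O3 -S -fno-builtin-memcpy -fno-builtin-memmove copy.cc -o -
#include <cstddef>

using std::size_t;

// Temporary-buffer variant, as quoted in the report.
template <size_t kBlockSize>
void Copy(char* __restrict dst, const char* __restrict src) {
  char tmp[kBlockSize];
  for (size_t i = 0; i < kBlockSize; ++i) tmp[i] = src[i];
  for (size_t i = 0; i < kBlockSize; ++i) dst[i] = tmp[i];
}

// One explicit instantiation per size of interest: the last size reported
// as good (25), the first reported as bad (26), and the 32-byte example.
template void Copy<25>(char* __restrict dst, const char* __restrict src);
template void Copy<26>(char* __restrict dst, const char* __restrict src);
template void Copy<32>(char* __restrict dst, const char* __restrict src);
```

Comparing the bodies of the three instantiations side by side should show where the generated code changes shape.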