[llvm-bugs] [Bug 25899] New: Loads and Stores are not always coalesced
via llvm-bugs
llvm-bugs at lists.llvm.org
Sun Dec 20 01:21:28 PST 2015
https://llvm.org/bugs/show_bug.cgi?id=25899
Bug ID: 25899
Summary: Loads and Stores are not always coalesced
Product: libraries
Version: 3.7
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: haneef503 at gmail.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
Clang (or LLVM?) sometimes generates inefficient code for loads and stores, even
though it recognizes in other contexts that *the same code* can be optimized
into fewer loads/stores. For example, take this simple code:
```
#include <stdint.h>
int l32 (const uint8_t *b) {
int r = 0;
r ^= b[0];
r ^= b[1] << 8;
r ^= b[2] << 16;
r ^= b[3] << 24;
return r;
}
int f (int a) {
return l32 ((void *) &a);
}
```
`clang -O2` generates (clang 3.7, intel syntax, extraneous contents removed):
    l32:
        movzx   eax, byte ptr [rdi]
        movzx   ecx, byte ptr [rdi + 1]
        shl     ecx, 8
        or      ecx, eax
        movzx   edx, byte ptr [rdi + 2]
        shl     edx, 16
        or      edx, ecx
        movzx   eax, byte ptr [rdi + 3]
        shl     eax, 24
        or      eax, edx
        ret
    f:
        mov     eax, edi
        ret
Since it was able to optimize f() to a simple register move, it must have
recognized that the four byte loads could be coalesced into a single load (or
that a little-endian load was being compiled for an architecture that happens
to be little endian). Hence, it is quite odd that it didn't perform the same
optimization on l32() itself and reduce it to something more like:
    l32:
        mov     eax, dword ptr [rdi]
        ret
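
For comparison, here is a sketch of two ways to write the same load. It is not part of the original test case, and the names `load_le32_bytes`, `load_host32_memcpy` and `check_loads` are made up for illustration. The first is the byte-by-byte form from above, rewritten with unsigned arithmetic so that shifting a byte >= 0x80 left by 24 is well defined. The second uses a fixed-size `memcpy`, which compilers commonly lower to a single 32-bit load at -O2; note that it reads in host byte order, so it only matches the first form on a little-endian target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-by-byte little-endian load (the pattern from the report),
   using uint32_t so every shift is well defined. */
uint32_t load_le32_bytes(const uint8_t *b) {
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Hypothetical alternative: copy 4 bytes into a uint32_t.
   The result is in HOST byte order, not necessarily little endian. */
uint32_t load_host32_memcpy(const uint8_t *b) {
    uint32_t r;
    memcpy(&r, b, sizeof r);
    return r;
}

/* Small self-check that holds on any byte order. */
void check_loads(void) {
    const uint8_t le[4] = { 0x78, 0x56, 0x34, 0x12 };
    assert(load_le32_bytes(le) == 0x12345678u);

    /* Round trip: store a value in host order, read it back. */
    uint32_t v = 0xDEADBEEFu;
    uint8_t buf[4];
    memcpy(buf, &v, sizeof v);
    assert(load_host32_memcpy(buf) == v);
}
```

Compiling both functions with `clang -O2` and comparing the output is a quick way to see whether a given compiler version coalesces the byte-wise form.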