<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - passing a compile-time constant value to a struct { float f[4]; } arg uses two 16B loads instead of two 8B zero-extending loads, wasting space on constants"
href="https://llvm.org/bugs/show_bug.cgi?id=30585">30585</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>passing a compile-time constant value to a struct { float f[4]; } arg uses two 16B loads instead of two 8B zero-extending loads, wasting space on constants
</td>
</tr>
<tr>
<th>Product</th>
<td>new-bugs
</td>
</tr>
<tr>
<th>Version</th>
<td>3.9
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>new bugs
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>peter@cordes.ca
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Clang wastes space on zero padding that could be generated for free by a
narrower load (like gcc uses).

// clang3.9 -O3: <a href="https://godbolt.org/g/9L5HnG">https://godbolt.org/g/9L5HnG</a>
struct foo { float f[4]; };
void ext(struct foo A);
void pass_args() {
    struct foo A = { {1, 2, 3, 4} };
    ext(A);
}

        movaps  xmm0, xmmword ptr [rip + .LCPI0_0]  # xmm0 = <1,2,u,u>
        movaps  xmm1, xmmword ptr [rip + .LCPI0_1]  # xmm1 = <3,4,u,u>
        jmp     ext                                 # TAILCALL

.LCPI0_0:
        .long   1065353216      # float 1
        .long   1073741824      # float 2
        .zero   4
        .zero   4
... and another 16B vector.

It would be more efficient to use MOVSD to load 8 bytes, which also zeroes the
upper half of the register.

gcc uses MOVQ (and narrower constants), but if any CPUs care about integer vs.
float domains for loads, MOVSD is the better choice. It definitely has no
downsides vs. MOVQ for this purpose: same number of instruction bytes and the
same zeroing of the upper 8 bytes (when the source is a memory operand).

MOVAPS is actually one byte shorter per load, but that's probably less
important than shrinking each constant from 16 bytes to 8.
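
As a rough illustration of why the narrower load is enough, here is a minimal C
sketch using GNU vector extensions (gcc and clang). The names v4sf, padded_lo,
compact_lo, load_padded and load_compact are hypothetical and do not come from
any compiler output. The sketch only shows that an 8-byte copy into a zeroed
16-byte vector yields the same bits as a 16-byte copy from a zero-padded
constant; it makes no claim about which instructions a compiler picks for it.

```c
/* Hypothetical sketch: 16-byte padded constant vs. 8-byte compact constant. */
typedef float v4sf __attribute__((vector_size(16)));

/* Padded form: 16 bytes of constant data, upper half is explicit zeros. */
static const float padded_lo[4] = {1.0f, 2.0f, 0.0f, 0.0f};

/* Compact form: only the 8 bytes that actually carry data. */
static const float compact_lo[2] = {1.0f, 2.0f};

/* Models the 16-byte load from the padded constant. */
v4sf load_padded(void) {
    v4sf v;
    __builtin_memcpy(&v, padded_lo, sizeof v);
    return v;
}

/* Models an 8-byte load whose upper 8 bytes are zeroed. */
v4sf load_compact(void) {
    v4sf v = {0.0f, 0.0f, 0.0f, 0.0f};
    __builtin_memcpy(&v, compact_lo, sizeof compact_lo);
    return v;
}
```

Both functions return the same 16 bytes, while compact_lo occupies 8 bytes of
constant data instead of the 16 used by padded_lo.
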
Sorry for the long title, maybe it's too specific.</pre>
</div>
</p>
</body>
</html>