<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Missed optimization: Inlining behavior can lead to larger code with -Oz than other optimization levels"
href="https://bugs.llvm.org/show_bug.cgi?id=46681">46681</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Missed optimization: Inlining behavior can lead to larger code with -Oz than other optimization levels
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Interprocedural Optimizations
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>vlad@tsyrklevich.net
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>The following example demonstrates this problem. Note that the call to f(a)
with an uninitialized value is somewhat nonsensical--I blame creduce for that:
__attribute__((__noreturn__)) void assert();
void* d(void *e) { return e; }
void f(void *g) {
  if (g)
    assert();
}
static void compiles_to_ret() {
  void *a;
  f(a);
}
int example() {
  typedef void (*i)();
  i j = d(compiles_to_ret);
  j();
  return 2;
}
With clang -arch arm64 -Oz, example compiles to:
0000000000000018 _example:
18: fd 7b bf a9 stp x29, x30, [sp, #-16]!
1c: fd 03 00 91 mov x29, sp
20: 00 00 00 94 bl _compiles_to_ret
24: 40 00 80 52 mov w0, #2
28: fd 7b c1 a8 ldp x29, x30, [sp], #16
2c: c0 03 5f d6 ret
0000000000000030 _compiles_to_ret:
30: c0 03 5f d6 ret
With -Os, it compiles to:
0000000000000018 _example:
18: 40 00 80 52 mov w0, #2
1c: c0 03 5f d6 ret
Using -mllvm -inline-threshold=10 fixes this trivial example; however, in the
original source code, which I can't share, it takes an inline threshold of over
100 to cause compiles_to_ret to be inlined. The following fixes the issue by
running the optimization pipeline twice:
$ clang -arch arm64 ex.c -Oz -S -emit-llvm -o ex.ll
$ clang -arch arm64 -Oz -c ex.ll -o ex.o</pre>
</div>
</p>
</body>
</html>