[llvm-bugs] [Bug 45566] New: [coroutines] coroutine frame layout doesn't reuse storage for local variables with non-overlapping lifetimes
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Apr 15 21:52:28 PDT 2020
https://bugs.llvm.org/show_bug.cgi?id=45566
Bug ID: 45566
Summary: [coroutines] coroutine frame layout doesn't reuse
storage for local variables with non-overlapping
lifetimes
Product: clang
Version: trunk
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: C++2a
Assignee: unassignedclangbugs at nondot.org
Reporter: lewissbaker at gmail.com
CC: blitzrakete at gmail.com, erik.pilkington at gmail.com,
llvm-bugs at lists.llvm.org, richard-llvm at metafoo.co.uk
The calculation of the coroutine-frame layout does not seem to take advantage
of the fact that lifetimes of some local variables do not overlap and so can
share storage. This leads to larger coroutine frame sizes than is necessary.
For example, see this compiler-explorer example: https://godbolt.org/z/JjRnoQ
This shows the following code (as of clang 10/trunk with -O2 or -O3)
------
#include <cppcoro/task.hpp>
struct big_structure {
big_structure();
char dummy[500];
};
void process(big_structure&);
void process2(big_structure&);
cppcoro::task<void> something();
cppcoro::task<void> simple() {
// 552 byte coroutine frame
big_structure a;
process(a);
co_await something();
}
cppcoro::task<void> alternative_paths(bool cond) {
// 1072 byte coroutine frame
if (cond) {
big_structure a;
process(a);
co_await something();
} else {
big_structure b;
process2(b);
co_await something();
}
}
cppcoro::task<void> consecutive_paths() {
// 1072 byte coroutine frame
{
big_structure a;
process(a);
co_await something();
}
{
big_structure b;
process2(b);
co_await something();
}
}
void normal_function(bool cond) {
// 512 byte stack frame
if (cond) {
big_structure a;
process(a);
} else {
big_structure b;
process2(b);
}
}
-----
The coroutines alternative_paths() and consecutive_paths() both define two
big_structure local variables (500 bytes each).
The lifetime of these variables do not overlap and so the compiler should be
able to reuse the same storage for both
of those variables. However, what we observe with the example above is that the
coroutine frames allocate separate
storage for these two variables, resulting in a coroutine frame size of ~1000
bytes instead of one of ~500 bytes.
Ideally, the same algorithms that optimise the layout and size of stack-frames
would be applied to the coroutine
frame layout calculation to allow it to reuse storage within the coroutine
frame for local variables whose lifetimes
do not overlap.
We can see this behaviour with the normal_function() example, which, despite
having two local variables of size 500 bytes,
only allocates a stack-frame of 512 bytes, as the lifetimes of these local
variables do not overlap.
Implementing this optimisation would reduce the size of heap-allocations
required for coroutine
frames for many non-trivial coroutine functions.
The current logic for calculating the coroutine frame layout is fairly naive.
See
https://github.com/llvm/llvm-project/blob/e13a8a1fc56837e2f21b85b89a445fb4f21500d6/llvm/lib/Transforms/Coroutines/CoroFrame.cpp#L574
It just iteratively adds each spilled alloca as a field in the coroutine-frrame
struct. It does not take into account the lifetime of the variable for each
alloca.
--
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20200416/28794709/attachment.html>
More information about the llvm-bugs
mailing list