[cfe-dev] Optimizing returning a struct instance larger than the quadword
Denis Sukhonin via cfe-dev
cfe-dev at lists.llvm.org
Tue Jan 23 03:10:52 PST 2018
Hi list,
I am doing some research on my own, trying to understand the best way to
report an error from a function.
My environment:
* exceptions are disabled with -fno-exceptions,
* x86_64 Clang 5.0.1,
* C++17, and
* macOS 10.12.
Suppose there is a function which can fail: `FailingFn'. As a simple
and quite common solution, I could give it this signature: `bool
FailingFn(Args…, Error &err)'. This way the bool is returned in a
register, and the err object is allocated on the stack. So I can
avoid accessing memory if the returned bool is true (meaning success).
However, since we have got C++17 with copy elision and structured
bindings, I want to simplify the signature to `std::tuple<bool, Error>
FailingFn(Args…)', or even `RetValue FailingFn(Args…)'. Then I can
handle errors this way:
    if (auto [ok, err] = FailingFn(args…); !ok)
        // Handle error or, perhaps, just return it.
        return err;
This looks more expressive. With a few changes, we can make the `err'
object a variant and have it carry the result of a successful
evaluation. For the sake of simplicity, though, let's assume it carries
error information, e.g., a `std::string', which is obviously larger
than a 64-bit register.
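
A minimal sketch of the variant idea, assuming C++17's `std::variant`. The names `Result', `CheckPort', `PortOrMinusOne', and `DemoCheckPort' are hypothetical and chosen only for illustration; the second alternative carries the error text, the first carries the successful result.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical result type: either the computed value or an error message.
template <typename T>
using Result = std::variant<T, std::string>;

// Hypothetical function that may fail: accepts only ports in [1, 65535].
Result<int> CheckPort(int const port) {
  if (port < 1 || port > 65535) {
    return std::string{"port out of range"};
  }
  return port;
}

// Returns the port on success and -1 on failure.
int PortOrMinusOne(int const port) {
  auto const result = CheckPort(port);
  if (auto const *value = std::get_if<int>(&result)) {
    return *value;
  }
  return -1;
}

// Usage sketch: both outcomes go through the same call site.
void DemoCheckPort() {
  assert(PortOrMinusOne(443) == 443);
  assert(PortOrMinusOne(-5) == -1);
}
```

Like the tuple form, such a variant is wider than one register, so it raises the same question about how the value is returned.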
I am hoping clang recognizes the case of copy elision, constructs the
error object in the caller, and passes it by reference. Also, having
the returned tuple broken into two independent variables would permit
the compiler to use registers for them, at least for the first one,
which fits in a register (I do not care about the second one until it
carries a payload, though).
Here is a sample. Assume we have a structure and some functions
that may fail:
    #include <cstdint>
    #include <string>

    // Result wrapper: a success flag plus the payload.
    template <typename T>
    struct Pair {
      bool ok;
      T value;
    };

    Pair<std::int64_t> FuncInt(bool const toFail) {
      return {!toFail, 42};
    }

    Pair<std::string> FuncString(bool const toFail) {
      return {!toFail, "DEADBEEF"};
    }

    auto UseInt(bool const flag) {
      if (auto const [ok, value] = FuncInt(flag); ok) {
        return value;
      }
      // Same type as `value`, so `auto` deduces one return type even
      // where std::int64_t is not `long`.
      return std::int64_t{-1};
    }

    auto UseString(bool const flag) {
      if (auto const [ok, value] = FuncString(flag); ok) {
        return value;
      }
      return std::string{"DEADFA11"};
    }
The optimization works with `FuncInt': `ok' and `value' end up in
`al' and `rdx'. The check compiles into:
    call FuncInt(bool)
    and al, 1
But in the case of `FuncString', this does not happen. Here is what I get:
    call FuncString[abi:cxx11](bool)
    cmp byte ptr [rsp], 0
which obviously compares against memory. My intention is to avoid this
redundant read from memory and use a register instead, as with
`FuncInt'.
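
One way to see why the two cases differ is to check the properties the calling convention looks at. The sketch below assumes the x86-64 System V ABI together with the Itanium C++ ABI, under which a class with a non-trivial copy constructor or destructor, or one larger than 16 bytes, is returned through a hidden pointer to caller-provided memory instead of in registers. `Pair' is repeated from the sample so the block stands alone.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

template <typename T>
struct Pair {
  bool ok;
  T value;
};

// Trivially copyable and at most 16 bytes: eligible for return in registers.
static_assert(std::is_trivially_copyable_v<Pair<std::int64_t>>);
static_assert(sizeof(Pair<std::int64_t>) <= 16);

// std::string has a non-trivial copy constructor and destructor, so the
// pair is not trivially copyable; it is also larger than 16 bytes.
static_assert(!std::is_trivially_copyable_v<Pair<std::string>>);
static_assert(sizeof(Pair<std::string>) > 16);
```

If these assertions hold on the target, the `ok' flag of `Pair<std::string>' lives in the caller's return slot, which matches the `cmp byte ptr [rsp], 0' above.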
Is there a way to tell clang that instances of Pair should (or at
least can) be broken into two separate variables and returned in the
most efficient way?
I believe this optimization won't work perfectly with the current ABI,
though the compiler should not limit itself to the spec if the call is
to a non-exposed function or `-flto' is used.
Here is a sample with assembly: https://godbolt.org/g/EKUqaF
In other words, I want `Pair<std::string> FuncString(bool const)' to
behave like `bool FuncString(bool const, std::string &value)' while
keeping the expressiveness of C++17, including structured bindings.
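
For comparison, the hand-written out-parameter form could look like the sketch below. `FuncStringOut', `UseStringOut', and `DemoUseStringOut' are hypothetical names, not part of the original sample. The flag is the function's return value, so it comes back in a register, and the string is written through the reference only on success.

```cpp
#include <cassert>
#include <string>

// Hypothetical out-parameter variant of FuncString: the bool is the return
// value and the payload is written into caller-owned storage.
bool FuncStringOut(bool const toFail, std::string &value) {
  if (toFail) {
    return false;
  }
  value = "DEADBEEF";
  return true;
}

// Mirrors UseString from the sample, without structured bindings.
std::string UseStringOut(bool const flag) {
  std::string value;
  if (FuncStringOut(flag, value)) {
    return value;
  }
  return std::string{"DEADFA11"};
}

// Usage sketch covering the success and failure paths.
void DemoUseStringOut() {
  assert(UseStringOut(false) == "DEADBEEF");
  assert(UseStringOut(true) == "DEADFA11");
}
```

The cost of this form is at the call site: the caller has to default-construct `value' up front, and the structured-binding syntax is lost.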
As I understand it, this problem is very similar to "scalar replacement
of aggregates," but playing around with the optimizer's options didn't
give me any positive outcome or insight.
I don't have any specific requirements regarding the target OS and
architecture besides it being x86. If it is possible to get this done
in a generic way, I'm happy to know; if it works only in a very specific
environment, I'm still glad to know. Perhaps I can try to implement this
optimization with your help if it looks interesting.
--
Best regards,
Denis Sukhonin