[cfe-dev] Reaching the end of a value-returning function in C++

Chandler Carruth chandlerc at google.com
Thu Oct 25 15:31:08 PDT 2012


On Thu, Oct 25, 2012 at 3:02 PM, John McCall <rjmccall at apple.com> wrote:
> On Oct 25, 2012, at 2:29 PM, Chandler Carruth wrote:
>> On Thu, Oct 25, 2012 at 2:18 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>> On Thu, Oct 25, 2012 at 10:54 AM, Argyrios Kyrtzidis
>>> <kyrtzidis at apple.com> wrote:
>>>>
>>>> On Oct 16, 2012, at 10:15 AM, Argyrios Kyrtzidis <kyrtzidis at apple.com>
>>>> wrote:
>>>>
>>>> On Oct 16, 2012, at 9:04 AM, John McCall <rjmccall at apple.com> wrote:
>>>>
>>>> On Oct 15, 2012, at 10:26 PM, Argyrios Kyrtzidis wrote:
>>>>
>>>> On Oct 15, 2012, at 9:45 PM, John McCall <rjmccall at apple.com> wrote:
>>>>
>>>> On Oct 15, 2012, at 9:34 PM, Richard Smith wrote:
>>>>
>>>> On Mon, Oct 15, 2012 at 11:39 AM, Argyrios Kyrtzidis <kyrtzidis at apple.com>
>>>> wrote:
>>>>>
>>>>> Unless I'm missing something, this will benefit functions that are not
>>>>> checked with -Wreturn-type and are supposed to be unreachable in some path
>>>>> but are not marked as such.
>>>>>
>>>>> I'd prefer that these functions are actually marked as 'unreachable' in
>>>>> source code, instead of depending on the compiler implicitly assuming that
>>>>> in order to get such an optimization.
>>>>
>>>>
>>>> I agree, but if they're not marked 'unreachable' in the source code, what IR
>>>> would you want to produce for code paths which fall off the end?
>>>> @llvm.trap() at -O0 and unreachable otherwise seems reasonable to me; would
>>>> you prefer something else? (Perhaps always emitting a call to @llvm.trap?)
>>>>
>>>>
>>>> FWIW, I endorse using 'unreachable' here outside of -O0.
>>>>
>>>>
>>>> Compared to 'unreachable', I prefer always emitting a call to @llvm.trap.
>>>>
>>>> Please keep in mind that there's debugging and investigation of crash
>>>> reports from -Os/-O2 code as well.
>>>> I didn't yet see an argument that there's enough optimization opportunity in
>>>> practical terms to justify the havoc that 'unreachable' will cause with a
>>>> buggy function.
>>>> Valid code is, in reality, going to use 'unreachable' marks and 'noreturn'
>>>> functions, so all we are going to achieve is "speed up" buggy code,
>>>> relinquishing any hope of finding the bug or figuring out what is going on
>>>> in general.
>>>>
>>>>
>>>> Is there a case where we wouldn't actually warn before doing this?  Buggy
>>>> C++ system headers?
>>>>
>>>> John.
>>>>
>>>>
>>>> If you believe that the warning will catch all such bugs, then the
>>>> "optimization" is actually useless (as in "it won't get used").
>>>>
>>>> If, on the other hand, you believe that warnings are not a panacea and
>>>> there's a chance that such a bug will slip by (one already did), which is
>>>> when the 'unreachable' will kick in, then at best we are obfuscating the
>>>> bug and at worst we are creating a mess.
>>>> Is there a case where this is worth it?
>>>>
>>>>
>>>> Ping: are there objections to emitting a trap instead of unreachable?
>>>
>>> No objection from me.
>>
>> I think we should use unreachable.
>>
>> This is a very real optimization that can have significant performance
>> impacts. Notably, it allows us to delete substantial amounts of code
>> that the C and C++ standards say *will not be executed*. We should
>> take full advantage of this.
>
> Is it a real optimization in practice, though?  This situation can't be formed
> or exposed by optimization;  you actually have to have a single function
> with a reachable implicit return site.  Where is this supposed real code
> that does this intentionally but cannot use noreturn attributes and
> unreachable markers?

Oh, the code *can* use unreachable markers, but it's not clear that we
should just give up on all of these optimization opportunities if the
user fails to use them.

Think about everywhere in LLVM's codebase that has a "covered" switch
on an enum. Technically there is a branch that falls off the end
(unless the enum has a power-of-two number of enumerators), and without
the llvm_unreachable we add everywhere, the optimizer preserves this.
We could require users to add such annotations, but "making debugging
crashes of optimized binaries easier" seems a weak argument. LLVM is
replete with optimizations which will make code with undefined
behavior crash in new and surprising ways. Avoiding this one is a drop
in the ocean.


>> To answer why we need the semantic unreachable to get these
>> optimization opportunities: in a word, inlining. When inlining
>> collapses the CFG of a function, having the unreachable hint can be
>> essential to selecting the proper representation.
>
> Can you expand on this?  How does inlining create this opportunity?

Using the example above of a switch over an enum, let's imagine that
after inlining the optimizer proves that the high bit is set in the
input, and the highest enumerator is that value: the top bit, and all
zeros. Now, if we have the unreachable, we can prove that there is a
single path through the CFG. If we don't, we have to assume that the
value might be *larger* than the largest enumerator.

Now, we might get lucky, and be able to deduce this from trap as well,
but it's going to be much harder depending on the code leading up to
the trap. Even if we succeed in deducing this from the trap, we'll
always preserve the test in case we need to actually do the trap
operation.
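
To make the inlining scenario concrete, here is a rough sketch. The names
Kind, weight, and weightIfHighBit are hypothetical, and
__builtin_unreachable() assumes GCC or Clang. Whether a given optimizer
actually performs the folding described in the comments is not guaranteed;
the sketch only shows what the unreachable would permit.

```cpp
// Five enumerators occupy three bits; the highest, K4, is 0b100: the top
// bit set and all other bits zero.
enum Kind { K0, K1, K2, K3, K4 };

static inline int weight(Kind K) {
  switch (K) {
  case K0: return 10;
  case K1: return 11;
  case K2: return 12;
  case K3: return 13;
  case K4: return 14;
  }
  // Tells the optimizer that values 5, 6 and 7 never reach this point.
  __builtin_unreachable();
}

int weightIfHighBit(unsigned Raw) {
  unsigned Bits = Raw & 7u;   // stay within the enum's three-bit range
  if ((Bits & 4u) == 0)
    return 0;
  // After inlining weight(), the top bit is known to be set here, so Bits
  // is one of 4, 5, 6, 7. With the unreachable, only 4 remains possible and
  // the switch may collapse to a single path returning 14. With a trap, the
  // test for 5, 6 and 7 has to be kept so the trap can still fire.
  return weight(static_cast<Kind>(Bits));
}
```

For well-defined inputs, weightIfHighBit(4) returns 14 and
weightIfHighBit(3) returns 0.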


