[cfe-dev] Reaching the end of a value-returning function in C++

Thu Oct 25 16:12:52 PDT 2012

On Oct 25, 2012, at 3:31 PM, Chandler Carruth wrote:
> On Thu, Oct 25, 2012 at 3:02 PM, John McCall <rjmccall at apple.com> wrote:
>> 
>> Is it a real optimization in practice, though?  This situation can't be formed
>> or exposed by optimization;  you actually have to have a single function
>> with a reachable implicit return site.  Where is this supposed real code
>> that does this intentionally but cannot use noreturn attributes and
>> unreachable markers?
> 
> Oh, the code *can* use unreachable markers, but it's not clear that we
> should just give up on all of these optimization opportunities if the
> user fails to use them.
> 
> Think about everywhere in LLVM's codebase that has a "covered" switch
> on an enum. Technically there is a branch that falls off the end
> (unless the enum has a power of two number of elements), and without
> the llvm_unreachable we add everywhere, the optimizer preserves this.

This is a fair point;  switch is a widespread pattern that does provoke this
accidentally.
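
For concreteness, here is a minimal sketch of the pattern being discussed.
The names Shape, shapeName, and markUnreachable are hypothetical, and
markUnreachable is only a stand-in for a marker like llvm_unreachable, not
its actual definition:

```cpp
#include <cassert>
#include <cstdlib>
#include <string_view>

// Hypothetical enum, for illustration only.
enum class Shape { Circle, Square, Triangle };

// Hypothetical stand-in for an unreachable marker. It asserts in debug
// builds and otherwise tells the optimizer this point is never reached.
[[noreturn]] inline void markUnreachable() {
  assert(false && "fell off the end of a covered switch");
#if defined(__GNUC__) || defined(__clang__)
  __builtin_unreachable();
#else
  std::abort();
#endif
}

std::string_view shapeName(Shape s) {
  switch (s) {
  case Shape::Circle:   return "circle";
  case Shape::Square:   return "square";
  case Shape::Triangle: return "triangle";
  }
  // Every enumerator is handled, but a value such as static_cast<Shape>(3)
  // would still arrive here. Without the marker, this is the implicit
  // return site at the end of a value-returning function.
  markUnreachable();
}
```

Dropping the final call leaves exactly the reachable implicit return that
the thread is about.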

> We could require users to add such annotations, but "making debugging
> crashes of optimized binaries easier" seems a weak argument. LLVM is
> replete with optimizations which will make code with undefined
> behavior crash in new and surprising ways. Avoiding this one is a drop
> in the ocean.

The problem is that 'unreachable' is a much stronger form of undefined
behavior than pretty much anything else in the system.  We do not generally
optimize sites committing undefined behavior by treating them as unreachable;
we used to do this much more aggressively, and it was absolutely lethal,
because the only way to debug such code is to carefully read the assembly,
discover that large chunks of your function have disappeared, recognize
the transformation, and try to find out why it happened.

In other words, it had to be debugged by compiler writers.  Ahem.

>>> To answer why we need the semantic unreachable to get these
>>> optimization opportunities: in a word, inlining. When inlining
>>> collapses the CFG of a function, having the unreachable hint can be
>>> essential to selecting the proper representation.
>> 
>> Can you expand on this?  How does inlining create this opportunity?
> 
> Using the example above of a switch over an enum, let's imagine that
> after inlining the optimizer proves that the high bit is set in the
> input, and the highest enumerator is that value: the top bit, and all
> zeros. Now, if we have the unreachable, we can prove that there is a
> single path through the CFG. If we don't, we have to assume that the
> value might be *larger* than the largest enumerator.

Okay, so that example is totally fanciful and you should feel bad. :)
But there's a somewhat imaginable situation where our switching function
is called, and the call is in the shadow of a branch on a comparison
against one of the enumerators, and we could greatly simplify the switch
(possibly to nothing) if we can prove that the default case is unreachable.
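
One possible reading of that situation, as a sketch only.  Kind, weight,
weightUnlessA, and markUnreachable are all hypothetical names, and the
comments describe what an optimizer could in principle conclude, not what
any particular compiler is known to do:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical enum, for illustration only.
enum Kind { KindA, KindB, KindC };

// Hypothetical stand-in for an unreachable marker.
[[noreturn]] inline void markUnreachable() {
  assert(false && "fell off the end of a covered switch");
#if defined(__GNUC__) || defined(__clang__)
  __builtin_unreachable();
#else
  std::abort();
#endif
}

// The "switching function": a covered switch over Kind.
inline int weight(Kind k) {
  switch (k) {
  case KindA: return 1;
  case KindB: return 2;
  case KindC: return 4;
  }
  markUnreachable(); // default case declared unreachable
}

int weightUnlessA(Kind k) {
  if (k == KindA)
    return 0;
  // The call sits in the shadow of the comparison above, so k != KindA.
  // After inlining, only KindB and KindC remain live if the default is
  // known to be unreachable; otherwise an out-of-range path must be kept.
  return weight(k);
}
```

With only two enumerators in Kind, the same reasoning would leave a single
live case, which is the "possibly to nothing" outcome.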

So, basically, I don't like the idea of using 'unreachable' in arbitrary
positions here, but if it's practical to do it only after a covered switch, I
could get behind that.

John.