[LLVMdev] Missed optimization with indirectbr terminator
rjmccall at apple.com
Fri Jul 8 00:05:07 PDT 2011
On Jul 7, 2011, at 10:15 PM, Carlo Alberto Ferraris wrote:
> Nella citazione giovedì 7 luglio 2011 19:41:16, John McCall ha scritto:
>> On Jul 7, 2011, at 4:33 AM, Carlo Alberto Ferraris wrote:
>>> Il 07/07/2011 11:14, Cameron Zwarich ha scritto:
>>>> I haven't read the code in detail, but it looks like JumpThreading at least attempts to thread across indirect branches. You can either try to fix it or file a bug with your test case.
>>> In the source it says "If the predecessor is an indirect goto, we can't split the edge. <http://llvm.org/docs/doxygen/html/JumpThreading_8cpp_source.html#l00914>" Why would that be the case?
>> Splitting an edge creates a block which executes when leaving a
>> specific block to go to a specific successor. The only way to split an
>> indirect goto is to insert code before the jump which checks for a
>> specific destination address. This is very likely to be a pessimization.
> Do you mean like turning a indirectbr %addr, [%a, %b, ...]
> into a switch %addr, undef, [blockaddr(%a), %a], [blo
> ckaddr(%b), %b], ...
> ? (I know that the switch doesn't accept a pointer as first argument, it's just an example)
That difference is relevant, though, because there's no way to
implement such a switch except as a series of pointer comparisons
(or at least offset comparisons). But I was approaching it from
the perspective of trying to split a single edge, rather than changing
the entire terminator; there's no natural way to do that with an indirect
branch, so you have to do explicit pointer comparisons before the goto.
Doing even one or two of those could easily kill a lot of the benefit of
the indirect branch.
Basically, generic optimizations should not be in the business of
trying to split indirect branch edges, which is why the generic
splitting routines fail on them.
> I'll try to inspect the assembler. Just a quick thought in the mean time, in the snippet I posted, if the backedge pointed directly to %19, other optimizations would likely notice that the loop could be removed entirely and replaced with a single addition. Do you think the code generator is able to do this?
I don't think anyone's taught the loop optimizations to work
over indirect branches, but it's not impossible. This specific
case would be better fixed by teaching some existing pass,
maybe JumpThreading, that an indirectbr on a phi with constant
blockaddr operands can be usefully optimized.
More information about the llvm-dev