[llvm-commits] [patch] Add a pass for duplicating indirectbr at the IL level

Bob Wilson bob.wilson at apple.com
Fri Jul 22 21:34:03 PDT 2011


On Jul 22, 2011, at 8:25 AM, Rafael Ávila de Espíndola wrote:

> On 11-07-22 2:44 AM, Bob Wilson wrote:
>> I've also been working on this problem.  I've resurrected the work I
>> had started last year to have a smarter strategy for duplicating
>> indirect branches.  I don't have anything ready to commit yet, but
>> it's coming along (slowly).  I'm trying to solve a broader set of
>> problems than you are tackling with this patch.  For example, to pick
>> just one, your new pass can still only duplicate past unconditional
>> branches.  That's not good enough for something like JavaScriptCore.
>> You also have the same problems as the existing pass with excessive
>> duplication.  But, since it might take me a while to finish my work,
>> I won't have any objections to taking something like this patch if it
>> doesn't conflict with where I think we should be heading in the
>> longer term.
> 
> I agree that the patch is not the final state. Hopefully it is going
> in the right direction, as for example the logic for updating ssa
> info can be reused. In fact, my plan with the patch was to get the
> smallest usable change and move from there. Some of my ideas for
> it are
> 
> * Add support for duplicating any small (1 instruction?) bb. This
> would make the old IL taildup pass fully redundant.
> 
> * Add support for duplicating "simple" bb to conditional branches.
> With pr10254 fixed it should make the MI level taildup fully redundant.

I'm all in favor of incremental development, but I don't know whether you're going in the right direction or not.

Your patch is for an "indirect branch duplication" pass, but your comments above suggest that you're planning to expand it to do much more general IL-level tail duplication.  Which is it?  And why?  I don't think it makes sense to have 3 separate taildup passes that are all doing essentially the same thing at different points in the compilation process.

> 
> BTW, what is your plan for duplicating into a conditional branch? I
> would be interested in implementing that too…

The problem I'm working on is specifically for indirect branches.  We need to essentially undo the front-end's transformation to use a single indirect branch per function.  (I'm actually including branches through jump tables here.  Those don't get the same treatment by the front-end, but the same motivation applies for duplicating them in some cases.)  The current taildup approach is very naive.  It cannot handle important cases like duplicating past conditional branches, and in other cases, it duplicates far more than necessary, which bloats code size and can hurt i-cache locality, etc.  The approach I'm taking it is to use dominance information to analyze the CFG and find regions of code to duplicate in selected locations.  You would not want to do this for anything except indirect branches.

> 
>> It looks to me like you've basically taken the existing MI-level
>> pass, moved it to the IL-level and added a special-case check for
>> whether the indirect branch address is coming from a load.  What if
>> it's not a load?
> 
> Take the "shouldDuplicate" logic as experimental. That worked for the
> case I had. The idea is that it was something that duplicating could
> simplify (phi of load of a phi) by duplicating.
> 
> I still have to test it a bit more.

Fine, but unless you can explain a bit more where you think this should go in the longer term, this looks to me like a fairly ad-hoc solution for one particular case.  I don't see any overlap between your new pass and the one I'm working on, and I can't imagine we want to have both of them.

> 
>> I'm assuming that you're doing this to expose the
>> load in the selection DAG, right?  An alternative might be a
>> target-specific peephole optimization to merge the load with the
>> indirect branch at the MI level?
> 
> That could work, but it would have to happen before the register
> allocator for best results. I did this at the IL level because of
> the nice results I got with my old patch "duplicating" (i.e. not
> merging) in clang.

I agree that a target-specific peephole is not the most elegant solution, and I'm sure your new pass nicely solves the specific problem you're looking at.  I just don't think we want to have 4 separate taildup/indirect-branch-dup passes (the 2 existing taildup passes, plus yours, plus mine).

> 
>> One big concern I have is that this patch would be forcing us to have
>> two separate implementations of the SSA updates, one here at the IL
>> level and another in the existing MI level early tail dup pass.
> 
> My hope is to make the one on MI redundant.

OK, that would make things more manageable.  But, there are some good reasons why we've been doing taildup at the MI level.

1) At the IL level it's hard to tell how many instructions you're duplicating.  Some IL instructions expand to large numbers of MI instructions, and it's different for each target.

2) There are some IL operations that don't duplicate very well.  For example, branches through jump tables where the table is located in a PC-relative way.  When duplicating at the MI level, you can check and see that an instruction is not duplicable, but you can't tell at the IL level.  The jump table case is one that I actually care about, and the solution (if it's really important to duplicate the jump) is to use a jump table that can be shared.  But, if you start duplicating jump tables at the IL level, you can end up with some really awful blowups.

I wouldn't be opposed to moving my new pass to work at the IL level.  I'm pretty sure it would subsume your patch, at least in its current state, except that I don't know how to handle issue (2) above.  I wouldn't want to combine it with any more general taildup, either.

> 
> > As
>> you should remember, I was pretty hesitant to go along with your
>> recent changes to the early tail dup pass, because I was hoping to
>> remove it entirely, thereby removing all the SSA complexity from tail
>> duplication.
> 
> I hope to remove the one in CodeGen. I disagree that the complexity
> involved is particularly bad. The IL version is simpler and if we can fix one of the "FIXME" in SSAUpdater itself, it will be
> down to one corner case (adding an available value when we don't duplicate), which is on the same level of things like simplifycfg.
> 
> >  But, since you showed that there were some significant
>> benefits, I agreed that it was the right thing to do, with the hope
>> that we could share the SSA updating code between early tail dup and
>> the smarter indirect branch dup pass that I'm working on.  I suppose
>> we might be able to templatize that SSA updating code in the same way
>> that the SSAUpdater is templatized to work at both levels, but that's
>> a big hassle all around and I'd really like to avoid it.
> 
> Yes, that is my plan B: do the same that was done for SSAUpdate and
> create IL and MI traits and move the high level logic to a
> template.
> 
> Thanks,
> Rafael





More information about the llvm-commits mailing list