[llvm-commits] Enable early dup of any small bb

Rafael Avila de Espindola respindola at mozilla.com
Fri Jun 10 13:05:15 PDT 2011


On 11-06-10 03:46 PM, Bob Wilson wrote:
> We did some experiments of early tail dup when we first created the
> pass and found that it made almost no difference in performance for
> anything except indirect branches.  Did you benchmark more than just
> firefox?  If you're seeing cases where there is a significant
> benefit, and if there are no regressions in code size, code quality
> or build times, we could consider doing this.  I'd like to see
> benchmark results across a fairly wide range of tests and on multiple
> targets before deciding that.

The only two easy tests I have at hand are firefox and clang itself.
Would running the llvm-testsuite be a good measure?

What kind of problems were you having with early dup before? In the
cases where I looked at the assembly, having an early dup with the same
limit as the late one just cleans up the code a bit. It doesn't change
much which blocks are duplicated, just when. It is the same idea as why
it is good to duplicate indirectbr early, just not as dramatic.
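
To make the terms concrete, here is a toy model of tail duplication with
a size limit. It is only a sketch: it is not LLVM's tail duplication
pass, and the CFG representation (a dict of block name to
(instructions, successors)) and the name tail_duplicate are made up for
illustration. A block is copied into a predecessor only when it is at
most `limit` instructions long and that predecessor branches to it
unconditionally.

```python
def tail_duplicate(cfg, limit):
    """Toy tail duplication on a dict: block name -> (instrs, succs).

    Hypothetical model for illustration only, not LLVM's implementation.
    """
    # Work on a copy so the caller's CFG is left untouched.
    cfg = {name: (list(instrs), list(succs))
           for name, (instrs, succs) in cfg.items()}
    for tail in list(cfg):
        instrs, succs = cfg[tail]
        preds = [p for p, (_, s) in cfg.items() if tail in s]
        # Skip blocks over the limit, blocks with a single (or no)
        # predecessor, and self-loops.
        if len(instrs) > limit or len(preds) < 2 or tail in succs:
            continue
        for p in preds:
            p_instrs, p_succs = cfg[p]
            # Only handle predecessors ending in an unconditional branch.
            if p_succs == [tail]:
                cfg[p] = (p_instrs + instrs, list(succs))
        # Drop the tail block once nothing branches to it any more.
        if not any(tail in s for _, s in cfg.values()):
            del cfg[tail]
    return cfg


example = {
    "entry": (["cmp"], ["a", "b"]),
    "a": (["x = 1"], ["join"]),
    "b": (["x = 2"], ["join"]),
    "join": (["y = x + 1", "ret y"], []),
}
result = tail_duplicate(example, limit=2)
```

With limit=2 the "join" block is copied into both "a" and "b" and then
removed; with limit=1 nothing changes. In this model, running the same
function with the same limit at two different points in a pipeline
duplicates the same blocks, only at a different time.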

> In the longer term, I plan to add a separate "indirect branch
> duplication" pass, and I was hoping that could entirely replace the
> early tail-dup pass. The current tail duplication pass is not smart
> enough to do a good job for indirect branches.  In order to get it to
> do what we need for indirect branches, we had to crank up the
> duplication limit fairly high, and then it blindly duplicates through
> multiple blocks, in some cases blowing up code size for no good
> reason.  There are also cases where it fails because it cannot
> duplicate a block when a predecessor ends with a conditional branch.
> I haven't worked on that new pass for a while, but I wouldn't mind
> getting back to it, especially if you're seeing cases where it's
> needed.

The problems I have found have to do with the register allocator being 
unhappy with the output of early tail dup. As a test, I tried reducing 
the limit to the very minimum that would duplicate the original blocks 
and stop. It helped a bit, but the register allocator was still 
confused. Maybe in a case not as insane as jsinterp.o the decision of 
what to duplicate is more important.

I guess the most important part is then not so much deciding what to 
duplicate, but doing a good job at it. I am still reducing a case where 
doing the duplication at clang produces better results than duplicating 
just before regalloc (half as many spills).

What were your plans for the indirectbr duplication pass?

> As you saw, tail dup for indirect branches has to happen before reg
> alloc to get good results.

Indeed.

Thanks,
Rafael


