[llvm-commits] Enable early dup of any small bb
Rafael Avila de Espindola
respindola at mozilla.com
Fri Jun 10 13:05:15 PDT 2011
On 11-06-10 03:46 PM, Bob Wilson wrote:
> We did some experiments of early tail dup when we first created the
> pass and found that it made almost no difference in performance for
> anything except indirect branches. Did you benchmark more than just
> firefox? If you're seeing cases where there is a significant
> benefit, and if there are no regressions in code size, code quality
> or build times, we could consider doing this. I'd like to see
> benchmark results across a fairly wide range of tests and on multiple
> targets before deciding that.
The only two easy tests I have at hand are firefox and clang itself.
Would running the llvm-testsuite be a good measure?
What kind of problems were you having with early dup before? In the
cases where I looked at the assembly, having an early dup with the same limit
as the late one just cleans up the code a bit. It doesn't change a lot
which blocks are duplicated, just when. Same idea for why it is good to
duplicate indirectbr early, just not as dramatic.
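
As a rough illustration of the transformation being discussed (this is not
LLVM's tail duplication pass), here is a toy sketch in Python. The Block
class, the tail_duplicate function, the block names and the limit parameter
are all hypothetical, made up for the example. It copies a small block into
each predecessor that branches to it unconditionally, and skips predecessors
that end with a conditional branch.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Toy basic block: a list of instruction names plus successor names."""
    instrs: list
    succs: list = field(default_factory=list)


def predecessors(cfg, name):
    """Return the names of all blocks that list `name` as a successor."""
    return [p for p, b in cfg.items() if name in b.succs]


def tail_duplicate(cfg, limit, entry="entry"):
    """Duplicate small multi-predecessor blocks into their predecessors.

    Only predecessors whose single successor is the tail block are handled;
    a predecessor with several successors (a conditional branch) is skipped.
    """
    for name in list(cfg):
        if name == entry or name not in cfg:
            continue
        tail = cfg[name]
        preds = predecessors(cfg, name)
        # Respect the size limit, and only bother with shared tails.
        if len(tail.instrs) > limit or len(preds) < 2:
            continue
        if name in tail.succs:
            continue  # do not duplicate self-loops in this sketch
        for p in preds:
            pred = cfg[p]
            if pred.succs == [name]:
                pred.instrs = pred.instrs + tail.instrs
                pred.succs = list(tail.succs)
        # Drop the tail block once nothing branches to it any more.
        if not predecessors(cfg, name):
            del cfg[name]
    return cfg


def example_cfg():
    """Two blocks sharing a one-instruction indirect-branch dispatch block."""
    return {
        "entry": Block(["cond_br"], ["a", "b"]),
        "a": Block(["op_a"], ["dispatch"]),
        "b": Block(["op_b"], ["dispatch"]),
        "dispatch": Block(["indirectbr"], ["x", "y"]),
        "x": Block(["op_x"]),
        "y": Block(["op_y"]),
    }
```

Calling tail_duplicate(example_cfg(), limit=2) leaves "a" and "b" each ending
in their own copy of the indirectbr and removes "dispatch". Running the same
function earlier or later in a pipeline would pick the same blocks, which is
the "not which blocks, just when" point above.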
> In the longer term, I plan to add a separate "indirect branch
> duplication" pass, and I was hoping that could entirely replace the
> early tail-dup pass. The current tail duplication pass is not smart
> enough to do a good job for indirect branches. In order to get it to
> do what we need for indirect branches, we had to crank up the
> duplication limit fairly high, and then it blindly duplicates through
> multiple blocks, in some cases blowing up code size for no good
> reason. There are also cases where it fails because it cannot
> duplicate a block when a predecessor ends with a conditional branch.
> I haven't worked on that new pass for a while, but I wouldn't mind
> getting back to it, especially if you're seeing cases where it's
> needed.
The problems I have found have to do with the register allocator being
unhappy with the output of early tail dup. As a test, I tried reducing
the limit to the very minimum that would duplicate the original blocks
and stop. It helped a bit, but the register allocator was still
confused. Maybe in a case not as insane as jsinterp.o the decision of
what to duplicate is more important.
I guess the most important part is then not so much deciding what to
duplicate, but doing a good job at it. I am still reducing a case where
doing the duplication at clang produces better results than duplicating
just before regalloc (half as many spills).
What were your plans for the indirectbr duplication pass?
> As you saw, tail dup for indirect branches has to happen before reg
> alloc to get good results.
Indeed.
Thanks,
Rafael