[llvm-commits] Enable early dup for small bb, take2
Rafael Avila de Espindola
rafael.espindola at gmail.com
Mon Jun 13 11:16:17 PDT 2011
On 11-06-13 01:29 PM, Bob Wilson wrote:
> It seems to me that this isn't a clear win. It helps some cases but
> hurts others.
In the cases I have looked at, the regression is either down to luck
(different registers, same code) or caused by other passes doing something
silly when given a reduced IL. Take a look at the two logs:
http://people.mozilla.com/~respindola/patch.log.bz2
http://people.mozilla.com/~respindola/trunk.log.bz2
It is actually funny :-)
> As you've seen, updating PHIs for tail duplication is tricky. I'd
> really prefer to avoid that. If we only run the taildup pass after
> regalloc, we can remove all that complexity. Something similar would
> still be needed in the separate indirect branch duplication pass
> (that I'm still working on), but at least we wouldn't have to do it
> in taildup as well.
>
> How important do you think it is to do this? Am I misreading your
> data?
>
I do think it is important. The way I read the data is that there is
useful cleanup that duplicating small blocks can do. Some passes run
afterwards can currently make bad decisions on the new input, but that
is a problem that should be fixed in those passes.
Ideally, the blocks the early pass duplicates are the same ones the
late pass would duplicate. So this is really just cleaning things up earlier.
One thing that was surprising even to me was that clang became a tiny bit
faster. I guess that is because it is passing fewer blocks down the pipeline.
I started looking at this because my old patch (duplicating indirectbr
in clang) shows that having more cleanup happen between the duplication
and the register allocator can help firefox.
Note that the speed improvement in firefox was measured in a full js
benchmark. I can run instruments on it if you are curious about what the
impact was on the JS interpreter only.
As for correctness, I would argue that it is safer to have code that is
executed (and therefore tested) more often. The issues I fixed were found
by increasing the dup size limit to 8 and bootstrapping clang. The bugs
were there and are real; it is just hard to trigger them with an
indirectbr-only pass (as early dup is right now). If someone did hit
them, they would have been much harder to debug.
Cheers,
Rafael