[LLVMdev] llvm register coalescing
maurice.marks at gmail.com
Fri Feb 24 08:18:14 PST 2012
I wanted to update the llvm list on an offline discussion I was having with
Rafael about a problem we have been seeing trying to compile an
"interpreter" type of program with clang. It was producing a huge number of
spills, something we had seen in llvm 2.8, and it seemed to have recurred
in 3.0. Following Rafael's advice we added -disable-early-taildup to llc
and the spills disappeared. It would be nice if we could pass such switches
directly through the clang command line. Is there a way to do that?
Thanks for your help, Rafael!
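
For readers following the thread, the program shape under discussion (lots of shared state, code fragments joined by indirect jumps) can be sketched in C as below. This is a hypothetical toy, not the interpreter from this thread: the opcodes and the names run_shared and run_threaded are invented for illustration, and it relies on the GNU "labels as values" extension (&&label, goto *ptr) that both gcc and clang accept. run_shared funnels every handler through a single indirect jump; run_threaded repeats the indirect jump at the end of each handler, which is roughly the source-level shape that duplicating the dispatch block aims for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy bytecode; none of these names come from the thread. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Shape 1: a single shared dispatch point. Every handler jumps back to
 * `dispatch`, so the function contains exactly one indirect jump. */
long run_shared(const int *code)
{
    /* Handler table indexed by opcode (GNU labels-as-values extension). */
    static void *const handlers[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    long stack[16];   /* shared interpreter state, live across all handlers */
    size_t sp = 0;
    size_t pc = 0;

dispatch:
    goto *handlers[code[pc++]];
do_push:
    assert(sp < 16);
    stack[sp++] = code[pc++];
    goto dispatch;
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    goto dispatch;
do_mul:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] *= stack[sp];
    goto dispatch;
do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
}

/* Shape 2: the dispatch is repeated at the end of every handler, so each
 * handler ends in its own indirect jump. */
long run_threaded(const int *code)
{
    static void *const handlers[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    long stack[16];
    size_t sp = 0;
    size_t pc = 0;

#define NEXT() goto *handlers[code[pc++]]
    NEXT();
do_push:
    assert(sp < 16);
    stack[sp++] = code[pc++];
    NEXT();
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
do_mul:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] *= stack[sp];
    NEXT();
do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
#undef NEXT
}
```

Both functions compute the same result; for the program PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT each returns 20. In either shape, sp, pc and the stack are live across every indirect jump, which is the kind of shared state the register allocator has to place.
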
2012/2/21 Rafael Ávila de Espíndola <rafael.espindola at gmail.com>
> On 21/02/12 11:35 AM, Maurice Marks wrote:
> > Hi Rafael. I have an llvm question for you. I'm a developer on a project
> > that has been using llvm as a Jit for a dynamic binary translator. A
> > while back (llvm 2.8 days) we tried using llvm-gcc on an interpreter.
> > The structure of the interpreter is lots of shared state that is used in
> > code fragments joined by indirect jumps. The code produced by llvm (and
> > the compilation time) was just terrible - many register spills and
> > reloads. Compared to gcc it was more than an order of magnitude worse in
> > both compile and execution time. You fixed a bug in that area and things
> > have been much better since then.
> > Up until now. We just recompiled that code (with clang/llvm 3.0) and it
> > has many of the same problems as before - huge compile times, very poor
> > execution time. I noticed that someone else reported a similar problem
> > with interpreter-like code
> > (http://lists.cs.uiuc.edu/pipermail/llvmbugs/2011-August/019336.html).
> > I regard you as an expert in this area of optimization so I'd like to
> > understand how, in a program structure that has a complex or
> > indeterminate CFG, shared state should be bound to registers. Based on
> > the frequency of reference there is pressure to bind a lot of state to
> > registers, but that would result in overflowing the real registers very
> > quickly, resulting in spills and reloads. But keeping all the state in
> > memory is also not optimal. gcc seems to be able to find a sweet spot,
> > referencing shared state in memory, but in the basic blocks between
> > indirect jumps keeping frequently used state in registers.
> > What exactly was the algorithm fix you applied about a year ago? Could
> > it have become undone in some way?
> > As an ex-compiler developer I would expect you to just ask for a test
> > case, but in our particular situation, and the case of the other person
> > who reported a similar thing, the program has to be large and complex
> > before a bug appears. However the structural features are the same -
> > lots of frequently accessed shared state, indeterminate (due to indirect
> > jumps) control flow graph.
> > I'm interested in any comments or suggestions you have on this topic,
> > and I thank you sincerely for your many contributions to llvm.
> Hi Maurice,
> At the time I was benchmarking firefox builds with clang. In most files
> clang did better, but on the interpreter the result was *really* bad.
> Investigating it, I found some issues:
> * The register coalescer was failing to join many copies. This was
> fixed by 134199, but it is possible there are more cases the coalescer
> is not handling.
> * Tail duplication was done way too late, preventing other optimizations
> from taking advantage of it. I moved it a bit earlier in 134372, but it
> should really be done at the IL level.
> Two things I would suggest trying:
> * Disable tail duplication completely. If the big problem is the
> register allocation doing a bad job, this should make its life easier
> and help you find what improvements the register allocator needs.
> * Try the patch I posted for clang making it duplicate the indirectbr
> from the very start. This will make compile time *really* slow, but
> should show if doing early tail duplication would help in your case.
> It would be really nice if you could post the result to the list :-)
> > regards
> > /maurice marks
Not sent from my Blackberry, Raspberry or Gooseberry!