[LLVMdev] x86 cogen quality

Wed Apr 21 09:17:02 PDT 2004

On Wed, 21 Apr 2004, Finn S Andersen wrote:

> Hi, I have a question about x86 code quality.
>
> I have run a few benchmarks and compared the
> running time of executables created by LLVM to
> executables created by gcc.
>
> It appears that code generated by LLVM is 1.5x - 3x
> slower than code generated by gcc, for the x86.
>
> For some of the benchmarks the linear scan regalloc
> works. When it does, results are in the 1.0x - 1.5x
> range. Unfortunately, the linear scan allocator breaks
> on most of my code.
>
> Question:
> 1) Do my observations fit your general experience ?

Yes, they do.  I assume you are working with LLVM 1.2?

> I haven't looked into the details of the generated
> x86 code. I have the following observation, though:
>
> When using gcc as a backend (compiling to the 'c' target
> and then recompiling with gcc) results are generally a lot
> better than just using the LLVM->x86 backend. This
> indicates that the performance difference is mostly
> located in the LLVM->x86 backend. Further, for those
> of my codes where the new allocator works, results are
> much better. Whether this is due to the allocator, or
> some interaction between it and cogen, I do not know.

The LLVM 1.2 X86 code quality problems are due to a couple of serious
issues.

1. The default register allocator is a purely local algorithm, which
   cannot hold (e.g.) the counter of a loop in a register across the loop.
   This is *clearly* bad, and switching to the new allocator obviously
   makes a big difference :)
2. Even with the new allocator, we are not able to globally allocate
   floating point registers (yet), due to some interaction with the X86
   floating point stack.  This is just something that needs to be worked
   on, but unfortunately no one has had time to do the work recently.
3. When compiling with the native X86 backend, very little additional
   optimization is performed.  When compiling with the C backend & GCC,
   GCC does its own optimizations that can make a big difference.  For
   example, LLVM 1.2 could only index into arrays with 64-bit integers
   (the getelementptr only accepted a 'long' operand).  This could cause
   huge performance problems on the X86, which the GCC optimizer happily
   stomped out.  (This issue has been fixed in LLVM CVS:
   http://llvm.cs.uiuc.edu/PR309)
4. In LLVM 1.2, several LLVM->LLVM optimizations were doing very obviously
   silly things, and have subsequently been fixed.  See the "1.3" release
   notes for information: http://llvm.cs.uiuc.edu/docs/ReleaseNotes.html
5. One of our goals for LLVM 1.3 is to get one of the scalable pointer
   analyses that I have been working on turned on by default in the
   optimizing linker.  This should have a pretty noticeable performance
   impact.
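
To make issue #1 concrete, here is a rough C analogy (a sketch only, not
LLVM output; both function names are made up for illustration).  The
'volatile' qualifier is used purely to mimic the effect of a local
allocator: the loop counter and accumulator must go through memory on
every use instead of staying in registers across the loop.

```c
#include <stddef.h>

/* Analogy for a purely local allocator: 'volatile' forces i and sum to be
   loaded from and stored back to memory on every access, roughly what
   happens when no value can live in a register across basic blocks. */
long sum_local_style(const int *a, size_t n) {
    volatile size_t i;
    volatile long sum = 0;
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* What a global allocator aims for: i and sum are ordinary locals that
   the compiler is free to keep in registers for the whole loop. */
long sum_global_style(const int *a, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

Both functions compute the same result; comparing the assembly your C
compiler emits for each should give a feel for the extra loads and stores
involved.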

> Currently, I am just playing with LLVM, but the longterm
> plan is to build a new backend for a new machine. It won't
> be register starved as the x86 is.

Of the above, #1 would directly affect your target, #2 is X86 specific, #3
would have affected your target if it's 32-bit or smaller, #4 would have
hurt your target, and #5 will almost certainly help your target.
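
As a hedged illustration of why #3 matters on 32-bit (or smaller)
targets, the sketch below writes the same loop twice in C: once with the
index forced to 64 bits, which mirrors the 'long'-only getelementptr
operand in LLVM 1.2, and once with a 32-bit index.  The function names
are hypothetical.  On a 32-bit machine the 64-bit index typically needs a
register pair and multi-instruction increments and compares, unless an
optimizer narrows it.

```c
#include <stdint.h>

/* Index carried as a 64-bit value, as LLVM 1.2 array indexing required.
   On a 32-bit target this usually costs a register pair for i. */
int32_t sum_index64(const int32_t *a, int32_t n) {
    int32_t sum = 0;
    for (int64_t i = 0; i < (int64_t)n; i++)
        sum += a[i];
    return sum;
}

/* Same loop with a native-width index: one register, one add, one compare. */
int32_t sum_index32(const int32_t *a, int32_t n) {
    int32_t sum = 0;
    for (int32_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

The two functions are semantically identical for any valid n, which is
why an optimizer such as GCC's can rewrite the first form into the
second.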

> Question:
> 2) Is there a similar performance differential between
>     LLVM->sparc and gcc on sparc, or are they much closer
>     because the sparc has more registers and thus should
>     be less dependent on good register allocation ?

I truly have no idea.  I don't use the Sparc target very much, and I don't
know if anyone has looked into the actual performance of it.  One of the
problems is that the LLVM Sparc backend doesn't share much code with the
target-independent code generator, so it's very hard to compare.  Our
long-term goal is to merge the sparc code generator into the
target-independent code paths.

> 3) What is the expected timeframe for the new regalloc to
>     become stable ?

I am hoping/planning for the new allocator to be in LLVM 1.3 as the
default allocator.  From what I understand there is one bug left related
to spill code insertion, but Alkis has been very busy with other projects
(it's nearing the end of the semester already :).  If he doesn't get to
it by 1.3, I will.

>     .. or perhaps I should make a more general
>     question: what is the perceived status in terms of performance
>     for the two compiler backends and for the compiler backend
>     part of the infrastructure ?

At this point we haven't actually spent a lot of time evaluating and
measuring code quality.  In fact, if you notice a piece of code that is not
being optimized or code generated well, please file a bug (with a
suggestion on what the code should have been compiled to).  Generally we
separate optimizations into the categories of LLVM->LLVM or codegen
optimizations, but both are important.

> Finally I think LLVM looks *very* nice and appears to be a substantial
> contribution to the world of open source compiler infrastructure.

Thanks!  If you have any more questions, please feel free to ask.

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/