[LLVMdev] Area for improvement

Tue Feb 22 08:29:04 PST 2005

On Tue, 22 Feb 2005, Vikram S. Adve wrote:

>> The second issue is that we need to do redundancy elimination and hoisting 
>> on operations that are only exposed once instruction selection is 
>> performed.  Getelementptr expansion is just ONE particular case of this, 
>> but it can happen with any instructions (including the use of large integer 
>> (or any FP) constants on RISC machines, addressing globals with PIC code, 
>> handling extended/truncated operations (for RISC machines with a single 
>> integer size), etc.  Note that hacks like "LowerMultiDimRefs" and the sparc 
>> Preselection pass sorta-kinda work around SOME of this,
>
> Actually, they are capable of working around much of this, and most such 
> optimizations are actually quite architecture-independent, e.g., 
> LowerMultiDimRefs, which is a major optimization for array-intensive 
> languages.

I obviously agree that high-quality codegen of getelementptr instructions 
is very important!  I have two primary problems with LowerMultiDimRefs:

1. We don't want the code generator munching on the LLVM code as it
    generates code.  The code generator should only read the LLVM code.
    Right now if we do thinks like a "quick JIT" followed by a "good code
    JIT" (when we find that a piece of code is hot), the second JIT gets
    different LLVM code than the first one had as input.  We are still not
    quite to the point where this is the case, but we are getting there.
2. The way LowerMultiDimRefs works is by decimating GEP instructions into
    pieces, then relying on GCSE and LICM to clean them up.  The problem
    with this is that GCSE and LICM are not the trivial passes that you'd
    like them to be for a simple cleanup job like this.  In particular,
    GCSE requires a value numbering implementation, LICM requires
    loop info and alias analysis.

Finally, while the *code* for MultiDimRefs isn't target specific, the 
decisions it makes *are*.  As a trivial example, it should decompose the 
following GEP on most risc machines:

   t2 = GEP P, 1000000, X
   load t2

Assuming the immediate doesn't fit into the immediate field of an
instruction, this GEP will lower to multiple instructions, hypothetically 
like this (assuming X indexes into a byte array):

   t1 = P + 900000
   load t1 + X + 100000

In this case, the addition of 90000 could be loop invariant, a common 
subexpression, etc.

> I don't claim such choices are the right long-term solution but the right 
> long-term solution is exactly what you're implying below - a new optimization 
> framework within the back-end, working on it's own representation, etc.  This 
> has taken 4 years and is still under development.  This is why what you call 
> "hacks" are sometimes the right choice in the short term.

I can appreciate that the LowerMultiDimRefs pass made a *lot* of sense 4 
years ago, but I'm concerned about moving forward.  Now we have more of 
the infrastructure that is needed to move beyond it.  Also, the new 
instruction selection framework is only a couple months old, not 4 years. 
:)

-Chris

-- 
http://nondot.org/sabre/
http://llvm.cs.uiuc.edu/