[LLVMdev] Getelementptr woes

Chris Lattner sabre at nondot.org
Thu Jun 17 12:06:02 PDT 2004


On Thu, 17 Jun 2004, Vladimir Prus wrote:

> I'm having problems with the following LLVM instruction
>
>   %tmp.0.i = call int (sbyte*, ...)*
>         %printf( sbyte* getelementptr ([11 x sbyte]*  %.str_1, long 0, ......
>
> The first argument in function call,
>
>     sbyte* getelementptr ([11 x sbyte]*  %.str_1.....
>
> appears to be ConstantExpression*, and my backend does not support
> ConstantExpression yet.

Ok.

> I probably can implement that, and getelementptr instruction too, but I
> wonder if I need to. Looking at X86 backend, I see it does some smart
> things, like folding getelementptr into instruction which uses it, and
> using 'lea' asm instruction to perform addressing. On my target, neither
> of these is possible -- there are just no such fancy addressing modes.

Sure.  Regardless of whether you do fancy folding or not, you should still
handle constant expressions (see below).  Constant expressions exist for a
couple of important reasons and they aren't too horrible to handle
correctly. :)  Also, ConstantExprs come in several different varieties,
not just getelementptr ones.

> So I wonder if I can get away with writing two passes which work on LLVM
> level. The first pass would extract all ConstantExpr* operands from
> instructions and move them into separate instructions. E.g. the example
> above would become:
>
>   %tmp.1 = sbyte* getelementptr ([11 x sbyte]*  %.str_1, ..........
>   %tmp.0.i = call int (sbyte*, ...)*
>         %printf(%tmp.1 , .........

While this is feasible, I really wouldn't recommend it.  Take a look at
how the X86 backend handles this stuff.  For shift instructions, for
example, we have:

void ISel::visitShiftInst(ShiftInst &I) {
  MachineBasicBlock::iterator IP = BB->end();
  emitShiftOperation(BB, IP, I.getOperand(0), I.getOperand(1),
                     I.getOpcode() == Instruction::Shl, I.getType(),
                     getReg(I));
}

Basically all of the code for emitting shift instructions is in
emitShiftOperation, which takes the two Value* operands to shift and a
virtual register (the last argument) that receives the result of the shift.

Because of this pattern, emitShiftOperation is used by the constant
expression support routines (see copyConstantToRegister) to codegen shift
constant expressions as well as shift instructions themselves.
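
To make the pattern concrete, here is a minimal standalone sketch.  It is
not the X86 backend code: Value, Emitter, emitShl, getReg and the textual
"instructions" are toy stand-ins invented for illustration.  The point is
only that one emit helper serves both the instruction visitor and the
constant expression path.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for LLVM's Value hierarchy (hypothetical, for illustration).
struct Value {
  enum Kind { Reg, ConstInt, ConstShl };
  Kind kind = Reg;
  int reg = 0;                 // Reg: virtual register number
  long imm = 0;                // ConstInt: literal value
  const Value *lhs = nullptr;  // ConstShl: value being shifted
  const Value *rhs = nullptr;  // ConstShl: shift amount
};

Value makeReg(int r) { Value v; v.kind = Value::Reg; v.reg = r; return v; }
Value makeInt(long n) { Value v; v.kind = Value::ConstInt; v.imm = n; return v; }
// The operands must outlive the returned expression (pointers are stored).
Value makeShl(const Value &a, const Value &b) {
  Value v; v.kind = Value::ConstShl; v.lhs = &a; v.rhs = &b; return v;
}

struct Emitter {
  std::vector<std::string> code;  // emitted pseudo-instructions
  int nextReg = 100;              // next free virtual register

  int newReg() { return nextReg++; }

  // Shared helper: shifts operand a left by operand b into register dest.
  // Both operands may be registers, constants, or constant expressions.
  void emitShl(const Value &a, const Value &b, int dest) {
    int ra = getReg(a);
    int rb = getReg(b);
    code.push_back("shl r" + std::to_string(dest) + ", r" +
                   std::to_string(ra) + ", r" + std::to_string(rb));
  }

  // Returns a register holding v, materialising constants on demand.
  int getReg(const Value &v) {
    switch (v.kind) {
    case Value::Reg:
      return v.reg;
    case Value::ConstInt: {
      int r = newReg();
      code.push_back("mov r" + std::to_string(r) + ", " +
                     std::to_string(v.imm));
      return r;
    }
    case Value::ConstShl: {
      assert(v.lhs && v.rhs);
      int r = newReg();
      emitShl(*v.lhs, *v.rhs, r);  // same helper as the instruction path
      return r;
    }
    }
    return -1;  // unreachable for well-formed values
  }

  // Instruction path: a shift instruction just forwards to the helper.
  void visitShl(const Value &a, const Value &b, int dest) {
    emitShl(a, b, dest);
  }
};
```

Calling getReg on a value built with makeShl goes through exactly the same
emitShl that visitShl uses, so there is no second copy of the shift logic
to maintain.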

I don't think it is TOO horrible to support constant expressions in
general this way, though the approach can certainly be improved (and will
be in the future).

> The second pass would convert getelementptr into casts and pointer
> arithmetic.

This is something else I would not recommend.  The X86 "simple"
instruction selector is not really simple anymore, so it's not a good
model to follow for how to emit GEP instructions in a straightforward
way.  Try taking a look at the emitGEPOperation method (line 3211) of
revision 175 of X86/InstSelectSimple.cpp though.  This version is from
before the various address optimizations were implemented.  Here's a
direct link:
http://llvm.cs.uiuc.edu/cvsweb/cvsweb.cgi/llvm/lib/Target/X86/InstSelectSimple.cpp?rev=1.175

To expand a getelementptr into code, it basically just walks through the
indices, turning each one into the appropriate scale and add.  It's quite
a bit simpler and cleaner to just implement support for this instead of
hacking it into casts and pointer arithmetic.
:)
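
As a rough sketch of that walk (again a toy model, not the r175 code): the
Type struct and gepOffset below are hypothetical, and the indices are taken
to be constants so the result is a plain number.  With runtime indices a
backend would emit a multiply and an add at each array or pointer step
instead of folding them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy type description (hypothetical, for illustration only).
struct Type {
  enum Kind { Scalar, Array, Struct };
  Kind kind = Scalar;
  std::uint64_t size = 0;              // size of the whole type in bytes
  const Type *element = nullptr;       // Array: element type
  std::vector<const Type *> fields;    // Struct: field types
  std::vector<std::uint64_t> offsets;  // Struct: byte offset of each field
};

// Byte offset that a getelementptr with constant indices adds to its base
// pointer.  pointee is the type the base pointer points to.
std::int64_t gepOffset(const Type &pointee,
                       const std::vector<std::int64_t> &indices) {
  assert(!indices.empty());
  // The first index steps over whole objects of the pointee type.
  std::int64_t offset = indices[0] * static_cast<std::int64_t>(pointee.size);
  const Type *cur = &pointee;
  for (std::size_t i = 1; i < indices.size(); ++i) {
    std::int64_t idx = indices[i];
    if (cur->kind == Type::Struct) {
      // Struct index: add the fixed offset of the selected field.
      assert(idx >= 0 && static_cast<std::size_t>(idx) < cur->fields.size());
      offset += static_cast<std::int64_t>(cur->offsets[idx]);
      cur = cur->fields[idx];
    } else {
      // Array index: scale by the element size, then add.
      assert(cur->kind == Type::Array && cur->element);
      offset += idx * static_cast<std::int64_t>(cur->element->size);
      cur = cur->element;
    }
  }
  return offset;
}
```

For the [11 x sbyte] string in your example, indices (0, 5) come out as
0*11 + 5*1 = 5 bytes past %.str_1.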

Note that getelementptr has been generalized a bit since the code in r175.
We now support pointer and array indices of int/uint/long/ulong type,
where before we just supported long.  Also, the type of the structure
index is uint, where before it was ubyte.

> As a result, my backend would not need to care about either ConstantExpr
> or getelementptr, both of which look scary. The biggest advantage is that
> any target which does not have fancy addressing modes can just use those
> two passes.

I think it is a much better idea to just implement direct support for
these.  It will make your code generator more efficient and more
straightforward.  Future codegen improvements will make it *MUCH* less
painful to add support for these, so the two extra passes have limited
long-term potential.

Hopefully this dispels some of the scariness of constantexpr and
getelementptr. :)  If you have any questions about the above, feel free to
ask of course.

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/