[LLVMdev] Two important changes to the getelementptr instruction

Chris Lattner sabre at nondot.org
Mon Apr 5 00:03:02 PDT 2004


On Sun, 4 Apr 2004, Reid Spencer wrote:

> Congrats on getting this taken care of finally.  I know its something
> you've wanted to do since 1.0.

It certainly has been on the grand TODO list for a long time :)

> I have one question.  How does LLVM disambiguate between a uint used for
> a structure and a uint used for an array?

It depends on the operand number of the getelementptr instruction.  For
example:

%P2 = getelementptr {int}* %P, uint 4

In the above example, we know that the uint is indexing "through" the
pointer in the getelementptr instruction, so it is not an index into the
structure.  In this case:

%P3 = getelementptr {int}* %P, uint 4, uint 0

The first index is just like the previous case, but the 'uint 0' is now
indexing through the structure, so the number is a field number.  The big
distinction is that you can index with variables in arrays and pointers,
but only constant integers in structures.

> My assumption is that LLVM is aware of the type of the thing being
> indexed all the way through the dereference so it doesn't really matter
> what index type is being used.

Very true.

> If that's the case, then why restrict structure indexes to uint and not
> make it "any integer type"? This might further reduce casting and
> generalizes the programming model.

The idea is that structure indices must be *constants*.  As such, it
doesn't really matter what type we make them, as long as there is enough
"addressing space" to represent enough fields.  Allowing structure fields
to be "any type" is possible, but wouldn't give us anything: uint 7 means
the same thing as sbyte 7.

The disadvantage of allowing any integer type for the structure index is
that we have to encode the integer type used in the bytecode file, which
takes space.  Right now the type of the index is implicitly encoded into
the bytecode file based on what operand of the getelementptr it is.
Another advantage is that clients of the LLVM IR can assume that all
structure indexes are instances of ConstantUInt, which simplifies
construction and modification of the IR a bit.

-Chris

> On Sun, 2004-04-04 at 19:00, Chris Lattner wrote:
> > Hi all,
> >
> > I just checked in a series of patches that makes some substantial changes
> > to the LLVM getelementptr instruction.  In particular, in LLVM 1.2 and
> > earlier, the getelementptr instruction required index operands for
> > structure types to be 'ubyte' constants and index operands for sequential
> > types to be 'long' values.  This had several problems, most notably that
> > it was impossible to represent a structure with more than 256 fields, and
> > performance for 32-bit targets suffered.
> >
> > Now in CVS, structure indices are required to be 'uint' constants, which
> > permits structures with up to 2^32 elements.  Sequential type indices are
> > allowed to be any int, uint, long, and ulong values.  This is documented
> > in the LLVM language reference here:
> > http://llvm.cs.uiuc.edu/docs/LangRef.html#i_getelementptr
> >
> > This change required modifying the encoding of getelementptr instruction
> > in bytecode files, so LLVM 1.3 bytecode files won't be readable by LLVM
> > 1.2 and earlier.  LLVM .ll files and .bc files from earlier releases, as
> > usual, will work fine with the newer version.  In particular, we will
> > always support .ll files that use ubyte indexes for structure types, as
> > the parser auto-upgrades them to uint indexes.
> >
> > This change fixes the following bugs:
> > http://llvm.cs.uiuc.edu/PR82
> > http://llvm.cs.uiuc.edu/PR309
> >
> > It also results in cleaner and simpler LLVM code, allowing many integer
> > casts that were present for array indexes to go away.  Though the encoding
> > of getelementptr instructions is slightly less efficient than before
> > (because we have to store what type the various sequential type operands
> > are), the reduction in cast instructions actually results in a bytecode
> > size reduction of about 1%.
> >
> > If you have any questions about this change, please let me know,
> >
> > -Chris
>
>
> _______________________
> Reid Spencer
> President & CTO
> eXtensible Systems, Inc.
> rspencer at x10sys.com
>

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/




More information about the llvm-dev mailing list