[LLVMdev] Question: Bytecode Representation of Type Definitions Table

Chris Lattner sabre at nondot.org
Tue Aug 26 16:04:02 PDT 2003


> As far as I can figure, the type definition table itself starts back at
> 0x0e and I'm thinking that's because the label is the last thing that
> wouldn't have to be only part of a derived type.

Exactly right.  The types starting with the function type never appear
explicitly in the table; they don't occupy a "slot".  Derived types are only
used to build concrete types from other things.  :)

> But it still seems to make some of the low entries in the table
> ambiguous (at least to me!).  I compiled a nice little hello world
> program into LLVM and then into bytecodes (see complete results
> attached).  Here is the start of the type definition table:

Ok.

> Entry 0x0e: Pointer to type 0x0f
> 0000001a  11 0f

Yes, since type 0x0F is '[14 x sbyte]', this is '[14 x sbyte]*'.  Forward
references are required for things like recursive types.

> Entry 0x0f: Array of SByte [14] (presumably for "Hello World!\n" constant)
> 0000001c  10 03 0e

Yup.

> Entry 0x10: Pointer to type 0x12
> 0000001f  11 12

Yup: 'sbyte* (uint)*'

> Entry 0x11: Pointer to SByte
> 00000021  11 03

'sbyte*'

> Entry 0x12: Function returning Pointer ( UInt )
> 00000023  0e 11 01 06

'sbyte* (uint)'

> Okay, so looking at entry 0x10: is it a pointer to Opaque or a pointer to a
> function returning Pointer ( UInt )?  I'm guessing the latter.  Similarly,
> entry 0x0e could be a pointer to Struct or a pointer to Array of SByte
> [14].  Again I'm guessing the latter.

You're right.  The parsing algorithm goes like this:

Read a byte.  This defines the 'typeid' to use for the type.  This is
one of the values from the Type.h file, including things like
structure, pointer, opaque, function, ... as well as the primitive
types.

If it's a derived type, extra information is read indicating what the
parameter types are for functions, what the pointee of a pointer is,
etc.  These operands are type #'s (slots in the type table), not typeid
values from Type.h, so the 0x12 after a pointer typeid means "the type in
slot 0x12", not "opaque".  You cannot refer to a "generic" structure or
function or anything like that.  Forward references are allowed.
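
As a rough sketch of the steps above (this is not the actual reader code),
here is the same walk over the five entries from your dump.  It assumes
every operand fits in a single byte, as it happens to in this dump, and it
takes the typeid values from the dump itself (function = 0x0e through
opaque = 0x12, sbyte = 0x03, uint = 0x06).  The names parse_type_table and
render are made up for illustration, and structure types are left out.

```python
# Typeid values as implied by the dump in this thread (assumption).
FUNCTION, STRUCT, ARRAY, POINTER, OPAQUE = 0x0E, 0x0F, 0x10, 0x11, 0x12
FIRST_SLOT = 0x0E                       # first slot of the type table
PRIMITIVES = {0x03: "sbyte", 0x06: "uint"}


def parse_type_table(data, count):
    """Parse `count` entries; return {slot: description tuple}."""
    pos = 0
    table = {}
    for slot in range(FIRST_SLOT, FIRST_SLOT + count):
        type_id = data[pos]             # step 1: read the typeid byte
        pos += 1
        if type_id == POINTER:          # operand: pointee type #
            table[slot] = ("pointer", data[pos])
            pos += 1
        elif type_id == ARRAY:          # operands: element type #, length
            table[slot] = ("array", data[pos], data[pos + 1])
            pos += 2
        elif type_id == FUNCTION:       # operands: return type #, N, N params
            ret, nparams = data[pos], data[pos + 1]
            params = tuple(data[pos + 2:pos + 2 + nparams])
            table[slot] = ("function", ret, params)
            pos += 2 + nparams
        elif type_id == OPAQUE:         # no operands
            table[slot] = ("opaque",)
        else:
            raise ValueError("typeid 0x%02x not handled in this sketch" % type_id)
    return table


def render(table, type_num):
    """Print a type # as text.  Operands are looked up as type #'s, so a
    forward reference is fine.  (A recursive type would loop forever here.)"""
    if type_num in PRIMITIVES:
        return PRIMITIVES[type_num]
    entry = table[type_num]
    if entry[0] == "pointer":
        return render(table, entry[1]) + "*"
    if entry[0] == "array":
        return "[%d x %s]" % (entry[2], render(table, entry[1]))
    if entry[0] == "function":
        args = ", ".join(render(table, p) for p in entry[2])
        return "%s (%s)" % (render(table, entry[1]), args)
    return "opaque"


# The bytes quoted above, entries 0x0e through 0x12.
DUMP = bytes.fromhex("110f 10030e 1112 1103 0e110106")
TABLE = parse_type_table(DUMP, 5)
for slot in sorted(TABLE):
    print("0x%02x: %s" % (slot, render(TABLE, slot)))
```

Note that the table is fully parsed before anything is rendered, which is
why entry 0x0e can name entry 0x0f before it has been read.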

> I'm worried this low table stuff isn't unambiguous in all cases, but
> then again I'm a nervous guy.  If you could set my mind at ease with
> regard to the lack of ambiguity that would be great.

It seems to work so far.  :)  It should be unambiguous; we haven't had any
problems.

> And what's with this Opaque type anyway?  It's in the enum but I haven't
> found an instance of its use, unless of course it's used in entry
> 0x10.  The whole missing Opaque thing makes me nervous too.  It seems like
> it was just put there to be unclear.  :-)

Opaque type is for a type that does not have a definition yet.  In C, for
example, if you say 'struct foo;' and never provide the body, you get an
llvm type like:

%struct.foo = opaque;

Allowing you to build definitions like '%struct.foo*', etc.  Later, when
the type is resolved in the linking phase, all of these types are updated
to have their "true" values.
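
A toy model of the idea, in case it helps.  This is only an illustration
and not how LLVM implements it: the class names and the refine_to method
are invented here, and the placeholder simply forwards to the real type
once it is known.

```python
class OpaqueType:
    """Placeholder for a type whose body is not known yet (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.resolved = None            # filled in once the body is known

    def refine_to(self, concrete):
        # Everything that already points at this placeholder now sees
        # the concrete type.
        self.resolved = concrete

    def __str__(self):
        return "opaque" if self.resolved is None else str(self.resolved)


class StructType:
    def __init__(self, elements):
        self.elements = list(elements)

    def __str__(self):
        return "{ " + ", ".join(str(e) for e in self.elements) + " }"


class PointerType:
    def __init__(self, pointee):
        self.pointee = pointee

    def __str__(self):
        return str(self.pointee) + "*"


foo = OpaqueType("struct.foo")          # 'struct foo;' with no body
foo_ptr = PointerType(foo)              # '%struct.foo*' can be built already
before = str(foo_ptr)

foo.refine_to(StructType(["int", "sbyte*"]))   # body shows up later
after = str(foo_ptr)
print(before, "->", after)
```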

> But seriously, is it used for anything now?  Will it start to get used
> sometime?

It is used extensively for a lot of things, including the "forward
referencing" of types in the bytecode and asm files...

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/



