[LLVMdev] dynamic typing system

Mon Aug 16 14:55:01 PDT 2010

If your data needs to outlive the function it was allocated in, you
need to use heap allocation: either malloc or a custom allocator/GC.
For now, just call the allocator as an ordinary C function; some
allocators can later be inlined directly into your generated code.
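
As a rough illustration of "call it as a C function", here is a minimal
sketch of a runtime helper that generated code could declare as an external
function and call. The names rt_alloc, rt_box_int and BoxedInt are
hypothetical and not part of LLVM; a real runtime would likely hand memory
to a GC instead of relying on free().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime entry point. Generated code would declare it as an
   external function and emit an ordinary call to it. */
void *rt_alloc(size_t size) {
    assert(size > 0);
    void *p = malloc(size);
    if (p == NULL) abort();   /* sketch only: no recovery on out-of-memory */
    memset(p, 0, size);       /* hand back zeroed memory */
    return p;
}

/* Hypothetical boxed integer that lives on the heap. */
typedef struct {
    long value;
} BoxedInt;

/* The returned box stays valid after this function returns, unlike a
   value placed on the stack with alloca. */
BoxedInt *rt_box_int(long value) {
    BoxedInt *box = rt_alloc(sizeof *box);
    box->value = value;
    return box;
}
```

A top-level REPL expression could then store the result of rt_box_int(5) in
a global and still read it after the anonymous function has returned.
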
To calculate the allocation size, either use TargetData or use the GEP
trick (http://nondot.org/sabre/LLVMNotes/SizeOf-OffsetOf-VariableSizedStructs.txt).
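
For intuition, the quantities involved are the ones C exposes as sizeof and
offsetof. The sketch below computes the allocation size of a variable-sized
object in C; it is only an analogy for what the IR-level computation
yields, and LongArray and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical variable-sized object: a length followed by inline items. */
typedef struct {
    size_t length;
    long   items[];   /* C99 flexible array member */
} LongArray;

/* Bytes needed for an array with n items: the offset of the first item
   plus n times the item size. */
size_t long_array_size(size_t n) {
    return offsetof(LongArray, items) + n * sizeof(long);
}

/* Allocate an array with n zeroed items on the heap. */
LongArray *long_array_new(size_t n) {
    LongArray *a = malloc(long_array_size(n));
    if (a == NULL) abort();   /* sketch only: no recovery on out-of-memory */
    a->length = n;
    for (size_t i = 0; i < n; i++) {
        a->items[i] = 0;
    }
    return a;
}
```
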
Note that you don't gain much by declaring the type field in your
struct as i8: the pointer that follows it will be aligned to 4 or 8
bytes, which leaves a hole of padding after the i8. The type is usually
stored as a pointer to some type info or method table. Dynamically
typed languages usually put it in the first word of every object and
use "plain" pointers to objects. Sometimes "fat" pointers (a pair of a
pointer to the type and a pointer to the data) are used instead.
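
The three layouts can be sketched in C as follows. All type and function
names here (TaggedBox, TypeInfo, ObjectHeader, IntObject, FatPointer,
type_of) are hypothetical, and the padding comments assume a typical ABI
where pointers are aligned to more than one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The layout from the question: a one-byte tag followed by a pointer.
   On typical ABIs the compiler pads after the tag so the pointer is
   aligned, so the struct is larger than 1 + sizeof(void *) bytes. */
typedef struct {
    uint8_t tag;
    void   *data;
} TaggedBox;

/* Hypothetical type descriptor; a real one might hold a method table. */
typedef struct TypeInfo {
    const char *name;
} TypeInfo;

/* Header-word layout: every object starts with a pointer to its type. */
typedef struct {
    const TypeInfo *type;
} ObjectHeader;

typedef struct {
    ObjectHeader header;   /* first word of the object */
    long         value;
} IntObject;

/* Fat pointer: the type travels next to the data pointer instead of
   being stored inside the object. */
typedef struct {
    const TypeInfo *type;
    void           *data;
} FatPointer;

static const TypeInfo int_type = { "int" };

/* With the header-word layout, a plain object pointer is enough to
   recover the type: read the first word. */
const TypeInfo *type_of(const void *object) {
    return ((const ObjectHeader *)object)->type;
}
```

With the header-word layout a single plain pointer identifies both the
object and its type, while a fat pointer costs two words per reference but
lets the data itself stay untagged.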

Eugene

On Mon, Aug 16, 2010 at 9:26 PM, Alec Benzer <alecbenzer at gmail.com> wrote:
> This isn't a strictly LLVM-related problem, but I thought I'd ask anyway to
> see if anyone can help.
> I'm trying to write a dynamically typed language on top of LLVM. My initial
> idea was to have a general object type for all objects in my language. I
> came up with:
> { i8, i8* }
> the first element of the structure would hold the type of the object, and
> the second is a pointer to the actual data.
> Now, I'm not exactly sure how to get my data allocated somewhere in order to
> be able to get a pointer to it. My initial thought was heap allocation.
> There don't seem to be any LLVM instructions that perform heap
> allocation, so I imagine you just use C's malloc and free? If you do
> use those, however, is there a way of getting the byte size of a type, to
> know what to pass to malloc? There's also the issue of knowing when
> it is safe to free() the pointers.
> The other option, I guess, would be stack allocation with alloca
> instructions? Then I don't need to worry about the sizes of types or about
> calling free, but my objects can't live on past the scope of a function,
> which may complicate things. For instance, if at my JIT-ing REPL (set up like
> the Kaleidoscope tutorial, where top-level expressions are wrapped in
> lambdas and then executed), I type in "5", the REPL should spit 5 back to
> me. If I use allocas here there isn't a problem. But if I define a global
> variable and assign 5 to it, the data I alloca'd is going to be gone after
> the anonymous function returns. This makes it seem like heap allocations
> would be a better choice.
> So basically, I'm sort of stuck not knowing the best way to implement this
> (or which way will even be possible). I'd appreciate any input/guidance on
> how to proceed.