[LLVMdev] JIT allocates global data in function body memory

Tue Jun 30 11:18:54 PDT 2009

On Mon, Jun 29, 2009 at 5:50 PM, Dale Johannesen<dalej at apple.com> wrote:
>
> On Jun 29, 2009, at 5:41 PM PDT, Reid Kleckner wrote:
>
>> So I (think I) found a bug in the JIT:
>> http://llvm.org/bugs/show_bug.cgi?id=4483
>>
>> Basically, globals used by a function are allocated in the same buffer
>> as the first code that uses them.  However, when you free the machine
>> code, you also free the memory holding the global's data.  The address
>> is still in the GlobalValue map, so any other code using that global
>> will access freed memory, which will cause problems as soon as you
>> reallocate that memory for something else.
>>
>> I tracked down the commit that introduced the bug:
>> http://llvm.org/viewvc/llvm-project?view=rev&revision=54442
>>
>> It very nicely explains what it does, but not why it does it, which
>> I'd like to know before I change it.  I couldn't find the author
>> (johannes) on IRC so ssen told me to ask LLVMdev about this behavior.
>
> That's me (and I'm not on IRC because I like messages to be
> archived).  The reason everything needs to go in the same buffer is
> that we're JITting code on one machine, then sending it to another to
> be executed, and references from one buffer to another won't work in
> that environment.  So that model needs to continue to work.  If you
> want to generalize it so other models work as well, go ahead.

So, you're moving code across machines without running any relocations
on it? How can that work? Are you just assuming that everything winds
up at the same addresses? Or is everything PC-relative on your
platform, so all that matters is that globals and the code are in the
same relative positions?

How are you getting the size of the code you need to copy?
MachineCodeInfo didn't exist when you wrote this patch, so I assume
you've written your own JITMemoryManager. Even then, if you JIT more
than one function, and they share any globals, you have to deal with
multiple calls into the MemoryManager and functions that use globals
allocated inside other buffers. You should be able to deal with having
separate calls to allocate global space and allocate code space. You'd
just remember the answers you gave and preserve them when copying to a
new system.
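
As a rough illustration of "remember the answers you gave", here is a minimal sketch of an allocator with separate entry points for code and globals that logs every allocation so the layout can be replayed elsewhere. All names here (RecordingAllocator, allocateCode, allocateGlobal, copyTo) are hypothetical and are not the LLVM JITMemoryManager interface. The sketch also assumes a single slab, with the copy preserving offsets within it, which only helps if references are relative or the slab lands at the same address on the other side.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical names throughout; this is NOT the LLVM JITMemoryManager API.
enum class Kind { Code, Global };

struct Allocation {
  Kind kind;
  std::size_t offset;  // offset from the start of the slab
  std::size_t size;
};

class RecordingAllocator {
public:
  explicit RecordingAllocator(std::size_t capacity) : slab_(capacity) {}

  // Separate entry points for code and globals; both log what they hand out.
  std::uint8_t *allocateCode(std::size_t size) {
    return allocate(Kind::Code, size);
  }
  std::uint8_t *allocateGlobal(std::size_t size) {
    return allocate(Kind::Global, size);
  }

  const std::vector<Allocation> &log() const { return log_; }

  // Replay the log: copy each recorded allocation into `dest` at the same
  // offset, so code and globals keep their relative positions.
  void copyTo(std::vector<std::uint8_t> &dest) const {
    assert(dest.size() >= used_ && "destination too small");
    for (const Allocation &a : log_)
      std::memcpy(dest.data() + a.offset, slab_.data() + a.offset, a.size);
  }

private:
  // Simple bump allocation out of one slab, recording each request.
  std::uint8_t *allocate(Kind kind, std::size_t size) {
    assert(used_ + size <= slab_.size() && "slab exhausted");
    std::uint8_t *p = slab_.data() + used_;
    log_.push_back({kind, used_, size});
    used_ += size;
    return p;
  }

  std::vector<std::uint8_t> slab_;
  std::vector<Allocation> log_;
  std::size_t used_ = 0;
};
```

Because the log distinguishes code from globals, a caller could also choose to free or re-send only the code entries while leaving the global entries alone.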

I'd like freeMachineCodeForFunction to avoid corrupting emitted
globals, and with the current arrangement of information within the
JIT, that means globals and code have to live in different
allocations. I think Reid's suggesting a flag of some sort, with one
setting for "freeMachineCodeForFunction works" and another for
"globals and code are allocated by a single call into the
MemoryManager." I'd like to avoid new knobs if it's possible, so do
you really need that second option? Or do you just need globals to be
allocated by some call into the MemoryManager?
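
To make the "different allocations" point concrete, here is a toy model, not LLVM code: ToyJIT and its members are hypothetical, and only the name freeMachineCodeForFunction is borrowed from the discussion. It assumes each global is owned by its own allocation, so releasing one function's code cannot invalidate an address that other functions still use.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy model only; hypothetical names, not LLVM's JIT classes.
class ToyJIT {
public:
  // Globals get their own allocation, owned independently of any function.
  // Repeated lookups return the same address.
  int *getOrEmitGlobal(const std::string &name, int init) {
    auto it = globals_.find(name);
    if (it == globals_.end())
      it = globals_.emplace(name, std::unique_ptr<int>(new int(init))).first;
    return it->second.get();
  }

  // "Emit" a function: its body lives in a buffer separate from any global.
  void emitFunction(const std::string &name, std::size_t codeSize) {
    assert(codeSize > 0 && "function body must not be empty");
    code_[name] = std::vector<unsigned char>(codeSize, 0x90);
  }

  // Releases only the code buffer; global storage is left untouched.
  void freeMachineCodeForFunction(const std::string &name) {
    code_.erase(name);
  }

  bool hasCode(const std::string &name) const {
    return code_.count(name) != 0;
  }

private:
  std::map<std::string, std::unique_ptr<int>> globals_;
  std::map<std::string, std::vector<unsigned char>> code_;
};
```

In this model a global first referenced by one function stays valid after that function's code is freed, which is the property at stake here. The cost is that globals and code no longer come from a single call into the allocator.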

Thanks!
Jeffrey