[LLVMdev] nested function's static link gets clobbered
Yale Zhang
yzhang1985 at gmail.com
Sat Nov 1 10:17:51 PDT 2008
I admit I got carried away with trying to use an extra static link when the
arg parameter would've sufficed. Using a static link is probably still the
better idea, because if a function has more than one loop to parallelize, the
loops would share the same parent frame struct but might have separate structs
describing their parameters.
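
For reference, the arg-only alternative could look roughly like the sketch
below, written in plain C rather than LLVM IR. The struct names, the limit
field and run_loop1 are made up for illustration; the loop body just mirrors
the disassembly quoted further down (count up to a limit while incrementing a
second shared variable).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the parent function's frame: pointers to the
   shared variables, like the { i32*, i32* } struct in the IR. */
struct parent_frame {
    int *i;
    int *sum;
};

/* Hypothetical per-loop parameters. Each parallelized loop could have its
   own struct like this while sharing the same parent_frame. */
struct loop1_args {
    struct parent_frame *frame; /* plays the role of the static link */
    int limit;
};

/* Thread body: everything arrives through pthread's single void* argument,
   so no nest parameter (and no r10) is involved. */
static void *loop1(void *arg)
{
    struct loop1_args *a = arg;
    assert(a != NULL && a->frame != NULL);
    struct parent_frame *f = a->frame;
    while (*f->i <= a->limit) {
        ++*f->sum;
        ++*f->i;
    }
    return NULL;
}

/* Runs loop1 on one thread and returns the final sum, or -1 on error. */
int run_loop1(int limit)
{
    int i = 0, sum = 0;
    struct parent_frame frame = { &i, &sum };
    struct loop1_args args = { &frame, limit };
    pthread_t tid;

    if (pthread_create(&tid, NULL, loop1, &args) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return sum;
}
```

The cost of this layout is one extra pointer load per thread body to reach the
shared frame, which is why the static link still looks cleaner to me.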
"you must be the first person to try using nest functions with the JIT :) "
Well, this is a project in a dynamic optimization course. The JIT lacks a
lot of things for this purpose like recompiling, patching old callers to
refer to the new code, and deleting old machine code - currently, it just
overwrites the old code with a branch to the new code and makes no attempt
to patch the callers. We'll probably come up with something more
sophisticated and submit it.
"If you look in X86JITInfo.cpp, in the function
X86JITInfo::emitFunctionStub,
you will see the code generating the stub and using r10"
I didn't expect it to be that easy. I thought I needed to add special rules
to the register allocator. I'll take a look at it.
On Sat, Nov 1, 2008 at 3:54 AM, Duncan Sands <duncan.sands at math.u-psud.fr> wrote:
> Hi,
>
> > I'm parallelizing loops to be called by pthread. The thread body that I pass
> > to pthread_create looks like
> >
> > define i8* @loop1({ i32*, i32* }* nest %parent_frame, i8* %arg)
> > where parent_frame is a pointer to the shared variables in the original function
> >
> > 0x00007f0de11c41f0: mov (%r10),%rax
> > 0x00007f0de11c41f3: cmpl $0x63,(%rax)
> > 0x00007f0de11c41f6: jg 0x7f0de11c420c
> > 0x00007f0de11c41fc: mov 0x8(%r10),%rax
> > 0x00007f0de11c4200: incl (%rax)
> > 0x00007f0de11c4202: mov (%r10),%rax
> > 0x00007f0de11c4205: incl (%rax)
> > 0x00007f0de11c4207: jmpq 0x7f0de11c41f0
> > 0x00007f0de11c420c: xor %rax,%rax
> > 0x00007f0de11c420f: retq
> >
> > I use init_trampoline to generate code that sets up the static link:
> >
> > 0x00007fffee982316: mov $0x7f48e1a08fb0,%r11
> > 0x00007fffee982320: mov $0x7fffee982330,%r10    the static link
> > 0x00007fffee98232a: rex.WB jmpq *%r11
> >
> > The program crashes in loop1 on the 2nd instruction. r10, which should contain
> > the static link, was different from the value set by the trampoline.
> >
> > Upon closer inspection, it looks like the trampoline first jumps to a stub
> > that compiles loop1:
> >
> > 0x00007f48e1a08fb0: mov $0x5c61c0,%r10
> > 0x00007f48e1a08fba: callq *%r10
> > 0x00007f48e1a08fbd: int $0x0
> >
> > But that clobbers r10, which loop1 needs. According to the x86-64 ABI, r10
> > isn't preserved across function calls, but here it needs to be. Is there any
> > way to force LLVM to do that?
>
> you must be the first person to try using nest functions with the JIT :)
> If you look in X86JITInfo.cpp, in the function
> X86JITInfo::emitFunctionStub,
> you will see the code generating the stub and using r10. I think the right
> solution is to change r10 to a different call clobbered register. It would
> also be possible to have the trampoline use a different register, but since
> the x86-64 ABI explicitly states that r10 should be used for the static
> chain,
> I'd rather not.
>
> I'm also wondering about the x86-32 case. There are no comments in the
> JIT stub code in this case, so I'm not sure which register it is using.
> The problem with x86-32 is that there are so few registers, and for some
> calling conventions there is only one spare call clobbered register
> available. This is used by trampolines, so if it's also used by JIT,
> which is almost surely the case, that will cause trouble. Even worse,
> it looks like the JIT is wrong even without trampolines, because for
> the C and X86_StdCall conventions it is ECX that is spare, while for
> X86_FastCall and Fast it is EAX. Yet the JIT always uses the same
> hardwired code, and does not adjust according to the calling convention.
> So presumably it is broken for one of these sets of calling conventions.
>
> Hopefully Anton can comment on this.
>
> > I tried telling lli to compile the entire program
> > (-no-lazy) so that the stub won't be generated, but it gives the error:
> >
> > LLVM JIT requested to do lazy compilation of function
> > '_Z41__static_initialization_and_destruction_0ii' when lazy compiles are
> > disabled!
> >
> > Any ideas?
> >
> > Note, I had to compile lli with -z execstack in order for trampolines on the
> > stack to work.
>
> Maybe lli can be taught to mark itself as having an executable stack when
> it sees a trampoline. I'm not sure how this can best be done. On Linux
> I guess it can be done using mmap.
>
> Ciao,
>
> Duncan.
>