[LLVMdev] Execution of generated code after multiple compiles fails

Chris Lattner sabre at nondot.org
Tue Apr 17 17:38:30 PDT 2007


On Tue, 17 Apr 2007, Chuck Rose III wrote:
> I'm having a tricky time diagnosing something that is going on in my
> program and am hoping some of you might have used LLVM in a similar way
> before.  All of this is using LLVM 1.9 on Mac OSX.  Here is our usage
> pattern:

Ok.  A lot of changes have occurred since LLVM 1.9, but I'll try to help 
:)

> 1.	Read in a program in a language we are designing
> 2.	Transform it into LLVM IR using the llvm class hierarchy
> 3.	Link this module to a set of support functions written offline
> in C and compiled using the llvm compiler into LLVM IR bytecode
> 4.	Construct a JIT execution environment for X86
> 5.	Run the function

Sounds good.

> 6.	Destroy the execution environment, module, linker, etc.  Note
> this doesn't destroy all LLVM state as there are some things static to
> classes such as the machine code emitters.

Right.  In llvm mainline, you can use the llvm_shutdown() function to 
destroy all this state (it will auto resurrect itself if you start using 
it again).  I don't think it existed fully in 1.9.
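
If it helps, the shutdown call looks roughly like this (a sketch against
mainline, where llvm_shutdown() is declared in
llvm/Support/ManagedStatic.h; the header location and exact behavior under
1.9 are an assumption on my part):

#include "llvm/Support/ManagedStatic.h"

void runOneIteration() {
  // ... build the Module, create the ExecutionEngine, run the function,
  // then delete the ExecutionEngine/Module as you do today ...

  // Tear down the remaining static state (pass registries, target
  // machinery, etc.).  It is recreated lazily if LLVM is used again
  // later in the same process.
  llvm::llvm_shutdown();
}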

> Under the hood, the JIT callbacks are occurring and are doing the native
> compilation on the necessary support functions from the read-in
> bytecode.

Okay.  Another design point, if you're interested, is to not compile your 
runtime code to LLVM at all.  Instead, you could just compile it to 
native code and link it into your app.  Given this, you have two choices:

1. You can let the JIT autoresolve it in your address space.  It defaults
    to calling dlsym on the current process to find symbols it doesn't know
    about.
2. You can manually populate the mapping of LLVM function prototypes to
    native functions with the ExecutionEngine::addGlobalMapping method.

This can be good if you're looking to reduce memory use and JIT time, but 
it won't work if your goal is to do things at runtime with these routines, 
like inline and optimize them into the code that is calling them.  :)
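
If you go with #2, it looks roughly like this (just a sketch; the helper
name here is a placeholder for whatever your support library really
exports):

#include "llvm/ExecutionEngine/ExecutionEngine.h"

// Native implementation, linked into your app the normal way.
extern "C" double my_runtime_helper(double x) { return x * 2.0; }

// MP is the ModuleProvider wrapping your module; F is the
// llvm::Function* that declares the helper there (same name and
// signature, but no body).
ExecutionEngine *EE = ExecutionEngine::create(MP);
EE->addGlobalMapping(F, (void*)&my_runtime_helper);

// Calls through the declaration now go straight to the native code
// instead of bouncing through a JIT compilation callback.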

> The first few times this process is done, everything works
> fine.  Repeating this process for the same program and support-function
> combination resulted in crashing on the second iteration.

Hrm, that's not good.

> Stepping through the program got me as far as having it
> 1.	Run through the whole process as described above with correct
> results
> 2.	In iteration #2, compile the primary function I want to run
> 3.	Compiling the stub function to turn my primary function into a
> null-ary function
> 4.	Getting to the X86 JIT callback
> 5.	Compiling the first support function via JITter
> 6.	Entry into the first support function
> 7.	Crashing somewhere before getting back to the next callback.
> Specifically we get a "Program received signal: 'EXC_BAD_ACCESS'"
> message and then lose all stack info.  It won't break on the offending
> instruction, only after the game is already lost.

Do you know if it is running out of code space to JIT into?  Are you 
building with assertions on?

Another thing to try:  if you are using lli, you can pass "lli 
-debug-only=jit <other args>" which will dump out information about the 
JIT as it runs (you can also pass -debug, to get tons of info).

You probably aren't using lli, and probably don't expose the llvm command 
line arguments through your app.  However, you can still access them in a 
few different ways.  One way is to call something like:

cl::ParseEnvironmentOptions("myappname", "LLVM_DEBUG_OPTIONS");

and then do "setenv LLVM_DEBUG_OPTIONS -debug-only=jit"

You can also cons up a fixed array of options and pass it to 
cl::ParseCommandLineOptions.  This is what llvm-gcc does, for example.
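
For example, something like this (just a sketch; pass whatever options you
actually want):

#include "llvm/Support/CommandLine.h"

static const char *FakeArgv[] = { "myapp", "-debug-only=jit" };

// Pretend these came from the command line.  The const_cast is only
// there because older versions of ParseCommandLineOptions take a plain
// char**.
cl::ParseCommandLineOptions(2, const_cast<char**>(FakeArgv));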

Another useful option is "-print-machineinstrs" which dumps out code that 
is being constructed.  You can then disassemble the generated code in GDB 
and see if it matches (i.e. if you suspect a JIT encoding bug).
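
On the GDB side, working on raw addresses is usually enough even when the 
IDE gives up: "info registers" for the register state, "x/20i $pc" to 
disassemble around the crash point, and something like "x/16xw $esp" to 
dump the top of the stack.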

> Unfortunately, XCode is fairly uninformative when you step into code
> you've just created, so you have to go on memory dumps, watching the
> registers, and doing the disassembly in your head.  Once it calls back
> into the callbacks, its brain resets and gives you a real debugging
> environment again.

Right.  It's hard to say what is going on here.  It could be the JIT 
miscompiling something (unlikely if it works the first time though), maybe 
it is running out of code space, maybe something else is happening.

I'd strongly suggest building LLVM with assertions, if you haven't 
already, and reducing the size of the input to the JIT if possible (to 
reduce the amount of generated code you have to step through).

> I realize this description isn't effective as a repro-case, but I'm
> dealing with a fairly large system and before I go about hacking things
> apart to try and construct one I was hoping that some of you may have
> had experience with this kind of thing.  What tools do you all use to
> deal with crashes like these?  Is there a more robust alternative than
> XCode for dealing with low level debugging with on-the-fly generated
> code?  Have enough changes occurred in this area since 1.9 that I should
> just abandon 1.9 in favor of the CVS sources?

There have been a ton of fixes and improvements in LLVM mainline.  I don't 
know if it would make sense for you to upgrade.  If you are early in the 
project, I would say yes, definitely upgrade.  If you're late in the 
project, I'd say no.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/


