[LLVMdev] Dynamic (JIT) type resolution

Mon Nov 5 21:26:44 PST 2007

Nicolas,

I guess you're trying to solve the fragile base-class problem by  
deferring field offset calculations until JIT compilation time?

Perhaps I'm missing something, but can't you accomplish this by using  
external constants in the source program, to be supplied at JIT/link  
time?

     ; Field offsets, to be supplied at JIT/link time.
     @obj.x.offs = external constant i32
     @obj.y.offs = external constant i32

     define float @xPlusY(i8* %obj) {
     entry:
       %x.offs = load i32* @obj.x.offs
       %x.ptr = getelementptr i8* %obj, i32 %x.offs
       %x.ptr2 = bitcast i8* %x.ptr to float*
       %x = load float* %x.ptr2
       %y.offs = load i32* @obj.y.offs
       %y.ptr = getelementptr i8* %obj, i32 %y.offs
       %y.ptr2 = bitcast i8* %y.ptr to float*
       %y = load float* %y.ptr2
       %sum = add float %x, %y
       ret float %sum
     }
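
For readers more comfortable with C than IR, here is a rough analogue of the
same idea. It is only a sketch: struct Obj, obj_x_offs, obj_y_offs and
load_float_at are hypothetical names, and the offsets are defined in the same
file for brevity, whereas in a real setup they would be declared extern and
defined by whatever resolves the type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical object layout, known only to the side that resolves the type. */
struct Obj { int header; float x; float y; };

/* Stand-ins for @obj.x.offs and @obj.y.offs. Defined here so the sketch is
   self-contained; normally these would be `extern` and supplied later. */
const int obj_x_offs = (int)offsetof(struct Obj, x);
const int obj_y_offs = (int)offsetof(struct Obj, y);

/* Load a float stored `offs` bytes into an opaque object. */
static float load_float_at(const void *obj, int offs) {
    float v;
    assert(obj != NULL);
    memcpy(&v, (const char *)obj + offs, sizeof v);
    return v;
}

/* The compiled method never mentions the layout, only the two offsets. */
float xPlusY(const void *obj) {
    return load_float_at(obj, obj_x_offs) + load_float_at(obj, obj_y_offs);
}
```

If a field is added ahead of x in the layout, only the two offset definitions
change; xPlusY itself does not need to be recompiled.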

Or, quite similarly, accessor functions, also to be supplied by the
JIT/linker:

     declare float @obj.x(i8* %obj)
     declare float @obj.y(i8* %obj)

     define float @xPlusY(i8* %obj) {
     entry:
       %x = call float @obj.x(i8* %obj)
       %y = call float @obj.y(i8* %obj)
       %sum = add float %x, %y
       ret float %sum
     }
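
Again as a rough C analogue, and only a sketch: obj_x, obj_y and struct Obj
are hypothetical names, and the accessor bodies are placed in the same file
purely so the example is self-contained. The point is that xPlusY is written
against the prototypes alone.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for @obj.x and @obj.y: the caller sees only these prototypes. */
float obj_x(const void *obj);
float obj_y(const void *obj);

/* The compiled method knows nothing about the object layout. */
float xPlusY(const void *obj) {
    return obj_x(obj) + obj_y(obj);
}

/* Everything below would be supplied later, once the type is resolved. */
struct Obj { int header; float x; float y; };  /* hypothetical layout */

float obj_x(const void *obj) {
    assert(obj != NULL);
    return ((const struct Obj *)obj)->x;
}

float obj_y(const void *obj) {
    assert(obj != NULL);
    return ((const struct Obj *)obj)->y;
}
```

Compared with the offset constants, this form hides even the kind of access
(a plain load here), at the cost of a call unless the accessors get inlined.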

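In either case, once the offsets or accessor bodies are known, an
optimization pass could trivially eliminate the overhead (by folding the
constant loads or inlining the accessors), with no need to modify LLVM.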

On 2007-11-05, at 23:27, Nicolas Geoffray wrote:

> Hi everyone,
>
> I would like to implement a mechanism equivalent to the JIT's function
> callbacks, but for fields. Typically in Java, when you compile a
> method, there may be some instructions (getfield, putfield) that depend
> on a type which is not yet resolved.
>
> I think the best way to do this in LLVM is to add an intrinsic. The
> intrinsic would be only valid if we JIT, and would be lowered only in
> the codegen part (don't know yet if it would be in the target-dependent
> or target-independent part).
>
> The callback method will resolve the type and patch the store/load
> instruction so that the correct address is used (exactly like the JIT
> function callback compiles a method and patches the call).
>
> Now there is one issue to deal with here: how to represent the
> intrinsic? It can either be 1) llvm.getfieldptr.{type} or 2) have two
> kinds of intrinsics llvm.getfield.{type} and llvm.storefield.{type}.
>
> I'd prefer using 1) since it's closer to the LLVM instruction set
> (GetElementPtrInst); however, there may be some tricky issues on where
> and how the callback function must patch the code. For example, how are
> move instructions (for spilling registers) inserted in LLVM? By choosing
> 1), can I face the issue of having a move instruction between the
> getfieldptr call and the load/store? I probably can also face the
> problem of code optimization, where the store/load would not be next to
> the callback call.
>
> Will I also have these issues with 2)? I don't know if LLVM does
> optimization on DAG nodes. The dag nodes that I would like to generate
> for a llvm.loadfield.{type} would be:
>
> DAG.getCall(FieldCallback); // Or something similar, I don't know the exact syntax ;-)
> DAG.getLoad();
>
> When (if possible) can I be sure that these two instructions are next to
> each other in the native code?
>
> (Oh, and also, I would like codegen to not clobber caller-saved
> registers when doing the call. Is that even possible? This is just an
> optimization problem, so we can skip it for now).

— Gordon