[llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Mon Jun 5 17:18:39 PDT 2017


On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <
nikodemus at random-state.net> wrote:

> Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it
> through an opaque identity function ... then everything works fine.
>
> Is this a bug in LLVM or is there some magic involving globals I'm
> misunderstanding?
>

This looks like a bug in the handling of constant GEPs. Specifically the
`getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)`
used to calculate the address of the integer inside the struct. Your
observation "The bizarre thing is that even this looks correct: the
debugInt is called first with @foo, then @foo+4, and the stores seem to be
going to the right addresses as well: @foo and @foo+4!" at the level of the
MachineInstr dump rules out problems before that.
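
To make the expected arithmetic concrete, here is a minimal C++ sketch (not
LLVM code) of what that GEP should compute under a 32-bit data layout. Box32
and fieldAddress are hypothetical names used only for illustration, and the
pointer slot is modelled as a 32-bit integer so that the layout does not
depend on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of { i8*, i32 } under a 32-bit data layout: the pointer
// slot is modelled as a 32-bit integer so the layout is the same on any host.
struct Box32 {
    std::uint32_t cls;   // field 0: stands in for the i8* class pointer
    std::int32_t datum;  // field 1: the i32 payload
};

// Address that a GEP with indices (0, field) should yield for a Box32 located
// at `base`. Addresses are plain 32-bit integers, as on the 32-bit target.
inline std::uint32_t fieldAddress(std::uint32_t base, unsigned field) {
    assert(field < 2 && "Box32 has only two fields");
    const std::uint32_t offsets[2] = {
        static_cast<std::uint32_t>(offsetof(Box32, cls)),
        static_cast<std::uint32_t>(offsetof(Box32, datum)),
    };
    return base + offsets[field];
}
```

With the base address from your failing run, fieldAddress(0x03C10000, 1)
gives 0x03C10004, which is what the second debugInt call should have printed.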

After MachineInstr comes MC to emit the object file, but `foo+4` is one of
the most basic relocation types, so I doubt that there's a bug in the
lowering there or else "everything" would be broken.
Just to verify though, checking assembly of a small example across 32-bit
targets of all 3 object file formats looks fine at a glance (MC is getting
the +4 addend, though you would need to run `llvm-objdump -d -r` to see the
actual relocation in the binary).
https://godbolt.org/g/0Owzf5
https://godbolt.org/g/n0qzmg
https://godbolt.org/g/kAOvkQ
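
For illustration only, this is roughly what applying such a relocation
amounts to, assuming an absolute 32-bit relocation whose addend is stored in
place in the bytes being patched. The function names are hypothetical and this
is not RuntimeDyld's actual code. The second function shows how a loader that
dropped the addend would reproduce the symptom you are seeing.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read/write a 32-bit value at an arbitrary byte position (host byte order).
inline std::uint32_t read32(const std::uint8_t *p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
inline void write32(std::uint8_t *p, std::uint32_t v) {
    std::memcpy(p, &v, sizeof v);
}

// Absolute 32-bit relocation with an in-place addend: the field already
// holds the addend A (4 for `foo+4`), and the loader stores S + A.
inline void applyAbs32(std::uint8_t *field, std::uint32_t symbolAddr) {
    write32(field, symbolAddr + read32(field));
}

// Hypothetical buggy variant: overwrites the field with S and loses A, so
// every reference to `foo+4` ends up pointing at `foo`.
inline void applyAbs32DroppingAddend(std::uint8_t *field,
                                     std::uint32_t symbolAddr) {
    write32(field, symbolAddr);
}
```

If the field holds 4 and foo lives at 0x03C10000, applyAbs32 leaves
0x03C10004 in the field, while the addend-dropping variant leaves 0x03C10000.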

Beyond MC, you already have your static object file. If that is fine, then
in a JIT context you might be running into issues with RuntimeDyld. The
actual GEPs that clang generates are identical to the ones in your code,
further suggesting that this is JIT specific and that static links are
unaffected (if you could verify that, it would help to narrow down the
possibilities).
Maybe look at the output of `llvm-objdump -d -r` on a static .o file
generated from your IR and see where the relocation is handled
in lib/ExecutionEngine/RuntimeDyld (this will depend on your platform;
grepping for the name of the relocation shown by llvm-objdump should find
the right code to look at).

By the way, what platform are you JIT'ing on? I noticed that it is a 32-bit
target, and I suspect that the 32-bit support in the JIT infrastructure
isn't as well tested / commonly used as the 64-bit code, possibly
explaining why this sort of bug could sneak through.

-- Sean Silva


>
> define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)*
> @"XEP:__anonToplevel/0" {
> entry:
>   %0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
>   %1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>   %2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
>   %3 = ptrtoint { i8*, i32 }* %0 to i32
>   %4 = call { i8*, i32 } @debugInt(i32 %3)
>   store i8* @FixnumClass, i8** %2, align 4
>   %5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
>   %6 = ptrtoint i32* %5 to i32
>   %7 = call { i8*, i32 } @debugInt(i32 %6)
>   store i32 123, i32* %5, align 4
>   %8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>   store i8* @FixnumClass, i8** %2, align 4
>   store i32 123, i32* %5, align 4
>   %9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>   call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8* @FixnumClass,
> i32 123 })
>   %10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>   ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
> }
>
> Output, now with correct addresses out of the GEPs, and memory being
> modified as expected:
>
> p = 02F80000
>   class: 00000000
>   datum: 00000000
> x = 02F80000
> x = 02F80004
> p = 02F80000
>   class: 028D3E98
>   datum: 0000007B
> p = 02F80000
>   class: 028D3E98
>   datum: 0000007B
> p = 02F80000
>   class: 028D3E98
>   datum: 0000007B
>
> Cheers,
>
>  -- nikodemus
>
>
> On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <
> nikodemus at random-state.net> wrote:
>
>> Since the getelementptrs were implicitly generated by the
>> CreateStore/Load I'm not sure how to get access to them.
>>
>> So I hacked the assignment to be done thrice: once using a manual
>> decomposition into two GEPs and stores, once using the "big" CreateStore,
>> once via the setGlobal function, printing addresses and memory contents at
>> each point to the degree that I have access to them.
>>
>> It seems the following GEPs compute the same address?! I can buy myself
>> not understanding how GEP works and doing it wrong, but
>> builder.CreateStore() creates what look like identical GEPs implicitly...
>>
>> i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32
>> 0), align 4
>> i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32
>> 1), align 4
>>
>> The details.
>>
>> This is the relevant part from my codegen:
>>
>>             auto ty = val->getType();
>>             cout << "val type:" << endl;
>>             ty->dump();
>>             cout << "ptr type:" << endl;
>>             ptr->getType()->dump();
>>             // Print memory
>>             ctx.EmitCall1("debugPointer", ptr);
>>             // Set class pointer
>>             auto c = ctx.bld.CreateExtractValue(val, 0, "class");
>>             auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
>>             auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
>>             ctx.EmitCall1("debugInt", cx);
>>             ctx.bld.CreateStore(c, cp);
>>             // Set datum
>>             auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
>>             auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
>>             auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
>>             ctx.EmitCall1("debugInt", dx);
>>             ctx.bld.CreateStore(d, dp);
>>             // Print memory
>>             ctx.EmitCall1("debugPointer", ptr);
>>             // Do the same with a single store
>>             ctx.bld.CreateStore(val, ptr);
>>             // Print memory
>>             ctx.EmitCall1("debugPointer", ptr);
>>             // Call out
>>             ctx.EmitCall2("setGlobal", ptr, val);
>>             // Print memory
>>             ctx.EmitCall1("debugPointer", ptr);
>>
>> Here is the compile-time output showing types of the value and the
>> pointer:
>>
>> val type:
>> { i8*, i32 }
>> ptr type:
>> { i8*, i32 }*
>>
>> Here is the IR dump for the function (after a couple of passes), right
>> before it's fed to the JIT:
>>
>> define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)*
>> @"XEP:__anonToplevel/0" {
>> entry:
>>   %0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>   %1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to
>> i32))
>>   store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, {
>> i8*, i32 }* @foo, i32 0, i32 0), align 4
>>   %2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr
>> inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
>>   store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }*
>> @foo, i32 0, i32 1), align 4
>>   %3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>   store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, {
>> i8*, i32 }* @foo, i32 0, i32 0), align 4
>>   store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }*
>> @foo, i32 0, i32 1), align 4
>>   %4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>   call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8*
>> @FixnumClass, i32 123 })
>>   %5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>   ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
>> }
>>
>> Here is the runtime output from calling the JITed function, including
>> memory addresses and contents, with my annotations:
>>
>> # Before
>> p = 03C10000
>>   class: 00000000
>>   datum: 00000000
>> # Should be address of the class slot --> correct
>> x = 03C10000
>> # Should be address of the datum slot, ie address of class slot + 4 -->
>> incorrect
>> x = 03C10000
>> # Yeah, both values went to the class slot, so the actual class pointer got
>> clobbered
>> p = 03C10000
>>   class: 0000007B
>>   datum: 00000000
>> # Same result from the single CreateStore
>> p = 03C10000
>>   class: 0000007B
>>   datum: 00000000
>> # Calling out to setGlobal as in my first email works
>> p = 03C10000
>>   class: 039D2E98
>>   datum: 0000007B
>>
>> Finally, I didn't manage nice disassembly yet, so here is the last output
>> from --print-after-all for the function. The bizarre thing is that even
>> this looks correct: the debugInt is called first with @foo, then @foo+4,
>> and the stores seem to be going to the right addresses as well: @foo and
>> @foo+4!
>>
>> BB#0: derived from LLVM BB %entry
>>         PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>         CFI_INSTRUCTION <call frame instruction>
>>         CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX
>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>         %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>         CFI_INSTRUCTION <call frame instruction>
>>         PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>         CFI_INSTRUCTION <call frame instruction>
>>         CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI
>> %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>         %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>         CFI_INSTRUCTION <call frame instruction>
>>         MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg,
>> <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, {
>> i8*, i32 }* @foo, i32 0, i32 0)]
>>         PUSHi32 <ga:@foo+4>, %ESP<imp-def>, %ESP<imp-use>
>>         CFI_INSTRUCTION <call frame instruction>
>>         CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI
>> %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>         %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>         CFI_INSTRUCTION <call frame instruction>
>>         MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123;
>> mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0,
>> i32 1)]
>>         PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>         CFI_INSTRUCTION <call frame instruction>
>>         CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX
>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>         %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>         CFI_INSTRUCTION <call frame instruction>
>>         MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg,
>> <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, {
>> i8*, i32 }* @foo, i32 0, i32 0)]
>>         MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123;
>> mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0,
>> i32 1)]
>>         PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>         CFI_INSTRUCTION <call frame instruction>
>>         CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX
>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>         %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>         CFI_INSTRUCTION <call frame instruction>
>>         PUSH32i8 123, %ESP<imp-def>, %ESP<imp-use>
>>         CFI_INSTRUCTION <call frame instruction>
>>         PUSHi32 <ga:@JazzFixnumClass>, %ESP<imp-def>, %ESP<imp-use>
>>         CFI_INSTRUCTION <call frame instruction>
>>         PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>         CFI_INSTRUCTION <call frame instruction>
>>         CALLpcrel32 <ga:@setGlobal>, <regmask %BH %BL %BP %BPL %BX %DI
>> %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>
>>         %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 12, %EFLAGS<imp-def,dead>
>>         CFI_INSTRUCTION <call frame instruction>
>>         PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>         CFI_INSTRUCTION <call frame instruction>
>>         CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX
>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>         %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
>>         CFI_INSTRUCTION <call frame instruction>
>>         %EAX<def> = MOV32ri <ga:@JazzFixnumClass>
>>         %EDX<def> = MOV32ri 123
>>         RETL %EAX<kill>, %EDX<kill>
>>
>> Also, I have essentially identical code working perfectly fine when the
>> memory being written to comes from an alloca.
>>
>> I am completely clueless. Any suggestions most welcome.
>>
>> Cheers,
>>
>>  -- nikodemus
>>
>>
>