[llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT
Sean Silva via llvm-dev
llvm-dev at lists.llvm.org
Tue Jun 6 22:40:07 PDT 2017
Great work!
This is ready to file as a bug at llvm.org/bugs. If you're feeling a bit
adventurous, feel free to also try to debug it and post any clues (setting
breakpoints in the functions in
lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h is how I
would start).
Lang (CC'd) may have some other tips for where to look. (I'm actually not
very familiar with the JIT infrastructure myself, so take my advice with a
grain of salt.)
-- Sean Silva
On Tue, Jun 6, 2017 at 10:30 PM, Nikodemus Siivola <
nikodemus at random-state.net> wrote:
> My code was hinky, but only in the sense that I was accidentally
> duplicating the definition of the variable in the module where the function
> was. With only the declaration in the second module, loading the bitcode
> still reproduces the issue.
>
> Managed an lli reproduction:
>
> $ cat jit-0.ll
> target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
> target triple = "i686-pc-windows-msvc"
>
> @foo = global { i8*, i32 } undef
>
> $ cat jit-1-clobber.ll
> target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
> target triple = "i686-pc-windows-msvc"
>
> @foo = external global { i8*, i32 }
>
> define void @setfoo() {
> entry:
>   %p = inttoptr i32 42 to i8*
>   store i8* %p, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
>   store i32 13, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
>   ret void
> }
>
> $ cat jit-1-noclobber.ll
> target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
> target triple = "i686-pc-windows-msvc"
>
> @foo = external global { i8*, i32 }
>
> define void @setfoo() {
> entry:
>   %p = inttoptr i32 42 to i8*
>   store i8* %p, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
>   ret void
> }
>
> $ lli -jit-kind=orc-mcjit -extra-module=jit-0.ll -extra-module=jit-1-clobber.ll main.ll; echo $?
> 13
>
> $ lli -jit-kind=orc-mcjit -extra-module=jit-0.ll -extra-module=jit-1-noclobber.ll main.ll; echo $?
> 42
>
> (Same happens with -jit-kind=mcjit.)
>
> Cheers,
>
> -- nikodemus
>
>
> On Wed, Jun 7, 2017 at 12:41 AM, Nikodemus Siivola <
> nikodemus at random-state.net> wrote:
>
>> I just managed a quick experiment today to dump and load the definition
>> of the variable and the function that sets it into separate modules.
>>
>> ...loading those bitcode files into separate modules (and handing those
>> modules to JIT) works as expected. What *should* be the same code going
>> directly into JIT does not work.
>>
>> Which smells like the problem may be in my JIT hookup and not in
>> RuntimeDyld.
>>
>> I'll try to sort out my codepaths before digging into RuntimeDyld, so I
>> can be sure I'm doing the same things in "live" JIT and when dumping/loading
>> bitcode.
>>
>> I'll let you know what turns up.
>>
>> Cheers,
>>
>> -- nikodemus
>>
>>
>> On Wed, Jun 7, 2017 at 12:16 AM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>> That's useful to know that the static compilation code path works.
>>> Furthermore, as expected from that:
>>>
>>> 52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
>>> 00000054: IMAGE_REL_I386_DIR32 _foo
>>>
>>> It looks like the offset `4` of the second field of your struct is
>>> correct in the object file, so this does seem to be a problem in the
>>> JIT-specific linking/loading.
>>>
>>> Can you try debugging into
>>> lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h to see if the
>>> relocation is getting applied correctly in the context of your JIT?
>>>
>>> You may be able to repro this more easily using `lli`. It has a
>>> `-jit-kind` argument that should get you into the JIT codepath. (see
>>> test/ExecutionEngine/{MCJIT,ORCMCJIT}/)
>>>
>>> -- Sean Silva
>>>
>>>
>>> On Tue, Jun 6, 2017 at 1:09 AM, Nikodemus Siivola <
>>> nikodemus at random-state.net> wrote:
>>>
>>>> This is on Windows 10: didn't yet manage to get a 64-bit toolchain set
>>>> up that agreed on everything necessary.
>>>>
>>>> I dumped the bitcode, but when I did that everything landed in the same
>>>> module (normally the global is defined in a different module than its uses),
>>>> so the relocations are different... different enough that when I loaded the
>>>> bitcode back in and handed the single module to JIT it worked fine.
>>>>
>>>> I'll try to dump a case where the definition is in a different module
>>>> tomorrow.
>>>>
>>>> Anyhow, below is what clang-cl turned the bitcode from my IR into --
>>>> probably not very useful though as this code does what it should...
>>>>
>>>> $ llvm-objdump.exe -r -d test.o
>>>>
>>>> test.o: file format COFF-i386
>>>>
>>>> Disassembly of section .text:
>>>> .text:
>>>> 0: 00 00 addb %al, (%eax)
>>>> 00000000: IMAGE_REL_I386_DIR32 _XEP:setfoo
>>>> 2: 00 00 addb %al, (%eax)
>>>>
>>>> _setfoo:
>>>> 4: 56 pushl %esi
>>>> 5: 83 ec 40 subl $64, %esp
>>>> 8: 89 e0 movl %esp, %eax
>>>> a: c7 00 00 00 00 00 movl $0, (%eax)
>>>> 0000000c: IMAGE_REL_I386_DIR32 _foo
>>>> 10: e8 00 00 00 00 calll 0 <_setfoo+0x11>
>>>> 00000011: IMAGE_REL_I386_REL32 _debugPointer
>>>> 15: 89 e1 movl %esp, %ecx
>>>> 17: c7 01 00 00 00 00 movl $0, (%ecx)
>>>> 00000019: IMAGE_REL_I386_DIR32 _foo
>>>> 1d: 89 44 24 3c movl %eax, 60(%esp)
>>>> 21: 89 54 24 38 movl %edx, 56(%esp)
>>>> 25: e8 00 00 00 00 calll 0 <_setfoo+0x26>
>>>> 00000026: IMAGE_REL_I386_REL32 _debugInt
>>>> 2a: c7 05 00 00 00 00 00 00 00 00 movl $0, 0
>>>> 0000002c: IMAGE_REL_I386_DIR32 _foo
>>>> 00000030: IMAGE_REL_I386_DIR32 _JazzFixnumClass
>>>> 34: b9 00 00 00 00 movl $0, %ecx
>>>> 00000035: IMAGE_REL_I386_DIR32 _JazzFixnumClass
>>>> 39: 89 e6 movl %esp, %esi
>>>> 3b: c7 06 04 00 00 00 movl $4, (%esi)
>>>> 0000003d: IMAGE_REL_I386_DIR32 _foo
>>>> 41: 89 44 24 34 movl %eax, 52(%esp)
>>>> 45: 89 54 24 30 movl %edx, 48(%esp)
>>>> 49: 89 4c 24 2c movl %ecx, 44(%esp)
>>>> 4d: e8 00 00 00 00 calll 0 <_setfoo+0x4E>
>>>> 0000004e: IMAGE_REL_I386_REL32 _debugInt
>>>> 52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
>>>> 00000054: IMAGE_REL_I386_DIR32 _foo
>>>> 5c: 89 e1 movl %esp, %ecx
>>>> 5e: c7 01 00 00 00 00 movl $0, (%ecx)
>>>> 00000060: IMAGE_REL_I386_DIR32 _foo
>>>> 64: 89 44 24 28 movl %eax, 40(%esp)
>>>> 68: 89 54 24 24 movl %edx, 36(%esp)
>>>> 6c: e8 00 00 00 00 calll 0 <_setfoo+0x6D>
>>>> 0000006d: IMAGE_REL_I386_REL32 _debugPointer
>>>> 71: c7 05 00 00 00 00 00 00 00 00 movl $0, 0
>>>> 00000073: IMAGE_REL_I386_DIR32 _foo
>>>> 00000077: IMAGE_REL_I386_DIR32 _JazzFixnumClass
>>>> 7b: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
>>>> 0000007d: IMAGE_REL_I386_DIR32 _foo
>>>> 85: 89 e1 movl %esp, %ecx
>>>> 87: c7 01 00 00 00 00 movl $0, (%ecx)
>>>> 00000089: IMAGE_REL_I386_DIR32 _foo
>>>> 8d: 89 44 24 20 movl %eax, 32(%esp)
>>>> 91: 89 54 24 1c movl %edx, 28(%esp)
>>>> 95: e8 00 00 00 00 calll 0 <_setfoo+0x96>
>>>> 00000096: IMAGE_REL_I386_REL32 _debugPointer
>>>> 9a: 89 e1 movl %esp, %ecx
>>>> 9c: c7 41 08 d5 00 00 00 movl $213, 8(%ecx)
>>>> a3: c7 41 04 00 00 00 00 movl $0, 4(%ecx)
>>>> 000000a6: IMAGE_REL_I386_DIR32 _JazzFixnumClass
>>>> aa: c7 01 00 00 00 00 movl $0, (%ecx)
>>>> 000000ac: IMAGE_REL_I386_DIR32 _foo
>>>> b0: 89 44 24 18 movl %eax, 24(%esp)
>>>> b4: 89 54 24 14 movl %edx, 20(%esp)
>>>> b8: e8 00 00 00 00 calll 0 <_setfoo+0xB9>
>>>> 000000b9: IMAGE_REL_I386_REL32 _setGlobal
>>>> bd: 89 e0 movl %esp, %eax
>>>> bf: c7 00 00 00 00 00 movl $0, (%eax)
>>>> 000000c1: IMAGE_REL_I386_DIR32 _foo
>>>> c5: e8 00 00 00 00 calll 0 <_setfoo+0xC6>
>>>> 000000c6: IMAGE_REL_I386_REL32 _debugPointer
>>>> ca: b9 d5 00 00 00 movl $213, %ecx
>>>> cf: 8b 74 24 2c movl 44(%esp), %esi
>>>> d3: 89 44 24 10 movl %eax, 16(%esp)
>>>> d7: 89 f0 movl %esi, %eax
>>>> d9: 89 54 24 0c movl %edx, 12(%esp)
>>>> dd: 89 ca movl %ecx, %edx
>>>> df: 83 c4 40 addl $64, %esp
>>>> e2: 5e popl %esi
>>>> e3: c3 retl
>>>> e4: 66 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%eax,%eax)
>>>>
>>>> _XEP:setfoo:
>>>> f0: 8b 44 24 04 movl 4(%esp), %eax
>>>> f4: 83 f8 00 cmpl $0, %eax
>>>> f7: 0f 84 05 00 00 00 je 5 <_XEP:setfoo+0x12>
>>>> fd: e8 00 00 00 00 calll 0 <_XEP:setfoo+0x12>
>>>> 000000fe: IMAGE_REL_I386_REL32 _typeError
>>>> 102: e8 00 00 00 00 calll 0 <_XEP:setfoo+0x17>
>>>> 00000103: IMAGE_REL_I386_REL32 _setfoo
>>>> 107: c3 retl
>>>> 108: 0f 1f 84 00 00 00 00 00 nopl (%eax,%eax)
>>>> 110: 00 00 addb %al, (%eax)
>>>> 00000110: IMAGE_REL_I386_DIR32 _XEP:getfoo
>>>> 112: 00 00 addb %al, (%eax)
>>>>
>>>> _getfoo:
>>>> 114: 50 pushl %eax
>>>> 115: 89 e0 movl %esp, %eax
>>>> 117: c7 00 00 00 00 00 movl $0, (%eax)
>>>> 00000119: IMAGE_REL_I386_DIR32 _foo
>>>> 11d: e8 00 00 00 00 calll 0 <_getfoo+0xE>
>>>> 0000011e: IMAGE_REL_I386_REL32 _getGlobal
>>>> 122: 59 popl %ecx
>>>> 123: c3 retl
>>>> 124: 66 66 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%eax,%eax)
>>>>
>>>> _XEP:getfoo:
>>>> 130: 8b 44 24 04 movl 4(%esp), %eax
>>>> 134: 83 f8 00 cmpl $0, %eax
>>>> 137: 0f 84 05 00 00 00 je 5 <_XEP:getfoo+0x12>
>>>> 13d: e8 00 00 00 00 calll 0 <_XEP:getfoo+0x12>
>>>> 0000013e: IMAGE_REL_I386_REL32 _typeError
>>>> 142: e8 00 00 00 00 calll 0 <_XEP:getfoo+0x17>
>>>> 00000143: IMAGE_REL_I386_REL32 _getfoo
>>>> 147: c3 retl
>>>>
>>>>
>>>> On Tue, Jun 6, 2017 at 3:18 AM, Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <
>>>>> nikodemus at random-state.net> wrote:
>>>>>
>>>>>> Uh. Turns out that if I hide the pointer to @foo from LLVM by passing
>>>>>> it through an opaque identity function ... then everything works fine.
>>>>>>
>>>>>> Is this a bug in LLVM or is there some magic involving globals I'm
>>>>>> misunderstanding?
>>>>>>
>>>>>
>>>>> This looks like a bug in the handling of constant GEP's. Specifically
>>>>> the `getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32
>>>>> 1)` used to calculate the address of the integer inside the struct. Your
>>>>> observation "The bizarre thing is that even this looks correct: the
>>>>> debugInt is called first with @foo, then @foo+4, and the stores seem to be
>>>>> going to the right addresses as well: @foo and @foo+4!" at the level of the
>>>>> MachineInstr dump rules out problems before that.
>>>>>
>>>>> After MachineInstr comes MC to emit the object file, but `foo+4` is
>>>>> one of the most basic relocation types, so I doubt that there's a bug in
>>>>> the lowering there or else "everything" would be broken.
>>>>> Just to verify though, checking assembly of a small example across
>>>>> 32-bit targets of all 3 object file formats looks fine at a glance (MC is
>>>>> getting the +4 addend, though you would need to run `llvm-objdump -d -r` to
>>>>> see the actual relocation in the binary).
>>>>> https://godbolt.org/g/0Owzf5
>>>>> https://godbolt.org/g/n0qzmg
>>>>> https://godbolt.org/g/kAOvkQ
>>>>>
>>>>> Beyond MC, you already have your static object file. If that is fine,
>>>>> then in a JIT context you might be running into issues with
>>>>> RuntimeDyld. The actual GEP's that clang generates are identical to the
>>>>> ones in your code, further suggesting that this is JIT specific and that
>>>>> static links are unaffected (if you could verify that, it would help to
>>>>> narrow down the possibilities).
>>>>> Maybe look at the output of `llvm-objdump -d -r` on a static .o file
>>>>> generated from your IR and see where the relocation is handled
>>>>> in lib/ExecutionEngine/RuntimeDyld (this will depend on your
>>>>> platform; grepping for the name of the relocation shown by llvm-objdump
>>>>> should find the right code to look at).
>>>>>
>>>>> By the way, what platform are you JIT'ing on? I noticed that it is a
>>>>> 32-bit target, and I suspect that the 32-bit support in the JIT
>>>>> infrastructure isn't as well tested / commonly used as the 64-bit code,
>>>>> possibly explaining why this sort of bug could sneak through.
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>>
>>>>>> define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
>>>>>> entry:
>>>>>>   %0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
>>>>>>   %1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>>>>>>   %2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
>>>>>>   %3 = ptrtoint { i8*, i32 }* %0 to i32
>>>>>>   %4 = call { i8*, i32 } @debugInt(i32 %3)
>>>>>>   store i8* @FixnumClass, i8** %2, align 4
>>>>>>   %5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
>>>>>>   %6 = ptrtoint i32* %5 to i32
>>>>>>   %7 = call { i8*, i32 } @debugInt(i32 %6)
>>>>>>   store i32 123, i32* %5, align 4
>>>>>>   %8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>>>>>>   store i8* @FixnumClass, i8** %2, align 4
>>>>>>   store i32 123, i32* %5, align 4
>>>>>>   %9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>>>>>>   call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8* @FixnumClass, i32 123 })
>>>>>>   %10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
>>>>>>   ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
>>>>>> }
>>>>>>
>>>>>> Output, now with correct addresses out of the GEPs, and memory being
>>>>>> modified as expected:
>>>>>>
>>>>>> p = 02F80000
>>>>>> class: 00000000
>>>>>> datum: 00000000
>>>>>> x = 02F80000
>>>>>> x = 02F80004
>>>>>> p = 02F80000
>>>>>> class: 028D3E98
>>>>>> datum: 0000007B
>>>>>> p = 02F80000
>>>>>> class: 028D3E98
>>>>>> datum: 0000007B
>>>>>> p = 02F80000
>>>>>> class: 028D3E98
>>>>>> datum: 0000007B
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> -- nikodemus
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <
>>>>>> nikodemus at random-state.net> wrote:
>>>>>>
>>>>>>> Since the getelementptrs were implicitly generated by the
>>>>>>> CreateStore/Load I'm not sure how to get access to them.
>>>>>>>
>>>>>>> So I hacked the assignment to be done thrice: once using a manual
>>>>>>> decomposition into two GEPs and stores, once using the "big" CreateStore,
>>>>>>> once via the setGlobal function, printing addresses and memory contents at
>>>>>>> each point to the degree that I have access to them.
>>>>>>>
>>>>>>> It seems the following GEPs compute the same address?! I can buy
>>>>>>> myself not understanding how GEP works and doing it wrong, but
>>>>>>> builder.CreateStore() creates what look like identical GEPs implicitly...
>>>>>>>
>>>>>>> i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
>>>>>>> i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
>>>>>>>
>>>>>>> The details.
>>>>>>>
>>>>>>> This is the relevant part from my codegen:
>>>>>>>
>>>>>>> auto ty = val->getType();
>>>>>>> cout << "val type:" << endl;
>>>>>>> ty->dump();
>>>>>>> cout << "ptr type:" << endl;
>>>>>>> ptr->getType()->dump();
>>>>>>> // Print memory
>>>>>>> ctx.EmitCall1("debugPointer", ptr);
>>>>>>> // Set class pointer
>>>>>>> auto c = ctx.bld.CreateExtractValue(val, 0, "class");
>>>>>>> auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
>>>>>>> auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
>>>>>>> ctx.EmitCall1("debugInt", cx);
>>>>>>> ctx.bld.CreateStore(c, cp);
>>>>>>> // Set datum
>>>>>>> auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
>>>>>>> auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
>>>>>>> auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
>>>>>>> ctx.EmitCall1("debugInt", dx);
>>>>>>> ctx.bld.CreateStore(d, dp);
>>>>>>> // Print memory
>>>>>>> ctx.EmitCall1("debugPointer", ptr);
>>>>>>> // Do the same with a single store
>>>>>>> ctx.bld.CreateStore(val, ptr);
>>>>>>> // Print memory
>>>>>>> ctx.EmitCall1("debugPointer", ptr);
>>>>>>> // Call out
>>>>>>> ctx.EmitCall2("setGlobal", ptr, val);
>>>>>>> // Print memory
>>>>>>> ctx.EmitCall1("debugPointer", ptr);
>>>>>>>
>>>>>>> Here is the compile-time output showing types of the value and the
>>>>>>> pointer:
>>>>>>>
>>>>>>> val type:
>>>>>>> { i8*, i32 }
>>>>>>> ptr type:
>>>>>>> { i8*, i32 }*
>>>>>>>
>>>>>>> Here is the IR dump for the function (after a couple of passes),
>>>>>>> right before it's fed to the JIT:
>>>>>>>
>>>>>>> define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
>>>>>>> entry:
>>>>>>>   %0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>>>>>>   %1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to i32))
>>>>>>>   store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
>>>>>>>   %2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
>>>>>>>   store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
>>>>>>>   %3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>>>>>>   store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
>>>>>>>   store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
>>>>>>>   %4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>>>>>>   call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
>>>>>>>   %5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
>>>>>>>   ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
>>>>>>> }
>>>>>>>
>>>>>>> Here is the runtime output from calling the JITed function, including
>>>>>>> memory addresses and contents, with my annotations:
>>>>>>>
>>>>>>> # Before
>>>>>>> p = 03C10000
>>>>>>> class: 00000000
>>>>>>> datum: 00000000
>>>>>>> # Should be address of the class slot --> correct
>>>>>>> x = 03C10000
>>>>>>> # Should be address of the datum slot, i.e. address of class slot + 4 --> incorrect
>>>>>>> x = 03C10000
>>>>>>> # Yeah, both values went to the class slot, so the actual class pointer got clobbered
>>>>>>> p = 03C10000
>>>>>>> class: 0000007B
>>>>>>> datum: 00000000
>>>>>>> # Same result from the single CreateStore
>>>>>>> p = 03C10000
>>>>>>> class: 0000007B
>>>>>>> datum: 00000000
>>>>>>> # Calling out to setGlobal as in my first email works
>>>>>>> p = 03C10000
>>>>>>> class: 039D2E98
>>>>>>> datum: 0000007B
>>>>>>>
>>>>>>> Finally, I didn't manage nice disassembly yet, so here is the last
>>>>>>> output from --print-after-all for the function. The bizarre thing is that
>>>>>>> even this looks correct: the debugInt is called first with @foo, then
>>>>>>> @foo+4, and the stores seem to be going to the right addresses as well:
>>>>>>> @foo and @foo+4!
>>>>>>>
>>>>>>> BB#0: derived from LLVM BB %entry
>>>>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL
>>>>>>> %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4,
>>>>>>> %EFLAGS<imp-def,dead>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX
>>>>>>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4,
>>>>>>> %EFLAGS<imp-def,dead>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg,
>>>>>>> <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32
>>>>>>> }, { i8*, i32 }* @foo, i32 0, i32 0)]
>>>>>>> PUSHi32 <ga:@foo+4>, %ESP<imp-def>, %ESP<imp-use>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX
>>>>>>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4,
>>>>>>> %EFLAGS<imp-def,dead>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123;
>>>>>>> mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0,
>>>>>>> i32 1)]
>>>>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL
>>>>>>> %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4,
>>>>>>> %EFLAGS<imp-def,dead>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg,
>>>>>>> <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32
>>>>>>> }, { i8*, i32 }* @foo, i32 0, i32 0)]
>>>>>>> MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123;
>>>>>>> mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0,
>>>>>>> i32 1)]
>>>>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL
>>>>>>> %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4,
>>>>>>> %EFLAGS<imp-def,dead>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> PUSH32i8 123, %ESP<imp-def>, %ESP<imp-use>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> PUSHi32 <ga:@JazzFixnumClass>, %ESP<imp-def>, %ESP<imp-use>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> CALLpcrel32 <ga:@setGlobal>, <regmask %BH %BL %BP %BPL %BX
>>>>>>> %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>
>>>>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 12,
>>>>>>> %EFLAGS<imp-def,dead>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL
>>>>>>> %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>,
>>>>>>> %EAX<imp-def,dead>, %EDX<imp-def,dead>
>>>>>>> %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4,
>>>>>>> %EFLAGS<imp-def,dead>
>>>>>>> CFI_INSTRUCTION <call frame instruction>
>>>>>>> %EAX<def> = MOV32ri <ga:@JazzFixnumClass>
>>>>>>> %EDX<def> = MOV32ri 123
>>>>>>> RETL %EAX<kill>, %EDX<kill>
>>>>>>>
>>>>>>> Also, I have essentially identical code working perfectly fine when
>>>>>>> the memory being written to comes from an alloca.
>>>>>>>
>>>>>>> I am completely clueless. Any suggestions most welcome.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> -- nikodemus
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>