[llvm-dev] Some strange i64 behavior with arm 32bit. (Raspberry Pi)

Moritz Angermann via llvm-dev llvm-dev at lists.llvm.org
Sun Dec 3 05:52:51 PST 2017


Ok...

after some more digging it turned out that the underlying issue was a bug in my
code generator. For the record I'll just note down the issue.

My code generator generated /unpacked/ structs for simplicity reasons, and because
I thought--incorrectly--that we (GHC) generated GEP accessors.  We don't!  GHC
computes absolute offsets into those structs. Generating /unpacked/ structs is
therefore futile: { i32, i64 }, for example, does not guarantee that the i64 is
at offset +4, as there might be padding. All I needed to change was to generate
packed instead of unpacked structs.
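
To illustrate the padding issue, here is a small C sketch. It is not GHC code:
the struct and function names are made up for the example, and
`__attribute__((packed))` is a GCC/Clang extension. It mirrors { i32, i64 } and
its packed counterpart <{ i32, i64 }> and reports where the i64 ends up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the LLVM type { i32, i64 }: natural alignment,
 * so the compiler may insert padding after `tag`. */
struct unpacked_pair {
    int32_t tag;
    int64_t value;
};

/* Hypothetical mirror of the packed LLVM type <{ i32, i64 }>: no padding.
 * __attribute__((packed)) is a GCC/Clang extension. */
struct __attribute__((packed)) packed_pair {
    int32_t tag;
    int64_t value;
};

/* Offset of the i64 field in each layout. */
size_t unpacked_value_offset(void) { return offsetof(struct unpacked_pair, value); }
size_t packed_value_offset(void)   { return offsetof(struct packed_pair, value); }

/* Padding bytes the ABI inserted between the two fields. */
size_t unpacked_padding(void) {
    return unpacked_value_offset() - sizeof(int32_t);
}

/* Code that hard-codes offset +4 is only valid for the packed layout. */
void check_absolute_offset_assumption(void) {
    assert(packed_value_offset() == 4);
}
```

With the datalayout used here (i64:64), I would expect the unpacked layout to
put the i64 at offset 8, while the packed one keeps it at offset 4, which is
what the absolute offsets computed by GHC assume.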

However, I still believe that the code gen for the C to STG bridge should add a
`sub sp, sp, 4` line to the inline assembly *if* it emits the `vstmdb sp!, {d8-d11}`
part, to ensure that the stack is 8-byte aligned.
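
The alignment arithmetic behind that suggestion, as a rough C sketch. The byte
counts are read off the disassembly quoted below (9 registers pushed by the
prologue, 10 by `stmfd`, 4 double registers by `vstmdb`, then the 8192 byte
reservation). The function names are made up, and the sketch assumes sp is
word aligned on entry to StgRun:

```c
#include <assert.h>
#include <stdint.h>

enum {
    PROLOGUE_BYTES = 9 * 4,   /* push {r4-r9, sl, fp, lr}: 9 core registers */
    STMFD_BYTES    = 10 * 4,  /* stmfd sp!, {r4-r11, ip, lr}: 10 core registers */
    VSTMDB_BYTES   = 4 * 8,   /* vstmdb sp!, {d8-d11}: 4 double registers */
    RESERVED_BYTES = 8192     /* sub sp, sp, $3 with $3 = 8192 */
};

/* sp at the `bx` into STG code, given sp on entry to StgRun and the
 * number of extra bytes subtracted before the inline assembly. */
uint32_t sp_at_stg_entry(uint32_t sp_on_entry, uint32_t extra_pad) {
    return sp_on_entry - PROLOGUE_BYTES - extra_pad
         - STMFD_BYTES - VSTMDB_BYTES - RESERVED_BYTES;
}

/* Nonzero if addr is a multiple of 8. */
int is_8byte_aligned(uint32_t addr) {
    return (addr & 7u) == 0;
}

/* Smallest pad that makes sp 8-byte aligned at the `bx`.
 * Assumes sp_on_entry is word aligned, so 0 or 4 always suffices. */
uint32_t required_pad(uint32_t sp_on_entry) {
    uint32_t pad = 0;
    while (!is_8byte_aligned(sp_at_stg_entry(sp_on_entry, pad))) {
        pad += 4;
        assert(pad < 8);
    }
    return pad;
}
```

For an sp that is 8-byte aligned on entry, the total without padding is
36 + 40 + 32 + 8192 = 8300 bytes, which leaves sp off by 4. That matches the
second disassembly, where llvm's own `sub sp, sp, #4` precedes the inline
assembly.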

Thank you.

Cheers,
 Moritz

> On Dec 3, 2017, at 3:26 PM, Moritz Angermann <moritz.angermann at gmail.com> wrote:
> 
> Alright, so after some more debugging (injecting print statements at the LLVM IR level),
> I came across the following:
> 
> GHC has the following code for the C into STG and back bridge: `StgRun`, which is defined
> in https://github.com/ghc/ghc/blob/master/rts/StgCRun.c; the resulting LLVM IR ends up being:
> 
> ```
> ; Function Attrs: nounwind
> define hidden %struct.StgRegTable* @StgRun(i8* ()* ()*, %struct.StgRegTable*) local_unnamed_addr #0 {
> 
>  %3 = tail call %struct.StgRegTable* asm sideeffect "stmfd sp!, {r4-r11, ip, lr}\0A\09vstmdb sp!, {d8-d11}\0A\09sub sp, sp, $3\0A\09mov r4, $2\0A\09bx $1\0A\09.globl StgReturn\0A\09.type StgReturn, %function\0AStgReturn:\0A\09add sp, sp, $3\0A\09mov $0, r7\0A\09vldmia sp!, {d8-d11}\0A\09ldmfd sp!, {r4-r11, ip, lr}\0A\09", "=r,r,r,i,~{r4},~{r5},~{r6},~{r7},~{r8},~{r9},~{r10},~{r12},~{lr}"(i8* ()* ()* %0, %struct.StgRegTable* %1, i32 8192) #1, !srcloc !3
> 
>  ret %struct.StgRegTable* %3
> }
> ```
> 
> For better readability, the assembly reads:
> 
>  stmfd sp!, {r4-r11, ip, lr}
>  vstmdb sp!, {d8-d11}
>  sub sp, sp, $3
>  mov r4, $2
>  bx $1
> .globl StgReturn
> .type StgReturn, %function
> StgReturn:
>  add sp, sp, $3
>  mov $0, r7
>  vldmia sp!, {d8-d11}
>  ldmfd sp!, {r4-r11, ip, lr}
> 
> This results in the following assembly being emitted (for armv-unknown-linux-gnueabihf):
> 
> ```
> 00000074 <StgRun>:
>  74:   e92d4ff0        push    {r4, r5, r6, r7, r8, r9, sl, fp, lr}
>  78:   e28db01c        add     fp, sp, #28, 0
>  7c:   e92d5ff0        push    {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
>  80:   ed2d8b08        vpush   {d8-d11}
>  84:   e24dda02        sub     sp, sp, #8192   ; 0x2000
>  88:   e1a04001        mov     r4, r1
>  8c:   e12fff10        bx      r0
> 
> 00000090 <StgReturn>:
>  90:   e28dda02        add     sp, sp, #8192   ; 0x2000
>  94:   e1a00007        mov     r0, r7
>  98:   ecbd8b08        vpop    {d8-d11}
>  9c:   e8bd5ff0        pop     {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
>  a0:   e8bd8ff0        pop     {r4, r5, r6, r7, r8, r9, sl, fp, pc}
> ```
> 
> While adding extra printf statements, I found that with a `printf` call placed after the assembly and before
> the `ret`, the generated code looks slightly different:
> 
> ```
> 00000074 <StgRun>:
>  74:   e92d4ff0        push    {r4, r5, r6, r7, r8, r9, sl, fp, lr}
>  78:   e28db01c        add     fp, sp, #28, 0
>  7c:   e24dd004        sub     sp, sp, #4, 0
>  80:   e92d5ff0        push    {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
>  84:   ed2d8b08        vpush   {d8-d11}
>  88:   e24dda02        sub     sp, sp, #8192   ; 0x2000
>  8c:   e1a04001        mov     r4, r1
>  90:   e12fff10        bx      r0
> 
> 00000094 <StgReturn>:
>  94:   e28dda02        add     sp, sp, #8192   ; 0x2000
>  98:   e1a00007        mov     r0, r7
>  9c:   ecbd8b08        vpop    {d8-d11}
>  a0:   e8bd5ff0        pop     {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
>  a4:   e58d0000        str     r0, [sp]
>  a8:   e3a00002        mov     r0, #2, 0
>  ac:   ebfffffe        bl      44 <.LdebugEnd>
>  b0:   e59d0000        ldr     r0, [sp]
>  b4:   e24bd01c        sub     sp, fp, #28, 0
>  b8:   e8bd8ff0        pop     {r4, r5, r6, r7, r8, r9, sl, fp, pc}
> ```
> 
> and we can see that an additional `sp = sp - 4` was added.
> 
> With the log statement in StgRun, subsequent log statements so far work.
> 
> Now I wonder
>  a) could I write this logic in llvm ir directly,
>     without having to resort to assembly?
>  b) could I somehow force llvm to emit 32 instead of 28, to make sure
>     my sp is 8-byte aligned?
> 
> Of course I'm happy to take any other ideas as well.
> 
> Cheers,
> Moritz
> 
>> On Dec 1, 2017, at 6:30 PM, Moritz Angermann <moritz.angermann at gmail.com> wrote:
>> 
>> Hi Tim,
>> thanks for the swift response!
>> 
>> @debug is defined in the same module, which makes this all the more confusing.
>> 
>> 
>> The target information from the working example is:
>> target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
>> target triple = "armv6kz--linux-gnueabihf"
>> 
>> 
>> from the ghc produced module:
>> target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
>> target triple = "arm-unknown-linux-gnueabihf"
>> 
>> However, there is one more thing I could think of: ARM does allow mixed mode,
>> I believe. As the code from the ghc produced module is called
>> from outside of the module, could the endianness be set there prior to
>> entering the function?
>> 
>> The working module contains the main directly and is not called from a main
>> function in a different module.
>> 
>> I've also tried to define a regular c function with the same code and called
>> that from within the ghccc function with the same (incorrect) results.
>> 
>> Any further ideas I could explore?
>> 
>> 
>> Cheers,
>> Moritz
>> 
>>> On Dec 1, 2017, at 4:26 PM, Tim Northover <t.p.northover at gmail.com> wrote:
>>> 
>>> Hi Moritz,
>>>> If someone could offer some hint, where to look further for debugging this, I'd very much appreciate the advice!
>>>> I'm a bit lost right now how to figure out why I end up getting swapped words.
>>> 
>>> If one file was compiled for big-endian ARM and the other for
>>> little-endian that could be the result. I'm not aware of any other
>>> possible cause and from local tests I don't think the "ghccc" alone
>>> explains the difference.
>>> 
>>> So maybe some glitch in how GHC was configured on your system? What's
>>> the triple at the top of the GHC module and the module containing the
>>> definition of @debug?
>>> 
>>> Cheers.
>>> 
>>> Tim.
>> 
> 


