[LLVMdev] Question about callee saved registers in x86

Pasi Parviainen pasi.parviainen at iki.fi
Sat May 31 05:11:59 PDT 2014


On 31.5.2014 2:29, Sanjoy Das wrote:
> Hi Pasi,
>
> Do you have a broken test case lying around?  If you do, I'll start work on a fix for this using that as the test case.
>
> Thanks,
> -- Sanjoy

Yes I do, and here it is. As for the current state of affairs, the case 
'foo' works as a good control: only the output order of the cfi_offset 
directives would change with the proposed changes. For the case 'bar', 
no cfi_offset directives are generated at all at the moment, and that 
should be fixed too. But with the case 'foobar' everything breaks down 
and you currently get nonsensical output like:

foobar:                                 # @foobar
         .cfi_startproc
# BB#0:
         pushq   %rbp
.Ltmp8:
         .cfi_def_cfa_offset 16
         pushq   %r9
.Ltmp9:
         .cfi_def_cfa_offset 24
         pushq   %rbx
.Ltmp10:
         .cfi_def_cfa_offset 32
.Ltmp11:
         .cfi_offset %rbx, -80
.Ltmp12:
         .cfi_offset %r9, -72
.Ltmp13:
         .cfi_offset %rbp, -64
.Ltmp14:
         .cfi_offset %xmm0, -48
.Ltmp15:
         .cfi_offset %xmm7, -32
.Ltmp16:
         .cfi_offset %xmm15, -16
         movaps  %xmm15, -48(%rsp)       # 16-byte Spill
         movaps  %xmm7, -32(%rsp)        # 16-byte Spill
         movaps  %xmm0, -16(%rsp)        # 16-byte Spill

As can be seen here, the offsets are in reverse order: %rbp is pushed 
first, yet it is reported at -64 instead of -16. With the proposed 
change, the output would be like this:

; CHECK: .cfi_offset %rbp, -16
; CHECK: .cfi_offset %r9, -24
; CHECK: .cfi_offset %rbx, -32
; CHECK: .cfi_offset %xmm15, -48
; CHECK: .cfi_offset %xmm7, -64
; CHECK: .cfi_offset %xmm0, -80
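
For reference, the expected offsets above can be derived by hand. The 
following is only a rough sketch in Python, not anything taken from LLVM: 
the function name and parameters are made up for illustration, and it 
assumes that the return address sits at CFA-8, that each pushq takes 8 
bytes, and that each xmm spill slot is 16 bytes and 16-byte aligned 
relative to the CFA.

```python
def expected_cfi_offsets(pushed_gprs, spilled_xmms, slot_size=8, xmm_size=16):
    """Return (register, CFA-relative offset) pairs in save order.

    Hypothetical helper for illustration only; not part of LLVM.
    """
    depth = slot_size  # assumption: the return address occupies the slot at CFA-8
    result = []
    for reg in pushed_gprs:
        depth += slot_size  # each pushq moves %rsp down by 8 bytes
        result.append((reg, -depth))
    for reg in spilled_xmms:
        depth += xmm_size
        # assumption: xmm slots are rounded up to a 16-byte boundary
        depth = -(-depth // xmm_size) * xmm_size
        result.append((reg, -depth))
    return result


# The 'foobar' case: three pushes followed by three xmm spills.
for reg, off in expected_cfi_offsets(["rbp", "r9", "rbx"],
                                     ["xmm15", "xmm7", "xmm0"]):
    print(f".cfi_offset %{reg}, {off}")
```

Under the same assumptions, passing an empty GPR list gives the -32, -48, 
-64 offsets that the 'bar' case checks for.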

As the CHECK lines show, the physical locations of the xmm registers 
would change with the proposed change (reversing the gpr and ymm/xmm 
orders within X86CallingConv.td). That means some tweaks to existing 
test cases. But what matters here is the correctness of the directives, 
and one less dirty hack (as the comment states) in 
X86FrameLowering::emitCalleeSavedFrameMoves.
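
To illustrate why the fix-up breaks, here is a rough Python model of the 
"Offset = MaxOffset - Offset + saveAreaOffset" line quoted below. It is a 
sketch under assumptions, not the LLVM code: the frame offsets are not read 
from MFI but inferred by working backward from the output above, 
saveAreaOffset is taken to be -16 because that value is consistent with 
that output, and the function and variable names are made up.

```python
def fixed_up_offsets(frame_offsets, save_area_offset=-16):
    """Model of 'Offset = MaxOffset - Offset + saveAreaOffset'.

    frame_offsets maps a register name to its (assumed) frame object offset.
    MaxOffset is the most negative offset of any CSR, as described in the thread.
    """
    max_offset = min([0, *frame_offsets.values()])
    return {reg: max_offset - off + save_area_offset
            for reg, off in frame_offsets.items()}


# Assumed frame offsets, inferred from the output in this thread.
gprs = {"rbx": -16, "r9": -24, "rbp": -32}
mixed = {**gprs, "xmm0": -48, "xmm7": -64, "xmm15": -80}

print(fixed_up_offsets(gprs))   # the 'foo' case
print(fixed_up_offsets(mixed))  # the 'foobar' case
```

With only equally sized GPR slots, mirroring the offsets around the save 
area lands each register on its real push slot. Once 16-byte xmm slots are 
mixed in, the same mirroring moves the GPRs to -64..-80 and the xmm 
registers to -16..-48, which is the nonsensical 'foobar' output.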


Pasi

> On May 30, 2014, at 4:04 PM, Pasi Parviainen <pasi.parviainen at iki.fi> wrote:
>
>> On 28.5.2014 2:57, Sanjoy Das wrote:
>>> Hi llvmdev,
>>>
>>> I'm trying to figure how llvm remembers stack slots allotted to callee
>>> saved registers on x86.  In particular, llvm pushes registers in
>>> decreasing order of FrameIdxs [1], so the offsets they get (as
>>> returned by MFI->getObjectOffset) don't directly correspond to their
>>> actual stack locations.  In X86FrameLowering's
>>> emitCalleeSavedFrameMoves, when emitting DWARF information, this
>>> discrepancy gets fixed up by subtracting the offset reported by
>>> MFI->getObjectOffset from the minimum offset for any CSR (this is done
>>> by the "Offset = MaxOffset - Offset + saveAreaOffset;" line).  Is
>>> there a reason why llvm doesn't keep around the offsets in the right
>>> order from very beginning, by pushing the CSRs in increasing order of
>>> FrameIdxs?
>>
>> Now that you mention it, I remember going down the same rabbit hole. With certain calling conventions (coldcc, I think, which can for sure expose this for x86), it is possible to generate invalid CFI directives for the registers in a frame, especially when XMM registers must be preserved along with general purpose registers. The reason for this was the offset fixing logic within emitCalleeSavedFrameMoves, which breaks when fixing offsets for XMM registers.
>>
>> I concluded that this disparity could be fixed by reversing the definition order of the general purpose registers within X86CallingConv.td for all calling conventions, since llvm prefers to use the push/pop model for storing GPRs (for x86). With this change, stack slots and registers would have a 1:1 mapping without extra offset calculations, and emitCalleeSavedFrameMoves could be simplified by removing the extra magic that determines slots, while also generating correct CFI directives in unusual cases.
>>
>>> [1]: in fact, the way X86FrameLowering's spillCalleeSavedRegisters and
>>> PEI's calculateCalleeSavedRegisters are set up, I don't see a reason
>>> why the FrameIdxs and the generated push instructions have any
>>> relation at all.  It seems that the code relies on
>>> MFI->CreateStackObject returning sequential integers.
>>>
>>> Thanks!
>>> -- Sanjoy
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

-------------- next part --------------
; RUN: llc < %s | FileCheck %s

target triple = "x86_64-linux-gnu"

; CHECK: foo:
; CHECK: .cfi_offset %rbp, -16
; CHECK: .cfi_offset %r9, -24
; CHECK: .cfi_offset %rbx, -32
define coldcc void @foo() {
  call void asm sideeffect "", "~{rbp},~{r9},~{rbx}"()
  ret void
}

; CHECK: bar:
; CHECK: .cfi_offset %xmm15, -32
; CHECK: .cfi_offset %xmm7, -48
; CHECK: .cfi_offset %xmm0, -64
define coldcc void @bar() {
  call void asm sideeffect "", "~{xmm15},~{xmm7},~{xmm0}"()
  ret void
}

; CHECK: foobar:
; CHECK: .cfi_offset %rbp, -16
; CHECK: .cfi_offset %r9, -24
; CHECK: .cfi_offset %rbx, -32
; CHECK: .cfi_offset %xmm15, -48
; CHECK: .cfi_offset %xmm7, -64
; CHECK: .cfi_offset %xmm0, -80
define coldcc void @foobar() {
  call void asm sideeffect "", "~{xmm15},~{xmm7},~{xmm0},~{rbp},~{r9},~{rbx}"()
  ret void
}

