[Lldb-commits] Instruction emulation of arm64 'stp d8, d9, [sp, #-0x70]!' style instruction

Tue Oct 11 18:15:26 PDT 2016

Hi Tamas, I'm writing some unit tests for the unwind source generators (x86 last week, arm64 this week), and I noticed a problem with this prologue:

JavaScriptCore`JSC::B3::reduceDoubleToFloat:
    0x192b45c0c <+0>:  0x6db923e9   stp    d9, d8, [sp, #-0x70]!
    0x192b45c10 <+4>:  0xa9016ffc   stp    x28, x27, [sp, #0x10]
    0x192b45c14 <+8>:  0xa90267fa   stp    x26, x25, [sp, #0x20]
    0x192b45c18 <+12>: 0xa9035ff8   stp    x24, x23, [sp, #0x30]
    0x192b45c1c <+16>: 0xa90457f6   stp    x22, x21, [sp, #0x40]
    0x192b45c20 <+20>: 0xa9054ff4   stp    x20, x19, [sp, #0x50]
    0x192b45c24 <+24>: 0xa9067bfd   stp    x29, x30, [sp, #0x60]
    0x192b45c28 <+28>: 0x910183fd   add    x29, sp, #0x60            ; =0x60 
    0x192b45c2c <+32>: 0xd10a83ff   sub    sp, sp, #0x2a0            ; =0x2a0 

EmulateInstructionARM64::EmulateLDPSTP interprets this as a save of v31.  The use of reg 31 is an easy bug to explain.  The arm manual C7.2.284 ("STP (SIMD&FP)") gives us an "opc" (0b00 == 32-bit registers, 0b01 == 64-bit registers, 0b10 == 128-bit registers), an immediate value, and three registers (Rt2, Rn, Rt).  In the above example, these work out to Rt2 == 8 (d8), Rn == 31 ("sp"), Rt == 9 (d9).  The unwinder is incorrectly saying v31 right now because it's using Rn (n) to look up both registers:

  if (vector) {
    if (!GetRegisterInfo(eRegisterKindDWARF, arm64_dwarf::v0 + n, reg_info_Rt))
      return false;
    if (!GetRegisterInfo(eRegisterKindDWARF, arm64_dwarf::v0 + n, reg_info_Rt2))
      return false;
  }

This would normally take up 32 bytes of stack space and cause big problems, but because we're writing the same reg twice, I think we luck out and only take 16 bytes of the stack.
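
To double-check the field values quoted above, here is a minimal standalone sketch of the decode.  It is not LLDB code: StpFields, Bits, SignExtend7, RegBytes and DecodeStpSimd are hypothetical names made up for illustration, and the bit positions are my reading of the STP (SIMD&FP) encoding (opc in bits 31:30, imm7 in bits 21:15, Rt2 in 14:10, Rn in 9:5, Rt in 4:0).

```cpp
#include <cstdint>

// Hypothetical helper type (not part of LLDB): the fields of an
// STP (SIMD&FP) opcode that matter for unwinding.
struct StpFields {
  uint32_t opc;    // 0b00 = 32-bit regs, 0b01 = 64-bit regs, 0b10 = 128-bit regs
  int32_t offset;  // signed imm7 scaled by the register size in bytes
  uint32_t rt2;    // second register stored
  uint32_t rn;     // base register (31 == sp)
  uint32_t rt;     // first register stored
};

// Extract bits [hi:lo] of a 32-bit opcode.
constexpr uint32_t Bits(uint32_t insn, unsigned hi, unsigned lo) {
  return (insn >> lo) & ((1u << (hi - lo + 1)) - 1u);
}

// Sign-extend the 7-bit immediate field.
constexpr int32_t SignExtend7(uint32_t imm7) {
  return (imm7 & 0x40u) ? static_cast<int32_t>(imm7) - 128
                        : static_cast<int32_t>(imm7);
}

// Register size in bytes for the SIMD&FP form: 4, 8 or 16.
constexpr int32_t RegBytes(uint32_t opc) { return 4 << opc; }

// Pull the fields out of the raw opcode; the immediate is scaled by the
// register size, as the assembler syntax shows it.
constexpr StpFields DecodeStpSimd(uint32_t insn) {
  return StpFields{
      Bits(insn, 31, 30),
      SignExtend7(Bits(insn, 21, 15)) * RegBytes(Bits(insn, 31, 30)),
      Bits(insn, 14, 10), Bits(insn, 9, 5), Bits(insn, 4, 0)};
}
```

Feeding it the first opcode of the prologue, DecodeStpSimd(0x6db923e9), gives opc == 1, offset == -0x70, rt2 == 8, rn == 31 and rt == 9, which matches the disassembly.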

Using Rt and Rt2 instead would be an easy fix, like

  if (vector) {
    if (!GetRegisterInfo(eRegisterKindDWARF, arm64_dwarf::v0 + t, reg_info_Rt))
      return false;
    if (!GetRegisterInfo(eRegisterKindDWARF, arm64_dwarf::v0 + t2, reg_info_Rt2))
      return false;
  }

We don't have dwarf register numbers for s0..31, d0..31, so I don't think we can correctly track this instruction's actions today.  Maybe we should put a save of v9 at CFA-112 and a save of v8 at CFA-104 (the stp writes Rt, d9, at the new sp and Rt2, d8, 8 bytes above it).  As long as the target is operating in little endian mode, when we go to get the contents of v8/v9 we're only actually USING the lower 64 bits so it'll work out, right?  I think I have that right.  We'll be reading garbage in the upper 64 bits - the register reading code won't have any knowledge of the fact that we only have the lower 32/64 bits available to us.
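
The save-slot arithmetic can be sketched as below.  This is only an illustration, not how LLDB represents unwind rows: SavedReg, FirstSlot and SecondSlot are hypothetical names, and it assumes the stp is the first instruction of the function with CFA == sp on entry, so that after the pre-indexed write-back sp == CFA + imm.

```cpp
#include <cstdint>

// Hypothetical record (not an LLDB type): where one saved register lives
// relative to the CFA.
struct SavedReg {
  uint32_t reg;        // register number (8 for d8/v8, 9 for d9/v9)
  int32_t cfa_offset;  // address of the save slot == CFA + cfa_offset
};

// For a pre-indexed "stp Rt, Rt2, [sp, #imm]!" executed with sp == CFA
// (an assumption, see above): Rt is written at the new sp, i.e. CFA + imm.
constexpr SavedReg FirstSlot(uint32_t rt, int32_t imm) {
  return SavedReg{rt, imm};
}

// Rt2 is written immediately above Rt, one register size higher.
constexpr SavedReg SecondSlot(uint32_t rt2, int32_t imm, int32_t reg_bytes) {
  return SavedReg{rt2, imm + reg_bytes};
}
```

With Rt == 9, Rt2 == 8, imm == -0x70 and 8-byte registers from the disassembly above, FirstSlot(9, -0x70) lands at CFA-112 and SecondSlot(8, -0x70, 8) at CFA-104.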

Throwing the problem out there, would like to hear what you think.  I don't want to encode buggy behavior in a unit test ;) so I'd like it for us to think about what correct behavior would be, and do that before I write the test.