[Lldb-commits] Instruction emulation of arm64 'stp d8, d9, [sp, #-0x70]!' style instruction

Tamas Berghammer via lldb-commits lldb-commits at lists.llvm.org
Thu Oct 13 08:04:42 PDT 2016


In case of Linux and Android we are using the qRegisterInfo packet and
lldb-server fills it in based on the register definitions inside LLDB so
for those targets it would be important to have all of the alias registers
available.

I don't have an AArch64-BE target at hand but I am pretty sure you are
right about the different register offsets. To handle both big and little
endian I think we will either have to create 2 separate copy of the
RegisterInfos or make it endian aware to correctly represent the offsets
because of the different offsets. I am pretty sure we currently handle it
incorrectly in case of ARM (where we have the alias registers) as we don't
have separate register contexts so if we find a way to fix the AArch64 case
then we should fix ARM as well. My preferred solution would be try to clean
up the large amount of code duplication from the RegisterInfo files and
then create correct data including all alias registers and endian aware
offsets using runtime loops instead of the current static arrays.

What do you think?

Tamas

On Thu, Oct 13, 2016 at 2:53 AM Jason Molenda <jmolenda at apple.com> wrote:

But if I attach to an arm64 core file, where lldb is using it's own
register definitions, then lldb has no idea what s0 is.

My concern about defining these subset registers in RegisterInfos_arm64.h
is that the offsets are in a target-endian register context buffer.  My
example below was little-endian so the start of v0 was 268 bytes into the
register file and the offset of s0 is also 268 bytes.  But in a BE target,
I think s0 would be at offset 280 in the buffer.

> On Oct 12, 2016, at 6:38 PM, Jason Molenda <jmolenda at apple.com> wrote:
>
> Yeah, it's incorrectly grabbing the stack pointer reg number (31) from
the Rn bits and using that as the register # being saved, instead of the Rt
and Rt2 register numbers, and saying that v31 is being pushed twice.  It's
an easy bug in EmulateInstructionARM64::EmulateLDPSTP but fixing that just
presents us with the next problem so I haven't fixed it.
>
> When I connect to debugserver on an arm64 device, it tells lldb about the
pseudo regs on the device, like
>
>  <reg name="x0" regnum="0" offset="0" bitsize="64" group="general"
altname="arg1" group_id="1" ehframe_regnum="0" dwarf_regnum="0"
generic="arg1" invalidate_regnums="0,34"/>
>
>  <reg name="w0" regnum="34" offset="0" bitsize="32" group="general"
group_id="1" value_regnums="0" invalidate_regnums="0,34"/>
>
>  <reg name="v0" regnum="63" offset="268" bitsize="128" group="vector"
type="float" altname="q0" encoding="vector" format="vector-uint8"
group_id="2" dwarf_regnum="64" invalidate_regnums="63,129,97"/>
>
>  <reg name="s0" regnum="97" offset="268" bitsize="32" group="float"
type="float" encoding="ieee754" format="float" group_id="2"
value_regnums="63" invalidate_regnums="63,129,97"/>
>
>  <reg name="d0" regnum="129" offset="268" bitsize="64" group="float"
type="float" encoding="ieee754" format="float" group_id="2"
value_regnums="63" invalidate_regnums="63,129,97"/>
>
>
> (this is the reply to the "qXfer:features:read:target.xml" packet that
lldb sends to debugserver at the startup of the debug session) so the user
can refer to these by name:
>
> (lldb) reg read -f x s0
>      s0 = 0x404e809d
> (lldb) reg read -f x d0
>      d0 = 0x41bdaf19404e809d
> (lldb) reg read -f x v0
>      v0 = 0x000000000000000041bdaf19404e809d
> (lldb)
>
>
> J
>
>> On Oct 12, 2016, at 6:23 PM, Tamas Berghammer <tberghammer at google.com>
wrote:
>>
>> I am a bit surprised that we don't define the smaller floating point
registers on AArch64 the same way we do it on x86/x86_64/Arm (have no idea
about MIPS) and I would consider this as a bug what we should fix (will
gave it a try when I have a few free cycles) because currently it is very
difficult to get a 64bit double value out of the data we are displaying.
>>
>> Considering the fact that based on the ABI only the bottom 64bit have to
be saved I start to like the idea of teaching our unwinder about the
smaller floating point registers because that will mean that we can inspect
float and double variables from the parent stack frame while we never show
some corrupted data for variables occupying the full 128 bit vector
register.
>>
>> "...the unwind plan generated currently records this as v31 being
saved...": I don't understand this part. The code never references v31 nor
d31 so if the unwind plan records it as we saved v31 then something is
clearly wrong there. Does it treat all d0-d31 register as v31 because of
some instruction decoding reason?
>>
>> Tamas
>>
>> On Wed, Oct 12, 2016 at 4:30 PM Jason Molenda <jmolenda at apple.com> wrote:
>> Hi Tamas, sorry for that last email being a mess, I was doing something
else while writing it and only saw how unclear it was after I sent it.  You
understood everything I was trying to say.
>>
>> I looked at the AAPCS64 again.  It says v8-v15 are callee saved, but
says, "Additionally, only the bottom 64-bits of each value stored in v8-v15
need to be preserved[1]; it is the responsibility of the caller to preserve
larger values." so we're never going to have a full 128-bit value that we
can get off the stack (for targets following this ABI).
>>
>> DWARF can get away with only defining register numbers for v0-v31
because their use in DWARF always includes a byte size.  Having lldb
register numbers and using that numbering scheme instead of DWARF is an
interesting one.  It looks like all of the arm64 targets are using the
definitions from Plugins/Process/Utility/RegisterInfos_arm64.h.  It also
only defines v0-v31 but the x86_64 RegisterInfos file defines pseudo
registers ("eax") which are a subset of real registers ("rax") - maybe we
could do something like that.
>>
>> I'm looking at this right now, but fwiw I wrote a small C program that
uses enough fp registers that the compiler will spill all of the preserved
registers (d8-d15) when I compile it with clang and -Os.  I'll use this for
testing
>>
>>
>>
>> I get a prologue like
>>
>> a.out`foo:
>> a.out[0x100007a08] <+0>:   0x6dba3bef   stp    d15, d14, [sp, #-96]!
>> a.out[0x100007a0c] <+4>:   0x6d0133ed   stp    d13, d12, [sp, #16]
>> a.out[0x100007a10] <+8>:   0x6d022beb   stp    d11, d10, [sp, #32]
>> a.out[0x100007a14] <+12>:  0x6d0323e9   stp    d9, d8, [sp, #48]
>> a.out[0x100007a18] <+16>:  0xa9046ffc   stp    x28, x27, [sp, #64]
>> a.out[0x100007a1c] <+20>:  0xa9057bfd   stp    x29, x30, [sp, #80]
>> a.out[0x100007a20] <+24>:  0x910143fd   add    x29, sp, #80
; =80
>>
>>
>> and the unwind plan generated currently records this as v31 being saved
in the first instruction and ignores the others (because that reg has
already been saved).  That's the easy bug - but it's a good test case that
I'll strip down into a unit test once we get this figured out.
>>
>>
>> J
>>
>>
>>> On Oct 12, 2016, at 2:15 PM, Tamas Berghammer <tberghammer at google.com>
wrote:
>>>
>>> Hi Jason,
>>>
>>> Thank you for adding unit test for this code. I think the current
implementation doesn't fail terribly on 16 vs 32 byte stack alignment
because we use the "opc" from the instruction to calculate the write back
address (to adjust the SP) so having the wrong size of the register won't
effect that part.
>>>
>>> Regarding the issue about s0/d0/v0 I don't have a perfect solution but
here are some ideas:
>>> * Use a different register numbering schema then DWARF (e.g.
LLDB/GDB/???). I think it would only require us to change
EmulateInstructionARM64::CreateFunctionEntryUnwind to create an UnwindPlan
with a different register numbering schema and then to lookup the register
based on a different numbering schema using GetRegisterInfo in functions
where we want to reference s0-s31/d0-d31. In theory this should be a simple
change but I won't be surprised if changing the register numbering breaks
something.
>>> * Introduce the concept of register pieces. This concept already exists
in the DWARF expressions where you can say that a given part of the
register is saved at a given location. I expect that doing it won't be
trivial but it would solve both this problem and would improve the way we
understand DWARF as well.
>>>
>>> About your idea about saying that we saved v8&v9 even though we only
saved d8&d9: I think it is a good quick and dirty solution but it has a few
issues what is hard to solve. Most importantly we will lie to the user when
they read out v8&v9 what will contain some garbage data. Regarding
big/little endian we should be able to detect which one the inferior uses
and we can do different thing based on that (decide the location of v8&v9)
but it will make the hack even worth.
>>>
>>> Tamas
>>>
>>> On Tue, Oct 11, 2016 at 6:15 PM Jason Molenda <jmolenda at apple.com>
wrote:
>>> Hi Tamas, I'm writing some unit tests for the unwind source generators
- x86 last week, arm64 this week, and I noticed with this prologue:
>>>
>>>
>>>
>>> JavaScriptCore`JSC::B3::reduceDoubleToFloat:
>>>
>>>    0x192b45c0c <+0>:  0x6db923e9   stp    d9, d8, [sp, #-0x70]!
>>>
>>>    0x192b45c10 <+4>:  0xa9016ffc   stp    x28, x27, [sp, #0x10]
>>>
>>>    0x192b45c14 <+8>:  0xa90267fa   stp    x26, x25, [sp, #0x20]
>>>
>>>    0x192b45c18 <+12>: 0xa9035ff8   stp    x24, x23, [sp, #0x30]
>>>
>>>    0x192b45c1c <+16>: 0xa90457f6   stp    x22, x21, [sp, #0x40]
>>>
>>>    0x192b45c20 <+20>: 0xa9054ff4   stp    x20, x19, [sp, #0x50]
>>>
>>>    0x192b45c24 <+24>: 0xa9067bfd   stp    x29, x30, [sp, #0x60]
>>>
>>>    0x192b45c28 <+28>: 0x910183fd   add    x29, sp, #0x60            ;
=0x60
>>>
>>>    0x192b45c2c <+32>: 0xd10a83ff   sub    sp, sp, #0x2a0            ;
=0x2a0
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> EmulateInstructionARM64::EmulateLDPSTP interprets this as a save of
v31.  The use of reg 31 is an easy bug, the arm manual C7.2.284 ("STP
(SIMD&FP)") gives us an "opc" (0b00 == 32-bit registers, 0b01 == 64-bit
registers, 0b10 == 128-bit registers), an immediate value, and three
registers (Rt2, Rn, Rt).  In the above example, these work out to Rt2 == 8
(d8), Rn == 31 ("sp"), Rt == 9 (d9).  The unwinder is incorrectly saying
v31 right now because it's using Rn -
>>>
>>>
>>>
>>>  if (vector) {
>>>
>>>    if (!GetRegisterInfo(eRegisterKindDWARF, arm64_dwarf::v0 + n,
reg_info_Rt))
>>>
>>>      return false;
>>>
>>>    if (!GetRegisterInfo(eRegisterKindDWARF, arm64_dwarf::v0 + n,
reg_info_Rt2))
>>>
>>>      return false;
>>>
>>>  }
>>>
>>>
>>>
>>> This would normally take up 32 bytes of stack space and cause big
problems, but because we're writing the same reg twice, I think we luck out
and only take 16 bytes of the stack.
>>>
>>>
>>>
>>> We don't have dwarf register numbers for s0..31, d0..31, so we can't
track this instruction's behavior 100% correctly but maybe if we said that
>>>
>>>
>>>
>>> That would be an easy fix, like
>>>
>>>
>>>
>>>  if (vector) {
>>>
>>>    if (!GetRegisterInfo(eRegisterKindDWARF, arm64_dwarf::v0 + t,
reg_info_Rt))
>>>
>>>      return false;
>>>
>>>    if (!GetRegisterInfo(eRegisterKindDWARF, arm64_dwarf::v0 + t2,
reg_info_Rt2))
>>>
>>>      return false;
>>>
>>>  }
>>>
>>>
>>>
>>>
>>>
>>> We don't have dwarf register numbers for s0..31, d0..31, so I don't
think we can correctly track this instruction's actions today.  Maybe we
should put a save of v8 at CFA-112 and a save of v9 at CFA-104.  As long as
the target is operating in little endian mode, when we go to get the
contents of v8/v9 we're only actually USING the lower 64 bits so it'll work
out, right?  I think I have that right.  We'll be reading garbage in the
upper 64 bits - the register reading code won't have any knowledge of the
fact that we only have the lower 32/64 bits available to us.
>>>
>>>
>>>
>>>
>>>
>>> Throwing the problem out there, would like to hear what you think.  I
don't want to encode buggy behavior in a unit test ;) so I'd like it for us
to think about what correct behavior would be, and do that before I write
the test.
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-commits/attachments/20161013/9b2d277c/attachment-0001.html>


More information about the lldb-commits mailing list