[Lldb-commits] [PATCH] D77047: AArch64 SVE register infos and core file support

Tue Jul 14 21:53:10 PDT 2020

omjavaid marked an inline comment as done.
omjavaid added inline comments.

================
Comment at: lldb/source/Plugins/Process/elf-core/RegisterContextPOSIXCore_arm64.cpp:82-106
+uint32_t
+RegisterContextCorePOSIX_arm64::CalculateSVEOffset(uint32_t reg_num) const {
+  uint32_t sve_offset = 0;
+  if (m_sve_state == SVE_STATE::SVE_STATE_FPSIMD) {
+    if (IsSVEZ(reg_num))
+      sve_offset = (reg_num - GetRegNumSVEZ0()) * 16;
+    else if (reg_num == GetRegNumFPSR())
----------------
labath wrote:
> omjavaid wrote:
> > labath wrote:
> > > omjavaid wrote:
> > > > labath wrote:
> > > > > omjavaid wrote:
> > > > > > labath wrote:
> > > > > > > I'm confused by this function. If I understant the SVE_PT macros and the logic in `RegisterInfoPOSIX_arm64::ConfigureVectorRegisterInfos` correctly, then they both seem to encode the same information. And it seems to me that this function should just be `reg_infos[reg_num].offset - some_constant`, which is the same thing that we're doing for the arm FP registers when SVE is disabled, and also for most other architectures too.
> > > > > > > 
> > > > > > > Why is that not the case? Am I missing something? If they are not encoding the same thing, could they be made to encode the same thing?
> > > > > > This function calculates offset of a particular register in core note data. SVE data in core dump is similar to what PTrace emits and offsets into this data is not linear. SVE macros are used to access those offsets based on register numbers and currently selected vector length.
> > > > > > Also for the purpose of ease we have linear offsets in SVE register infos and it helps us simplify register data once it makes way to GDBRemoteRegisterContext on the client side.
> > > > > Could you give an example of the non-linearity of the core dump data? (Some registers, and their respective core file and gdb-remote offsets)
> > > > In case of core file we create a buffer m_sveregset which stores SVE core note information
> > > > m_sveregset =
> > > >       getRegset(notes, m_register_info_up->GetTargetArchitecture().GetTriple(),
> > > >                 AARCH64_SVE_Desc);
> > > > 
> > > > At this point we do not know what was the vector length and at what offsets in the data our registers are located. We read top bytes of size use_sve_header and read vector length. Based on this information we configure vector length in Register infos. While the SVE payload starts with user_sve_header then there are some allignment bytes followed by vector length based Z registers followed by P and FFR, then there are some more allginment bytes followd by FPCR and FPSR. Macros provided by Linux help us jump to the desired offset by providing register number and vq into the core note or Ptrace payload.
> > > > 
> > > > In case of client side storage we store GPRs at linear offset followed by Vector Granule register. Then there are SVE registers Z, P, FFR, FPSR and FPCR. Offsets of V, D and S registers in FPR regset overlap with corresponding first bytes of Z registers and will be same as corresponding Z register. While both FP/SVE FPSR share same register offset, size and register number.
> > > > 
> > > > Here is an excerpt from https://github.com/torvalds/linux/blob/master/Documentation/arm64/sve.rst
> > > > 
> > > > SVE_PT_REGS_FPSIMD
> > > > SVE registers are not live (GETREGSET) or are to be made non-live (SETREGSET).
> > > > The payload is of type struct user_fpsimd_state, with the same meaning as for NT_PRFPREG, starting at offset SVE_PT_FPSIMD_OFFSET from the start of user_sve_header.
> > > > Extra data might be appended in the future: the size of the payload should be obtained using SVE_PT_FPSIMD_SIZE(vq, flags).
> > > > vq should be obtained using sve_vq_from_vl(vl).
> > > > 
> > > > or
> > > > 
> > > > SVE_PT_REGS_SVE
> > > > SVE registers are live (GETREGSET) or are to be made live (SETREGSET).
> > > > The payload contains the SVE register data, starting at offset SVE_PT_SVE_OFFSET from the start of user_sve_header, and with size SVE_PT_SVE_SIZE(vq, flags);
> > > Given this
> > > > SVE payload starts with ... followed by vector length based Z registers followed by P and FFR,
> > > and this
> > > > In case of client side storage we store GPRs ... Then there are SVE registers Z, P, FFR, FPSR and FPCR
> > > I would expect that for each of the Z, P and FFR registers, the expression `offset_in_core(reg) - offset_in_gdb_remote(reg)` is always the same constant (and is basically equal to `SVE_PT_SVE_ZREG_OFFSET(vq, 0) - reg_info[Z0].byte_offset`). So we could just add/subtract that constant to the gdb-remote byte_offset field instead of painstakingly decomposing the register number only for the linux macros to reconstruct it back again. Is that not so?
> > The standard never talks about Z, P and FFR being contagious that is what I learnt by reading macros. There standard states this:
> > 
> > If the registers are present, the remainder of the record has a vl-dependent size and layout. Macros SVE_SIG_* are defined [1] to facilitate access to the members.
> > Each scalable register (Zn, Pn, FFR) is stored in an endianness-invariant layout, with bits [(8 * i + 7) : (8 * i)] stored at byte offset i from the start of the register's representation in memory.
> > If the SVE context is too big to fit in sigcontext.__reserved[], then extra space is allocated on the stack, an extra_context record is written in __reserved[] referencing this space. sve_context is then written in the extra space. Refer to [1] for further details about this mechanism.
> > 
> > I understand what you are talking about but given the macros were specifically provided and above line about register record was vague and I thought best is to follow the macros for offset calculation although other way around is simpler but may be slightly unreliable.
> > 
> > I suggest to keep this as it is unless there is strong reason apart from slight performance penalty. This resembles with GDB implementation which was done by ARM and I am only following that as reference. May be we can revise this in future when the feature becomes more mainstream. 
> > 
> I'm not worried about the performance penalty (most of these abstractions can be optimized away anyway). I'm more worried about the maintainability penalty incurred by a non-standard and fairly complicated solution.
> 
> The way I see it, it doesn't really matter how exactly the specification describes these things. Having the macros is nice, but it's not their names that become a part of the ABI -- their implementation does. So they (linux, arm, whoever) cannot change the layout of the these registers without breaking all existing applications (and core files) any more than they can change the layout of the general purpose registers (which we also access by offset, without any fancy macros). In fact, even if they did change the layout, given that we have a copy of these headers, we wouldn't automatically inherit those changes anyway, but would have to take some manual action. And since we'd probably want to maintain some sort of backwards compatibility, we couldn't just replace the new macro definitions, but would have to do some sort of versioning (just today I got reminded that lldb contains versioned layouts of various internal structs used by the ObjC runtime).
> 
> So, for that reason I am more concerned about consistency with other lldb register contexts than I am with consistency with gdb. 
> 
> Also, I bet gdb doesn't have an equivalent of our RegisterInfo struct. Without that, using these macros would be obviously better, but given that we have that, and we already use it to access other registers, ditching that for macros does not seem like a win to me.
> 
> Another way to look at this would be to compare this to e.g. ELF reading code in llvm. We cannot just include <linux/elf.h> as we want this to be cross-platform. However, we also don't just copy the header blindly into our code base. We don't depart from it needlessly, but we do make sure that it plays well with the rest of llvm -- `#defines` are changed to enums, we add templates to select between 64/32 bit versions, change some platform-specific names so that different platforms may co-exist, etc.
I understand your point here and by referring to Linux, Arm or GDB implementation I only wanted to make sure that we are implementing the architecture and platform specific parts correctly. The choice of using Linux/SVE specific macros in elf-core context was not a blind decision. I am not advocating for following a particular implementations and I would not have pulled in these Linux specific macros into elf-core implementation if the SVE core note layout was independent of Linux platform specifics like sig_context and user_sve_header. Offset to the start of register data in NT_ARM_SVE is also calculated after doing alignment correction based on VQ and size of sig_context. Given that SVE is currently supported by Linux only, NT_ARM_SVE is Linux only core note and core dump generated makes use of these macro definitions I had no choice but to include them for extraction of data from NT_ARM_SVE section. If I had not followed these macros I still would have written something similar my self for register data extraction from NT_ARM_SVE elf core note.

I am going to post another update where I ll try to minimize use of these macros in offset calculation.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77047/new/

https://reviews.llvm.org/D77047