[llvm-dev] [RFC] Improving compact x86-64 compact unwind descriptors

Mon Jan 29 18:48:13 PST 2018

For what it's worth, there are two implementations of consuming linked compact unwind in lldb http://lldb.llvm.org .  In tools/compact-unwind there is a standalone reader that I wrote in 2014 to make sure I understood the format from the compact_unwind_encoding.h header when I first read it.  I should really go back and make sure that it is correct all the way through or remove it, I have a vague memory that it may not interpret something correctly in how it decodes the unwind state information.  

I can more strongly recommend the reader in source/Symbol/CompactUnwindInfo.cpp -- lldb has been using this information for unwinds for a few years now.  This is written as a part of lldb so it won't be quite as simple to understand, but it's all there.  lldb has been using this for for a few years now.  I don't read/use the LSDA or personality information yet.

> On Jan 29, 2018, at 1:36 PM, Ron Brender via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Hi Nick,
> 
> It is a pleasure to be in contact with the creator of the compact unwind approach!
> 
> I can see how an array of 32-bit unwind blocks could be used to describe each distinct point within a function (within a prolog in particular). But then you end up with six or seven or more such blocks for a large percentage of functions, don't you? Seems like a lot of additional space for something that is usually simple, compact (sic) and idiomatic and dilutes the original benefit. We are seeking a summary description in a small fixed size to supplement the base unwind block.
> 
> Since you mention it, why have the UNWIND_IS_NOT_FUNCTION_START flag? Seems like you don't need it for unwinding purposes. Especially in a function that has a code layout where the entry point is not the first/lowest addressed instruction of the function (which I have seen in some GEM produced code for Alpha). An aside, but just curious.
> 
> Not being a MAC guy the macOS two-level lookup scheme has always been rather a mystery. What little I know is gleaned from the compact_unwind_encoding.h file found around the net. But I have never seen the .c/.cxx/.cpp that goes with it to better understand how the mapping works. Always figured the was considered proprietary but maybe I just don't know where/how to look. Is there public code and/or design note information to look at?
> 
> You are correct that this paper tends toward trying to create a scheme with little or no linker involvement. There are good reasons to ask more of the linker for better space economy, so this is still under discussion. 
> 
> You are also correct that the unwinder needs to interpret the additional prolog/epilog information. But the rest of your comment has me confused: the extra 2-bytes is just a widening as you suggest, right? And there is no Section 5.1--did you mean 3.1?
> 
> Thanks again,
> Ron
> 
> On Fri, Jan 26, 2018 at 9:54 PM, Nick Kledzik via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> John and Ron,
> 
> I developed the original compact unwind implementation for macOS 10.6 back in 2009.  I tried to leave space in the design to support finer grain exception handling such as for asynchronous or for the shrink wrap optimization.   The idea I had at the time was  instead of having just one 32-bit compact unwind info per function, there could be an array of them each covering a different range. All but the first would have the UNWIND_IS_NOT_FUNCTION_START bit set.
> 
> On macOS, the linker takes the 32-byte “compact unwind group description” records and folds them down into a complete different (read only) runtime table which basically maps an offset in the DSO into a 32-bit compact info for that address.
> 
> It seems like from your proposal that your linker keeps the “compact unwind group description” records as is and just concatenates them, and the runtime would need to understand the prolog/epilog encoding.  I would think that just widening the group record (as you suggest in 5.1) would be much simpler (like SHT_REL was widened to SHT_RELA).
> 
> -Nick
> 
> 
> 
> > On Jan 26, 2018, at 7:54 AM, John Reagan via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >
> > Here is our proposal to extend/enhance the x86-64 compact unwind
> > descriptors to fully describe the prologue/epilogue for asynchronous
> > unwinding.  I believe there are missing/lacking CFI directives as well,
> > but I'll save that for another thread.
> >
> >
> > Asynchronous Compact Unwind Descriptors
> > Ron Brender, VMS Software, Inc.
> > Revised January 25, 2018
> >
> > 1  Introduction
> > This document proposes means to extend so-called compact unwind
> > descriptors to support fully
> > asynchronous exception handling. This will make extended compact unwind
> > descriptors an
> > alternative to DWARF CFI (call frame information) for achieving
> > asynchronous exception
> > support.
> >
> > Compact unwind descriptors can and have been used in both 64- and 32-bit
> > environments.
> > However, this proposal addresses only 64-bit environments. While the
> > ideas presented here can
> > be readily adapted for use in a 32-bit environment, for simplicity this
> > document makes no
> > attempt to do so.
> >
> > There are generally three kinds of information that together form the
> > heart of modern software
> > exception handling systems:
> >
> >  1. Information that is used to divide the remaining unwind information
> > into groups that are
> >     specific to particular regions of memory (often, but not
> > necessarily, associated with a
> >     single function) as well as provide a way to efficiently search for
> > and identify the
> >     grouping that is associated with a particular address in memory.
> >
> >  2. Information that can be used to virtually or actually unwind from
> > the call frame of an
> >     executing function to the call frame of its caller (at the point of
> > the call).
> >
> >  3. Identification of an associated personality routine that is invoked
> > by the general exception
> >     handling mechanization to guide how processing should proceed for a
> > given function, as
> >     well as additional “language specific data” needed for the
> > personality routine to do its
> >     job. Note that the personality routine and its related data are
> > specified as an adjunct to
> >     compiled code and totally opaque to the general mechanism (other
> > than the specified
> >     interface).
> >
> > Note in particular that C++ exception handling is built on top of a
> > personality routine and
> > language specific data area ABI that itself can be implemented using
> > either DWARE CFI or
> > extended compact unwind information as described here. The choice
> > between the two is
> > transparent to C++.
> >
> >
> > 2  Compact Unwind Overview
> > This section provides a brief overview of key features of the LLVM
> > compact unwind design. It
> > does not attempt a comprehensive re-statement of all aspects of the
> > design except to the extent
> > necessary to motivate and understand later proposed changes and
> > enhancements.
> >
> > 2.1  Compact Unwind Group Description
> > A compact unwind group consist of five fields, as follows:
> >
> > | 63                             32 31                            0 |
> > |-------------------------------------------------------------------|
> > | STARTING-ADDRESS                                                  |
> > | LENGTH                           | COMPACT-UNWIND-DESCRIPTION     |
> > | PERSONALITY-FUNCTION-POINTER                                      |
> > | LANGUAGE-SPECIFIC-DATA-ADDRESS                                    |
> > |-------------------------------------------------------------------|
> >
> > STARTING-ADDRESS (64-bits) is the lowest address of a region of memory
> > occupied by some
> > code, typically the entry point of a function.
> >
> > LENGTH (32-bits) is the number of bytes included in this group,
> > typically including all and only
> > the code of a function.
> >
> > COMPACT-UNWIND-DESCRIPTION (32-bits) is a description of the fully
> > formed frame of a
> > function and how to unwind it. This is described further following.
> >
> > PERSONALITY-FUNCTION-POINTER (64-bit) is a pointer to the personality
> > routine.
> >
> > LANGUAGE-SPECIFIC-DATA-ADDRESS (64-bits, sometimes abbreviated LSDA) is a
> > pointer to some data to be passed to the personality routine when it is
> > called.
> >
> > A key observation is that the starting address plus length way of
> > describing a group means that
> > the set of groups for a compilation unit need not describe all of the
> > code in that unit. In
> > particular, it appears to be expected that no unwind information need be
> > generated for leaf
> > functions.
> >
> > On the other hand, it is reasonable to expect that the groups that are
> > emitted are ordered by the
> > starting address. This means that a simple and fast binary search can be
> > used to map an address
> > to the group that applies to that address, if any.
> >
> > It is useful to note that the run-time representation of unwind
> > information can vary from little
> > more than a simple concatenation of the compile-time information to a
> > substantial rewriting of
> > unwind information by the linker. The proposal favors simple
> > concatenation while maintaining
> > the same ordering of groups as their associated code.
> >
> > 2.2  Compact Unwind Frame Description
> > A compact unwind frame description describes a frame in sufficient
> > detail to be able to unwind
> > that frame to the frame of its caller.
> >
> > | 31    28 27    24 23                                            0 |
> > |-------------------------------------------------------------------|
> > | FLAGS   | MODE   |                                                |
> > |-------------------------------------------------------------------|
> >
> > At the top most level, there are four bits that are not of further
> > interest here. Interpretation of
> > these bits is neither used nor changed.
> >
> > Also at the top-level is a 4-bit mode field. This is the tag of a
> > discriminated (tagged, variant)
> > union that selects the interpretation of the remaining 24 bits.
> >
> > Of the 16 possible modes, only 5 are defined :
> >
> > Code    Meaning      Description
> > 0       Old          “Old” is presumed to refer to some historical
> > design that is no longer of interest.
> >                      It is treated here as Reserved.
> > 1       RBP-based    The frame uses the RBP register as a frame pointer.
> > The size of the frame can
> >         frame        vary during execution.
> > 2       RSP-based    The frame uses RSP as the frame pointer. The size
> > of the frame is fixed (at
> >         frame        compilation time).
> > 3       Large RSP-   The frame uses RSP as the frame pointer, The size
> > of the frame is fixed (at
> >         based frame  compilation time); however, that size is too large
> > to express within this 32-bit
> >                      descriptor encoding.
> > 4       DWARF        The frame, for whatever reason, cannot be
> > adequately described using the
> >         escape       compact unwind frame description. The remaining
> > 24-bits are an index into
> >                      what the DWARF standard calls the .debug_frame
> > section (__eh_frame in
> >                      LLVM).
> >
> > 2.2.1  RBP-based Frame (MODE=1)
> > For a RBP-based frame, the remaining 24 bits are encoded as follows:
> >
> > | 23        16 | 15 | 14                                          0 |
> > |-------------------------------------------------------------------|
> > | OFFSET       | 0  | REGS                                          |
> > |-------------------------------------------------------------------|
> >
> > In a RBP-based frame the RBP register is pushed on the stack immediately
> > after the return
> > address, then RSP is moved to RBP. To unwind, RSP is restored with the
> > current RPB value,
> > then RBP is restored by popping off the stack, and the return is done by
> > popping the stack once
> > more into the instruction pointer.
> >
> > All preserved registers are saved in a small range in the stack that
> > starts at RBP-8 to RBP-2040.
> > The offset/8 relative to RBP is encoded in the 8-bit OFFSET field. The
> > registers saved are
> > encoded in the 15-bit REGS field as five 3-bit entries.
> >
> > 2.2.2  RSP-Based Frame (MODE=2)
> > For a RSP-based frame, the remaining 24 bits are encoded as follows:
> >
> > |23        16 | 15      13 | 12      10 | 9                       0 |
> > |-------------------------------------------------------------------|
> > | SIZE        |            | CNT        | REG_PERM                  |
> > |-------------------------------------------------------------------|
> >
> > In a RSP-based frame the stack pointer serves directly as the frame
> > pointer and RBP is available
> > for use as a general register. Upon entry, the stack pointer is
> > decremented by 8*SIZE bytes (the
> > maximum stack allocation is thus 2040 bytes). To unwind, the stack size
> > is added to the stack
> > pointer, and completed by popping the stack once more into the
> > instruction pointer.
> >
> > All preserved registers are saved on the stack immediately after the
> > return address.  The number
> > of registers saved (up to 6) is encoded in the 3-bit CNT field. The
> > 11-bit REG_PERM field
> > encodes which registers were saved and in what order.
> >
> > 2.2.3  Large RSP-Based Frame (MODE=3)
> > For a large RSP-based frame, the remaining 24 bits are encoded as follows:
> >
> > |23        16 | 15      13 | 12      10 | 9                       0 |
> > |-------------------------------------------------------------------|
> > | SIZE        | ADJ        | CNT        | REG_PERM                  |
> > |-------------------------------------------------------------------|
> >
> > This case is like the previous, except the stack size is too large to
> > encode in the compact unwind
> > encoding.  Instead, the function must include a "subq $nnnnnnnn, RSP"
> > instruction in its
> > prologue to allocate the stack.  The offset from the entry point of the
> > function to the nnnnnnnn
> > value in the function is given in the SIZE field.
> >
> > Depending on the exact instructions used to save registers (PUSH versus
> > MOV), the nnnnnnnn
> > value in the instruction stream may not be quite the full stack size.
> > ADJ * 8 is the additional
> > adjustment needed to get the actual size.
> >
> > 2.2.4  DWARF Escape (MODE=4)
> > The frame, for whatever reason, cannot be adequately described using a
> > compact unwind frame
> > description. The remaining 24-bits are an index into what the DWARF
> > standard calls the
> > .debug_frame section (called __eh_frame in LLVM).
> >
> >
> > 3  Asynchronous Changes and Enhancements
> > It is immediately obvious that omission of unwind information for leaf
> > functions (with any kind
> > of frame) precludes handling an exception that might occur during its
> > execution. It follows that
> > unwind information must cover all of the code of a module (with one
> > exception discussed
> > below). But if successive unwind groups are ordered (as previously
> > assumed) and also leave no
> > gaps, then the LENGTH field is redundant and can be omitted. The
> > beginning address of a
> > following group is always one byte past the end of the predecessor
> > group. There remains only the
> > question of how to identify the last group of a set.
> >
> > It should also be clear that the unwind representation described in the
> > prior section is not
> > sufficient to unwind from an asynchronous exception that might occur in
> > either the prologue or
> > epilogue of a function. To see this consider what would happen if an
> > exception occurred at either
> > the entry point or the return instruction of either a RBP- or RSP-frame
> > function. To be able to
> > handle asynchronous exceptions at any point during function execution,
> > it is necessary to add
> > additional information to each unwind group.
> >
> > These two considerations can be combined. The result is simply to
> > repurpose the LENGTH field
> > to encode prologue and epilogue information.
> >
> > 3.1  Extended MODEs
> > To preserve backward compatibility and to allow intermixing of
> > traditional and extended
> > compact unwind groups, new MODEs are defined as follows:
> >
> > Code        Meaning                   Description
> > ----------------------------------------------------------------------------------------------
> > 8           Null frame                See below.
> > 9           Extended RBP-based frame  Like mode 1 but combined with
> > extended prologue/epilogue
> >                                       information in place of a LENGTH field
> > 10          Extended RSP-based frame  Like mode 2 but combined with
> > extended prologue/epilogue
> >                                       information in place of a LENGTH field
> > 11          Extended Large RSP-based  Like mode 3 but combined with
> > extended prologue/epilogue
> >             frame                     information in place of a LENGTH field
> > 12          Extended parameter(s)     See below.
> >
> > A null frame (MODE = 8) is the simplest possible frame, with no
> > allocated stack of either kind
> > (hence no saved registers).  A null frame might be considered just a
> > special case of a RSP-based
> > frame with a stack size of zero. But unlike other frames, its frame
> > pointer is usually not 16-byte
> > aligned.
> >
> > It appears technically feasible for a null frame function to have a
> > personality routine. However,
> > the utility of such a capability seems too meager to justify allowing
> > this. We propose to not
> > support this.
> >
> > There remains the question of whether it is necessary or at least
> > desirable to strictly apply the
> > requirement that “all” code be covered by an unwind group. Based on
> > successful experience in
> > OpenVMS on the Itanium architecture (as well as Windows systems we
> > think), this alternative is
> > proposed:
> >
> >     If the first attempt to lookup an unwind group for an exception
> > address fails, then it is
> >     (tentatively) assumed to have occurred within a null frame function
> > or in a part of a function
> >     that is adequately described by a null frame. The presumed return
> > address is (virtually or
> >     actually) popped from the top of stack and looked up. This second
> > attempted lookup must
> >     succeed, in which case processing continues normally. A failure is a
> > fatal error.
> >
> > This concept of a “default null frame group” is especially convenient
> > for dealing with
> > disconnected code sequences such as trampolines, PLTs, and the like.
> >
> > The extended RBP-based and RSP-based frame modes (MODEs = 9 and 10) are
> > simple
> > supersets of their more traditional predecessors.
> >
> > The large RSP-based frame mode (MODE = 11) is also a superset of mode 3
> > except that instead
> > of finding the stack size in the instruction stream, it is found in a
> > preceding extended parameter
> > unwind group (MODE = 12). This difference is essential for support of
> > execute-only code.
> >
> > To understand the extended parameter group, suppose that two groups
> > occur one after the other
> > but have the same STARTING-ADDRESS value. A binary search using the
> > STARTING-
> > ADDRESS field will ignore the first of the pair because the address
> > being sought can never be at
> > least as large as the first but less than the second. Such a group can
> > be used as an escape
> > convention that allows inserting additional information into the
> > sequence of groups that
> > otherwise cannot be easily included.
> >
> > The extended modes are described more fully in the following sections.
> >
> > 3.2  Function Parts
> > Functions are generally considered to consist of three distinct parts:
> >
> > 1. A prologue, which allocates local storage, saves registers that must
> > be preserved for the
> >    benefit of callers, and possibly other housekeeping (such as
> > initializing any local state
> >    necessary for the correct management of the function’s execution).
> >
> > 2. A body, which performs the main work of the function.
> >
> > 3. An epilogue, which makes the result available to the caller, and
> > undoes the effects of the
> >    prologue (restores preserved registers, deallocates local storage,
> > and so on).
> >
> > Here a word of caution is in order. This vocabulary tends to be applied
> > at multiple levels in
> > software architecture and each level has its own set of issues to
> > consider and resulting
> > requirements to be observed.
> >
> > Note that a null frame function has no distinct prologue, body or
> > epilogue. Every instruction can
> > be viewed as simultaneously in all three parts, or in none of them.
> >
> > This leads to the following proposal for the upper part of the extended
> > compact unwind
> > quadword for use in combination with MODE = 8 in the lower part.
> >
> > | 63                            48 | 47                          32 |
> > |-------------------------------------------------------------------|
> > | RESERVED                         | 0            ...             0 |
> > |-------------------------------------------------------------------|
> >
> >
> > | 31                            16 | 15                           0 |
> > |-------------------------------------------------------------------|
> > | 0 0 0 0 | 1 0 0 0 | 0   ...    0 | 0            ...             0 |
> > |-------------------------------------------------------------------|
> >
> >
> >
> > 3.2.1  Function Prologue
> > For the purposes of exception handling, the key steps are:
> >
> >  1. Allocate space on the stack in which to save preserved registers and
> > for other purposes
> >  2. Save the registers that must be preserved.
> >
> > The DWARF call frame information (CFI) provides a description that is
> > precise and accurate at
> > each and every instruction in a function (not limited to just the
> > prologue). But experience shows
> > that this representation is bulky in space and expensive in time to
> > decode and interpret. Indeed,
> > compact unwind descriptors as described in Section 2 were created to
> > avoid these issues.
> >
> > Experience with OpenVMS on the Alpha and Itanium architectures shows
> > that a key constraint
> > can greatly simplify the unwind description: require that no preserved
> > register can be changed
> > until all of them have been saved. The last such save ends the prologue
> > and the next instruction
> > begins the body. An unwind does not need (must not) restore any
> > preserved registers because
> > they are still valid. A simple length from the entry point suffices to
> > describe this boundary.
> > [The body does not necessarily need to begin until at least one of the
> > preserved registers is
> > actually changed.]
> >
> > 3.2.1.1  RSP-Based Frame (MODE = 10)
> > For a RSP-based frame, it is also necessary to know which instruction
> > adjusts the stack pointer:
> > before this instruction an unwind must only (virtually or actually) pop
> > the return address; and
> > after this instruction, the stack pointer must first be incremented to
> > deallocate the stack frame,
> > then the instruction pointer restored.
> >
> > This leads to the following proposal for the upper part of the extended
> > compact unwind
> > quadword for use in combination with MODE = 10 in the lower part.
> >
> > | 63                       48 47 46             40 39            32 |
> > |-------------------------------------------------------------------|
> > | RESERVED                   |  |  PROLOGUE-SIZE2 |  PROLOGUE-SIZE1 |
> > |-------------------------------------------------------------------|
> >
> >
> > Except for the MODE, the lower part is exactly like it would be for MODE
> > = 2.
> >
> > PROLOGUE-SIZE1 is the length in bytes of the first part of the prologue
> > relative to the
> > STARTING-ADDRESS, up to and including the instruction that allocates the
> > stack.
> >
> > PROLOGUE-SIZE2 is the length in bytes of the second part of the prologue
> > relative to the end
> > of the first part, up to a point after the last preserved register is
> > saved and before the first
> > preserved register is changed. (Recall, this point may not be unique.)
> >
> > PROLOGUE-SIZE1 + PROLOGUE-SIZE2 gives the total size of the prologue.
> >
> > The maximum prologue size allowed here is much greater than will be
> > typical. This is deliberate
> > to allow compilers freedom to use shrink-wrap type optimizations in
> > which safe operations that
> > need no body state are moved from the body to the beginning of the
> > prologue. This avoids the
> > cost of stack frame setup and teardown for simple special case handling
> > that often leads to an
> > early exit.
> >
> > 3.2.1.2  Large RSP-based Frame (MODE = 11)
> > This case is like the previous, except the stack size is too large to
> > encode in the compact unwind
> > encoding.  The SIZE field is interpreted as an offset relative to the
> > beginning of the containing
> > unwind group to (what is necessarily) an earlier extended parameter group.
> >
> > 3.2.1.3  RBP-Based Frame (MODE = 9)
> > The RBP-based frame is the same as for a RSP-based frame except that
> > there are two instructions
> > that mark the transition between the two parts of the prologue: the push
> > of the prior RBP value
> > onto the stack followed by the copy of RSP to RBP to establish the new
> > frame pointer.
> >
> > There appears to be no reason to not make this simplifying requirement:
> > the push and copy
> > always occur together.
> >
> > This leads to the following proposal for the upper part of the extended
> > compact unwind
> > quadword for use in combination with MODE = 9 in the lower part.
> >
> > | 63                       48 47 46             40 39            32 |
> > |-------------------------------------------------------------------|
> > | RESERVED                   |  |  PROLOGUE-SIZE2 |  PROLOGUE-SIZE1 |
> > |-------------------------------------------------------------------|
> >
> > Except for the MODE, the lower part is exactly like it would be for MODE
> > = 1.
> >
> > PROLOGUE-SIZE1 is the length in bytes of the first part of the prologue
> > relative to the
> > STARTING-ADDRESS, up to and including the instruction that pushes the
> > prior RBP contents.
> >
> > PROLOGUE-SIZE2 is the length in bytes of the second part of the prologue
> > relative to the end
> > of the first part, up to a point after the last preserved register is
> > saved and before the first
> > preserved register is changed. (Recall, this point may not be unique.)
> >
> > PROLOGUE-SIZE1 + PROLOGUE-SIZE2 gives the total size of the prologue.
> >
> > 3.2.2  Function Epilogue
> > For the purposes of exception handling, the key steps in an epilogue are:
> >
> >  1. Restore the registers that were preserved
> >  2. Deallocate space on the stack that was used to preserve registers
> > and for other purposes
> >
> > A key observation is that restoring registers is idempotent. That is, if
> > in the midst of restoring
> > registers when an exception occurs, it will do no harm if all of the
> > preserved registers are
> > restored again when unwinding. In effect, restoration of registers need
> > not be distinguished from
> > the body of a function. It is not until reaching the first instruction
> > that deallocates stack is
> > executed that the “real” epilogue begins.
> >
> > 3.2.2.1  RSP-Based Frame (MODE = 10 or 11)
> > For a RSP-based frame, whether small or large, the only unwind action
> > remaining after stack
> > deallocation is to pop the return address into the instruction pointer
> > (RIP). In the vast majority of
> > cases this means that the epilogue of the code described by an unwind
> > group is just a 1-byte RET
> > instruction that occurs at the end of the unwind group. There are other
> > possibilities but they are
> > rare and con be dealt with separately (see below).
> >
> > It suffices to include a single 1-bit field, E, in the upper part of the
> > extended compact unwind
> > description:
> >
> > | 63                       48 47 46             40 39            32 |
> > |-------------------------------------------------------------------|
> > | RESERVED                  | E |  PROLOGUE-SIZE2 |  PROLOGUE-SIZE1 |
> > |-------------------------------------------------------------------|
> >
> >
> > The E flag indicates that the unwind group ends with a standard 1-byte
> > RET instruction.
> >
> > 3.2.2.2  RBP-Based Frame (MODE = 9)
> > The RBP-based frame is much the same as for a RSP-based frame except
> > that there are two
> > instructions that mark the transition between the two parts of the
> > prologue: the copy of RBP to
> > RSP to cut back the stack, followed by the pop of RBP to restore what
> > may be the callers frame
> > pointer. Thus the epilogue, for the purposes of unwinding, begins
> > immediately after the copy of
> > RBP to RSP. In the vast majority of cases, the immediately following two
> > instructions will
> > consist of POP RBP and RET. The POP is not idempotent because it changes
> > the stack pointer,
> > but the unwind processing code can distinguish whether the stack has
> > been popped or not based
> > on the address (3 versus 1 byte less than the highest address of the group).
> >
> > It suffices to include a single 1-bit field, E, in the upper part of the
> > extended compact unwind
> > description:
> >
> > | 63                       48 47 46             40 39            32 |
> > |-------------------------------------------------------------------|
> > | RESERVED                  | E |  PROLOGUE-SIZE2 |  PROLOGUE-SIZE1 |
> > |-------------------------------------------------------------------|
> >
> > The E flag indicates that the unwind group ends with a standard
> > 3-instruction return sequence.
> >
> > 3.3  Special Issues
> > Up until this point, it might appear that each unwind group corresponds
> > one-for-one to the code
> > for a single function. While common, this is not required. Important
> > variations are illustrated
> > here.
> >
> > Issue/Problem                                        Possible Resolution
> > -------------------------------------------------------------------------------------------------
> > The first part of the prologue is too large to be    Use two unwind groups:
> > described by the 8-bit RxP-FRAME-SIZE1 field.        1) a null frame
> > group of any appropriate size,
> >                                                      2) a suitable
> > RxP-based frame.
> >
> > The second part of the prologue is too large to be   Use three unwind
> > groups:
> > described by the 8-bit RxP-FRAME-SIZE2 field.        1) a RSP-frame
> > group to describe the first part of
> >                                                      the prologue only
> > (PROLOGUE-SIZE1 = the
> >                                                      length of the
> > entire group and PROLOGUE-SIZE2
> >                                                      = 0,
> >                                                      2) a RSP-frame
> > group to describe the second part
> >                                                      of the prologue
> > (PROLOGUE-SIZE1 & 2 both = 0
> >                                                      and no registers
> > saved),
> >                                                      3) a suitable
> > RxP-based frame group (with
> >                                                      PROLOGUE-SIZE1 & 2
> > both = 0).
> >
> > There is more than one return location (without a    Generally use one
> > unwind group for each sequence
> > shared epilogue sequence).                           of code ending with
> > a RET. A group following a
> >                                                      RET will have no
> > prologue (PROLOGUE-SIZE1
> >                                                      & 2 both = 0).
> >
> > There is more than one entry point to a function     Generally use one
> > unwind group for each sequence
> > (for example, a FORTRAN multiple entry point         of code beginning
> > at an entry point. Each group
> > subroutine or analogous assembly language            may have a normal
> > prologue and all but the last
> > equivalents)                                         might have NO
> > epilogue (E flag clear).
> >
> >
> > These examples are not exhaustive, of course. But they illustrate the
> > flexibility of the
> > mechanism, which should be suitable for the vast majority of cases in
> > practice.
> >
> > 3.4  Number of Unwind Groups
> > A simple concatenation of unwind groups by the linker that combines
> > unwind group sections
> > from multiple modules does not, of itself, provide direct information to
> > the unwind software
> > regarding how many unwind groups are present. However, the image
> > information (ELF
> > segments) should suffice to determine this number based on the size of
> > the segment and the
> > known size of each group. Alternatively, the linker might create special
> > symbols that can be used
> > to help determine the number of groups.
> >
> > This proposal leaves this detail to target-specific linker, image loader
> > and exception handling
> > system to specify.
> >
> > 3.5  Interaction with Code Generation
> > The relatively simple and fixed size nature of the extended compact
> > unwind information
> > proposed here depends on observing certain restrictions on optimization
> > that might affect code in
> > prologues and epilogues. These restrictions were noted where relevant
> > throughout this section.
> > Here is a summary of those requirements:
> >
> >  1. Use of any preserved register must be delayed until all of the
> > preserved registers have
> >     been saved.
> >  2. In the prologue of a function with a RBP-based frame, the two
> > instructions that save the
> >     prior RBP contents and copy the stack pointer to RBP must be kept
> > adjacent. Similarly,
> >     in the epilogue of a function with a RBP-based frame, the two
> > instructions that cut back
> >     the stack and restore the prior RBP contents must be kept adjacent.
> > [This adjacency
> >     requirement could be relaxed by introducing an additional MODE that
> > corresponds to having
> >     the RBP saved in memory but not yet updated from RSP. This does not
> > seem worthwhile.]
> >  3. For shrink wrap optimization in the presence of a
> > handler/personality routine, care must
> >     be taken to not move instructions into the prologue that might cause
> > an exception.
> >     Floating point exceptions and access violations are of particular
> > concern in this regard.
> >
> > 4  System Specific Extensions for OpenVMS
> > Omitted since nobody other than OpenVMS should care but it describes
> > adding an additional
> > four byte extension to the extended compact unwind group to pass along
> > some information from
> > the VAX Macro-32 compiler about VAX register usage.
> >
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev