[llvm-dev] [RFC] Improving compact x86-64 compact unwind descriptors
Ron Brender via llvm-dev
llvm-dev at lists.llvm.org
Sun Jan 28 15:14:23 PST 2018
> Hi John & Ron, I read through the proposal and had a couple of quick
> observations.

Hi Jason. Thanks for the interesting observations and questions.

> 1. The proposed encoding assumes that the epilogue instructions always
> come at the end of the function -- or rather, just before the next
> function. If there is a stack protector __stack_chk_fail sequence, or
> there is NOP padding between functions, then the epilogue cannot be
> expressed. The proposed encoding allows for instructions scheduled
> before the prologue, as long as they're guaranteed to never throw an
> exception or spill registers, but there's no similar affordance for
> after the epilogue.

There are at least two alternatives for this situation:
a. The main group can be ended at the RET instruction and
a new group started to cover the following bytes.
Interestingly, neither NOPs nor a __stack_chk_fail call
sequence is really unwindable code.
Perhaps a new NOUNWIND mode should be added for such
cases; this would keep an asynchronous exception from
attempting what is likely a dangerous stack walk.
b. Starting a new group is admittedly a rather large hammer
that is perhaps OK if rare but not if it is common. If
common, an alternative might be to widen the E (epilogue)
field to allow more idiomatic alternatives. 4 bits might
be enough: 8 codes for a RET followed by 0 to 7 NOPs +
1 code for RET + __stack_chk_fail call + 1 code for
just a __stack_chk_fail call + 1 code for no RET and
nothing following. For anything else, fall back to a.
Personally I rather like the combination of a. + b.
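
As a rough illustration of option b., here is a minimal Python sketch of how a widened 4-bit E field might be decoded. The numeric assignment (0 to 7 for RET plus NOPs, 8 to 10 for the other three cases) and all names are assumptions made for illustration; nothing above fixes specific values.

```python
from enum import Enum


class Tail(Enum):
    """What follows the epilogue (hypothetical names, not from the proposal)."""
    RET_THEN_NOPS = "RET followed by N NOPs"
    RET_THEN_CHK_FAIL = "RET followed by a __stack_chk_fail call"
    CHK_FAIL_ONLY = "__stack_chk_fail call with no RET"
    NOTHING = "no RET and nothing following"


def decode_epilogue_field(e):
    """Decode an assumed 4-bit E value into (Tail, nop_count).

    Returns None for unassigned codes, meaning the producer should
    have fallen back to option a. (start a new group) instead.
    """
    if not 0 <= e <= 0xF:
        raise ValueError("E field is 4 bits wide")
    if e <= 7:
        # 8 codes: RET followed by 0 to 7 NOPs
        return (Tail.RET_THEN_NOPS, e)
    if e == 8:
        return (Tail.RET_THEN_CHK_FAIL, 0)
    if e == 9:
        return (Tail.CHK_FAIL_ONLY, 0)
    if e == 10:
        return (Tail.NOTHING, 0)
    # Codes 11 to 15 are left unassigned in this sketch.
    return None
```

Under this assumed numbering, 11 of the 16 available codes are used, which leaves 5 spare for other idioms.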
> 2. In section 3.1, "A null frame (MODE = 8) is the simplest possible
> frame, with no allocated stack of either kind (hence no saved
> registers). A null frame might be considered just a special case of a
> RSP-based frame with a stack size of zero. But unlike other frames,
> its frame pointer is usually not 16-byte aligned." Did you mean stack
> pointer here? The x86_64 SysV abi requires 16-byte alignment of the
> stack pointer before a CALL instruction, but I don't know of any
> alignment requirements on the frame pointer.

Yes, it would be clearer to say the stack pointer is not
16-byte aligned.
Aside: in my vocabulary, the frame pointer for an RSP-based
or null frame function is the stack pointer. (And I find it
an abuse of the English language to refer to an RSP-based
frame as "frameless".) So what I wrote is strictly correct,
but it does tend to confuse. Sorry...

> 3. In the same section, you propose an alternative "default null frame
> group" be defined,
>
> "If the first attempt to lookup an unwind group for an exception
> address fails, then it is (tentatively) assumed to have occurred
> within a null frame function or in a part of a function that is
> adequately described by a null frame. The presumed return address is
> (virtually or actually) popped from the top of stack and looked up.
> This second attempted lookup must succeed, in which case processing
> continues normally. A failure is a fatal error."
>
> I suspect I missed a part of the proposal covering this, but if
> entries are described by only start address, is there a problem of the
> previous non-null function's range extending & picking up these
> functions if they don't have entries?

For normal entries, I admit I sort of hand-waved in Section 3.4
over how to determine the end address of the last group. There
are lots of alternatives. Most involve some kind of dummy
extended parameter group (MODE = 12) where the STARTING-ADDRESS
is one more than the last address of the last "real" group.
There might also be linker assistance of some kind. In any
case I think it is a problem that need not be solved up front
but can wait for broader issues to settle.
The default unwind group has no starting address, of course.
It is defined by what is left over after all other normal
groups have been considered.
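
The two-step lookup quoted above, together with the dummy terminating group, can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the table layout (a sorted list of start addresses, with `None` standing in for a dummy MODE = 12 style terminator), the use of terminators to mark interior gaps as well as the end of the last group, and the function names are all hypothetical and not part of the proposal.

```python
import bisect


def find_group(table, address):
    """Look up the group covering `address`, or None if there is none.

    `table` is a list of (start_address, descriptor) sorted by start
    address. A descriptor of None marks a dummy terminator entry, so
    addresses at or after it belong to no normal group.
    """
    starts = [start for start, _ in table]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None  # below the first group
    return table[index][1]


def resolve_group(table, address, stack):
    """Apply the default null frame rule described in the quoted text.

    `stack` is a list whose last element is the top of stack.
    Returns (descriptor, address_that_matched).
    """
    group = find_group(table, address)
    if group is not None:
        return group, address
    # First lookup failed: tentatively assume a null frame and pop
    # the presumed return address from the top of stack.
    return_address = stack.pop()
    group = find_group(table, return_address)
    if group is None:
        # The second lookup must succeed; otherwise it is fatal.
        raise RuntimeError("no unwind group for presumed return address")
    return group, return_address
```

For example, with `table = [(0x1000, "f"), (0x1040, None), (0x2000, "g"), (0x2080, None)]`, address 0x1050 falls in the gap after `f`, so `resolve_group` pops the return address and looks that up instead.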
> 4. For frameless functions -- small-stack RSP functions -- you're
> assuming that the compiler saves and restores spilled registers with
> MOV instructions instead of PUSH/POP instructions. I think this is
> the normal behavior of the compiler, something like
>
>     subq $0x98, %rsp
>     movq %rsi, 0x10(%rsp)
>     movq %rdi, 0x18(%rsp)
>
> or whatever, but the proposed encoding can only allow for one change
> in stack size. The current encoding requires this for large-stack RSP
> functions (we get the offset to the RSP instruction where we read the
> amount being decremented in the compact unwind encoding), but for
> small stack frameless functions the compiler can today express in
> compact unwind a combination of sub and push instructions as long as
> it's a fixed amount within the range that the small-stack RSP encoding
> can express.

You are quite right, with the understanding that "the range
that the small-stack RSP encoding can express" is in fact the
body of the function and excludes the prologue and epilogue.
Our proposal, of course, is trying to describe the prologue and
epilogue as well.
My understanding of modern x86 processors is that there is
no meaningful performance difference between using pushes/pops
versus moves. That is, using only moves in prologues and
epilogues is a negligible price to pay in our quest for
a simple and compact unwind representation.
But I do take the point that this should be mentioned in
Section 3.5 regarding code generation implications.
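
To make the "one change in stack size" point concrete, here is a small Python sketch that tracks the cumulative RSP adjustment across two prologue styles that reach the same 0x98-byte frame. The instruction model (tuples of mnemonic and immediate), the function names, and the particular push-style sequence are assumptions for illustration only.

```python
def rsp_offsets(prologue):
    """Return the cumulative stack adjustment after each instruction.

    `prologue` is a list of (mnemonic, immediate) tuples. Only "push"
    and "sub" move RSP in this simplified model; "mov" stores into
    already-allocated stack and leaves RSP unchanged.
    """
    offset = 0
    trace = []
    for mnemonic, immediate in prologue:
        if mnemonic == "push":
            offset += 8          # a 64-bit push moves RSP by 8 bytes
        elif mnemonic == "sub":
            offset += immediate  # sub $imm, %rsp
        trace.append(offset)
    return trace


def count_stack_changes(prologue):
    """Count how many instructions change the stack size."""
    trace = [0] + rsp_offsets(prologue)
    return sum(1 for before, after in zip(trace, trace[1:]) if before != after)


# MOV style, as in the quoted example: one adjustment, then stores.
mov_style = [("sub", 0x98), ("mov", 0), ("mov", 0)]

# PUSH style (hypothetical equivalent): two pushes, then a smaller sub.
push_style = [("push", 0), ("push", 0), ("sub", 0x88)]
```

Both sequences end with the same total adjustment, but the MOV style changes the stack size once, while the PUSH style changes it three times, so an unwinder would need the intermediate offsets to handle an address inside the prologue.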
Thanks again,
Ron
On Fri, Jan 26, 2018 at 7:19 PM, Jason Molenda via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi John & Ron, I read through the proposal and had a couple of quick
> observations.
> ...
>
>