[llvm-dev] Status of stack walking in LLVM on Win64?
David Majnemer via llvm-dev
llvm-dev at lists.llvm.org
Sun Jul 3 23:05:14 PDT 2016
On Sun, Jul 3, 2016 at 10:34 PM, Jay K via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > Message: 3
> > Date: Sun, 3 Jul 2016 17:49:50 -0700
> > From: Michael Lewis via llvm-dev <llvm-dev at lists.llvm.org>
> > To: Hayden Livingston <halivingston at gmail.com>
> > Cc: llvm-dev <llvm-dev at lists.llvm.org>
> > Subject: Re: [llvm-dev] Status of stack walking in LLVM on Win64?
> > Message-ID:
> > <CAEm7p3svyOi6JU6r_RCCtRfGhTgTHeRw-SR0iD+9Edv2pi71Dw at mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > On Sun, Jul 3, 2016 at 2:17 PM, Hayden Livingston <
> halivingston at gmail.com>
> > wrote:
> >
> >> For JITs it would appear that there is a patch needed for some kind of
> >> relocations.
> >>
> >> https://llvm.org/bugs/show_bug.cgi?id=24233
> >>
> >> Is the patch really needed? What does it do? I'm not an expert here so
> >> asking.
> >>
> >
> >
> > I'm not really interested in the JIT case as I said originally, so I
> can't
> > answer that question.
> >
> >
> >
> >>
> >> On Sun, Jul 3, 2016 at 2:48 AM, David Majnemer via llvm-dev
> >> <llvm-dev at lists.llvm.org> wrote:
> >>> I can confirm that LLVM emits correct data when used in an AoT
> >> configuration
> >>> for x64, exception handling would be totally broken without it.
> >>>
> >>
> >
> >
> > Two points of clarification:
> >
> > - Are you talking about Win64 or just x64 in general (i.e. *nix/MacOS)?
> > Again given the presence of bugs going back to 2015 (including one linked
> > in this thread) and other scant data from the list, I really can't tell
> > what the expected state of this functionality is on Win64.
> >
> > - Are you referring to data generated by LLVM that is embedded in COFF
> > object files and then placed in the binary image by the linker? This data
> > is at a minimum relocated by link.exe on Windows as near as I can tell. I
> > do not want a dependency on link.exe. I can handle doing my own
> relocations
> > prior to emitting the final image, but I want to know if there's a
> turnkey
> > implementation of this already or if I have to roll my own here.
> >
> > Thanks,
> >
> >
> >
> > - Mike
>
>
>
> Windows/x64 ABI is pretty well documented.
>
>
> - The parameter passing is probably not the same as any other system.
> (Unless people are using LLVM for UEFI development?)
> Ignoring floating point, the first four integer parameters
> are in rcx, rdx, r8, r9. The rest are on the stack.
>
>
> - The exception handling might *resemble* other systems, but
> surely has unique details.
>
> - There is absolutely an unremovable dependency on a linker;
> it doesn't have to be the Microsoft linker, I believe GNU ld
> already implements this.
>
> The documentation should be used.
>
> I can summarize and such, but it is documented.
>
> Roughly, ignoring parameter passing and focusing only on exception
> handling, it goes like this:
>
>
> - At any point in any program, "the stack" must be "unwindable".
> I've never seen this clearly described.
> It boils down to really "non volatile registers must be restorable"
> by "a runtime" via a documented/standardized metadata, such as to
> appear as if control was returned to any function on the call stack,
> w/o running any generated code in any of the functions between
> the current stack location and the resumed-to location.
>
>
> The stack pointer is often called out specially, but in fact
> it is just another non volatile register and not really a special
> case.
>
>
> So then some details:
> a "leaf function" is a function that does not change any non
> volatile registers, including the stack pointer. Leaf functions can
> do pretty much anything, but they must not change any non volatile
> registers -- which is a severe restriction. Having locals essentially
> makes you non-leaf -- even if you don't call anything. A leaf
> function is *not* a function that makes no calls, but calls do make
> a function a non-leaf, as they change the stack pointer.
>
>
> The slight exception here is that all functions, including leaves,
> do have 4*8 bytes of scratch space in the stack available to them --
> so local variables can be had, in that space and in volatile
> registers.
>
>
> The stack is walked from a leaf function merely by reading from rsp.
> A leaf function can make a syscall, so they aren't necessarily at
> the bottom of the stack.
>
>
> non-leaf functions are the interesting ones.
> They can change rsp, for example via a call, and can change
> non-volatile registers, but all such changes (or rather, the saving
> of said registers) must be described by metadata, and the metadata
> must be findable -- via looking up a code address on the stack.
>
>
> Roughly speaking, all dlls have "pdata" -- procedure data.
> There are 3 UINT32s per non-leaf function.
> These are offsets into the image. Images are limited to 4GB in size.
> They are to the start of the function, end of the function, and to
> additional metadata.
> The additional metadata is called "xdata" or exception data.
> The offset to the metadata may be absent or 0, but that should be
> rare/nonexistent in practice -- it is for revealing leaf functions
> to static analysis, for example.
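
A minimal sketch of the pdata lookup described above, in portable C. The struct and function names are illustrative, not taken from any Windows header, and the sketch assumes the table is sorted by function start offset so it can be binary-searched. A miss is reported as NULL, which a stack walker would treat as "leaf function, just read the return address from rsp".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pdata entry: three 32-bit offsets into the image.
   Field names are illustrative. */
typedef struct {
    uint32_t begin_rva;   /* first byte of the function */
    uint32_t end_rva;     /* one past the last byte of the function */
    uint32_t unwind_rva;  /* offset of the function's xdata */
} PdataEntry;

/* Binary-search a table sorted by begin_rva (an assumption of this
   sketch) for the entry whose [begin_rva, end_rva) range covers rva.
   Returns NULL when nothing matches. */
const PdataEntry *find_pdata(const PdataEntry *table, size_t count,
                             uint32_t rva)
{
    size_t lo = 0, hi = count;

    assert(table != NULL || count == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rva < table[mid].begin_rva)
            hi = mid;            /* target lies before this function */
        else if (rva >= table[mid].end_rva)
            lo = mid + 1;        /* target lies after this function */
        else
            return &table[mid];  /* rva falls inside this function */
    }
    return NULL;
}
```

The rva passed in would be a return address found on the stack minus the load address of the image that contains it.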
>
>
> The "xdata" is then what describes how to restore non volatile
> registers, such as the order to pop them, or what offset they were
> saved at relative to the frame pointer or stack pointer (and which
> register, if any, is the frame pointer -- it doesn't have to be rbp,
> and most functions don't have one.)
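
To make the xdata description concrete, here is a rough sketch in C that walks an array of unwind code slots and adds up how far the prologue moved rsp. The operation numbers and slot counts follow Microsoft's x64 exception handling documentation as I understand it; the type and function names are made up for illustration. The sketch skips the header that precedes the array and does not handle machine-frame codes or the newer epilogue codes, which it reports as -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    UWOP_PUSH_NONVOL     = 0,  /* 1 slot: push of a non volatile register */
    UWOP_ALLOC_LARGE     = 1,  /* 2 or 3 slots: large stack allocation */
    UWOP_ALLOC_SMALL     = 2,  /* 1 slot: allocation of 8..128 bytes */
    UWOP_SET_FPREG       = 3,  /* 1 slot: establish the frame pointer */
    UWOP_SAVE_NONVOL     = 4,  /* 2 slots: register saved with a mov */
    UWOP_SAVE_NONVOL_FAR = 5,  /* 3 slots */
    UWOP_SAVE_XMM128     = 8,  /* 2 slots */
    UWOP_SAVE_XMM128_FAR = 9   /* 3 slots */
};

/* One 2-byte slot of the unwind code array (illustrative name). */
typedef struct {
    uint8_t code_offset;  /* prologue offset just past the instruction */
    uint8_t op_and_info;  /* low nibble: operation, high nibble: operand */
} UnwindSlot;

/* An operand slot is the same two bytes read as a little-endian uint16. */
static uint32_t slot_u16(const UnwindSlot *s)
{
    return (uint32_t)s->code_offset | ((uint32_t)s->op_and_info << 8);
}

/* Sum the bytes subtracted from rsp by pushes and fixed allocations.
   Returns -1 for a truncated array or an operation this sketch does
   not handle. */
int64_t prologue_stack_bytes(const UnwindSlot *slots, size_t count)
{
    int64_t total = 0;
    size_t i = 0;

    assert(slots != NULL || count == 0);
    while (i < count) {
        unsigned op = slots[i].op_and_info & 0x0fu;
        unsigned info = slots[i].op_and_info >> 4;
        size_t used;

        switch (op) {
        case UWOP_PUSH_NONVOL:
        case UWOP_ALLOC_SMALL:
        case UWOP_SET_FPREG:
            used = 1;
            break;
        case UWOP_ALLOC_LARGE:
            if (info > 1)
                return -1;
            used = (info == 0) ? 2 : 3;
            break;
        case UWOP_SAVE_NONVOL:
        case UWOP_SAVE_XMM128:
            used = 2;
            break;
        case UWOP_SAVE_NONVOL_FAR:
        case UWOP_SAVE_XMM128_FAR:
            used = 3;
            break;
        default:
            return -1;
        }
        if (used > count - i)
            return -1;  /* operand slots run past the end of the array */

        if (op == UWOP_PUSH_NONVOL) {
            total += 8;
        } else if (op == UWOP_ALLOC_SMALL) {
            total += (int64_t)info * 8 + 8;
        } else if (op == UWOP_ALLOC_LARGE && info == 0) {
            total += (int64_t)slot_u16(&slots[i + 1]) * 8;  /* scaled by 8 */
        } else if (op == UWOP_ALLOC_LARGE) {
            total += (int64_t)(slot_u16(&slots[i + 1]) |
                               (slot_u16(&slots[i + 2]) << 16));  /* unscaled */
        }
        i += used;
    }
    return total;
}
```

For a prologue of `push rbp` followed by `sub rsp, 0x20`, the array would hold a small-allocation slot (operand 3, meaning 3*8+8 bytes) and then a push slot, and the function returns 40. The register saves and the frame pointer code do not move rsp, so they only advance the cursor.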
>
>
> There are restrictions on code generation -- rsp changes and non
> volatile saves must be describable with this metadata. There is a
> notion of the end of the prologue, at this point all non volatiles
> that will be changed have been saved, and rsp changes are done. This
> is misleading though in that almost arbitrary code can be
> interleaved within the prologue, i.e. changes to volatile registers.
>
>
> As well, as a background, generally Windows/x64 functions don't
> change rsp, except in their prologue and the call instruction.
> They are not "pushy/poppy". However if a function uses _alloca, that
> is a contradiction. Such functions must have a frame pointer, such
> as rbp, though it doesn't have to be rbp and often is not.
>
>
> There is also a notion of chaining the data. This is useful when
> a function has "early out" paths that only change some non volatiles.
>
>
> Also there is allowance for discontiguous functions.
>
>
> Also there is no metadata for epilogues. If an exception occurs in
> an epilogue, the runtime actually looks at the code being run,
> detects that it is an epilogue and simulates it. As such, epilogue
> code generation is constrained.
> (and breakpoints within epilogues mess things up!)
>
There is metadata for epilogues (UWOP_EPILOG) but it is only available on
Windows 8.1 and newer.
>
>
> To repeat -- the unwindability is from any single instruction, be it
> in the middle of a prologue, the middle of an epilogue, or in the
> body of a function outside of prologue/epilogue.
>
>
> This unwindability serves both exception dispatch and debugger stack
> walking, and other things, like sampling profiler stack walking, or
> "leak tracking stack walking" -- stack walking is always possible,
> modulo bugs. The most common bugs are probably in hand-written
> assembly, since assembly programmers basically have to do all the
> work themselves.
>
>
> There is provision for providing the pdata at runtime for JITed code.
>
>
> The linker has to combine all the pdata and place a pointer (offset)
> to it in a documented place in the PE, similar to how imports and
> exports and base relocations are recorded.
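
As a sketch of where that "documented place" is: in a PE32+ image the optional header carries an array of (offset, size) data directories, and entry 3 is the exception directory that points at the combined pdata. The offsets below come from the PE format specification as I read it; the function name is made up, and the code only handles 64-bit (PE32+) images in a little-endian byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Locate the exception data directory of a PE32+ image held in memory.
   On success stores the directory's image offset and byte size and
   returns 1; returns 0 if the buffer does not look like a PE32+ file. */
int find_exception_directory(const uint8_t *image, size_t size,
                             uint32_t *rva, uint32_t *bytes)
{
    uint32_t pe;
    const uint8_t *opt;

    assert(image != NULL && rva != NULL && bytes != NULL);
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return 0;
    pe = read_u32(image + 0x3c);       /* file offset of "PE\0\0" */
    /* Need the signature (4), the COFF header (20), and the optional
       header up to the end of directory 3 (112 + 4 * 8 = 144). */
    if (pe > size || size - pe < 4 + 20 + 144)
        return 0;
    if (memcmp(image + pe, "PE\0\0", 4) != 0)
        return 0;
    opt = image + pe + 24;             /* start of the optional header */
    if (read_u16(opt) != 0x20b)        /* magic value for PE32+ */
        return 0;
    if (read_u32(opt + 108) <= 3)      /* number of directories present */
        return 0;
    *rva = read_u32(opt + 112 + 3 * 8);
    *bytes = read_u32(opt + 112 + 3 * 8 + 4);
    return 1;
}
```

Dividing the returned size by 12 gives the number of pdata entries, since each entry is three 32-bit offsets.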
>
>
> Anyway, see the documentation.
>
>
> - Jay