<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jul 3, 2016 at 11:22 PM, Jay K <span dir="ltr"><<a href="mailto:jay.krell@cornell.edu" target="_blank">jay.krell@cornell.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><span class=""> > There is metadata for epilogues (UWOP_EPILOG) but it is only available on Windows 8.1 and newer.<br>

<br>

</span>I'm aware of this.<br>

I believe it is so sampling profilers can walk the kernel stack including through paged code -- i.e. the epilogue data is not paged, while the related epilogue code might be.<br>

Do you see it used, i.e. in usermode?  (where the pdata/xdata/code are all equally paged). </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

It would allow for e.g. breakpoints in epilogues as well, but that doesn't seem to be a consideration.<br>

  Perhaps debuggers are supposed to detect epilogues and use hardware breakpoints instead??<br></blockquote><div><br></div><div>I don't see it used in practice but I can imagine JITs wanting to use it to liberate themselves from the normal x64 ABI rules regarding epilogues.  Reid and I spent a lot of time implementing the x64 compliant prologue/epilogue emission in LLVM and it would have been easier if UWOP_EPILOG was always around.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

<br>

<br>

And ps, while the documentation is good</blockquote><div><br></div><div><div>The documentation is good but it could be a little more clear.  I wish I could contact whoever maintains the specification...</div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">, I think this basic point of what the goal is -- restoration of non-volatiles from arbitrary points, with the clarification/emphasis that rsp is a slightly special non-volatile -- is not clearly documented.</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

It is from this motivation that everything pretty directly follows imho. </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

<br>

<br>

For example, this is why the ymm registers are all volatile -- because the xdata design precedes their existence and therefore cannot describe their preservation/restoration.</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

<br>

<br>

 - Jay<br>

<br>

________________________________<br>

> From: <a href="mailto:david.majnemer@gmail.com">david.majnemer@gmail.com</a><br>

> Date: Sun, 3 Jul 2016 23:05:14 -0700<br>

<span class="">> Subject: Re: [llvm-dev] Status of stack walking in LLVM on Win64?<br>

</span>> To: <a href="mailto:jay.krell@cornell.edu">jay.krell@cornell.edu</a><br>

> CC: <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

<span class="">><br>

><br>

><br>

> On Sun, Jul 3, 2016 at 10:34 PM, Jay K via llvm-dev<br>

</span><span class="">> <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><mailto:<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>>> wrote:<br>

>> Message: 3<br>

>> Date: Sun, 3 Jul 2016 17:49:50 -0700<br>

>> From: Michael Lewis via llvm-dev<br>

</span>> <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><mailto:<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>>><br>

>> To: Hayden Livingston<br>

> <<a href="mailto:halivingston@gmail.com">halivingston@gmail.com</a><mailto:<a href="mailto:halivingston@gmail.com">halivingston@gmail.com</a>>><br>

>> Cc: llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><mailto:<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>>><br>

<span class="">>> Subject: Re: [llvm-dev] Status of stack walking in LLVM on Win64?<br>

>> Message-ID:<br>

>><br>

</span>> <<a href="mailto:CAEm7p3svyOi6JU6r_RCCtRfGhTgTHeRw-SR0iD%2B9Edv2pi71Dw@mail.gmail.com">CAEm7p3svyOi6JU6r_RCCtRfGhTgTHeRw-SR0iD+9Edv2pi71Dw@mail.gmail.com</a><mailto:<a href="mailto:CAEm7p3svyOi6JU6r_RCCtRfGhTgTHeRw-SR0iD%252B9Edv2pi71Dw@mail.gmail.com">CAEm7p3svyOi6JU6r_RCCtRfGhTgTHeRw-SR0iD%2B9Edv2pi71Dw@mail.gmail.com</a>>><br>

<span class="">>> Content-Type: text/plain; charset="utf-8"<br>

>><br>

>> On Sun, Jul 3, 2016 at 2:17 PM, Hayden Livingston<br>

</span>> <<a href="mailto:halivingston@gmail.com">halivingston@gmail.com</a><mailto:<a href="mailto:halivingston@gmail.com">halivingston@gmail.com</a>>><br>

<span class="">>> wrote:<br>

>><br>

>>> For JITs it would appear that there is a patch needed for some kind of<br>

>>> relocations.<br>

>>><br>

>>> <a href="https://llvm.org/bugs/show_bug.cgi?id=24233" rel="noreferrer" target="_blank">https://llvm.org/bugs/show_bug.cgi?id=24233</a><br>

>>><br>

>>> Is the patch really needed? What does it do? I'm not an expert here so<br>

>>> asking.<br>

>>><br>

>><br>

>><br>

>> I'm not really interested in the JIT case as I said originally, so I can't<br>

>> answer that question.<br>

>><br>

>><br>

>><br>

>>><br>

>>> On Sun, Jul 3, 2016 at 2:48 AM, David Majnemer via llvm-dev<br>

</span><div><div class="h5">>>> <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><mailto:<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>>> wrote:<br>

>>>> I can confirm that LLVM emits correct data when used in an AoT<br>

>>> configuration<br>

>>>> for x64, exception handling would be totally broken without it.<br>

>>>><br>

>>><br>

>><br>

>><br>

>> Two points of clarification:<br>

>><br>

>> - Are you talking about Win64 or just x64 in general (i.e. *nix/MacOS)?<br>

>> Again given the presence of bugs going back to 2015 (including one linked<br>

>> in this thread) and other scant data from the list, I really can't tell<br>

>> what the expected state of this functionality is on Win64.<br>

>><br>

>> - Are you referring to data generated by LLVM that is embedded in COFF<br>

>> object files and then placed in the binary image by the linker? This data<br>

>> is at a minimum relocated by link.exe on Windows as near as I can tell. I<br>

>> do not want a dependency on link.exe. I can handle doing my own relocations<br>

>> prior to emitting the final image, but I want to know if there's a turnkey<br>

>> implementation of this already or if I have to roll my own here.<br>

>><br>

>> Thanks,<br>

>><br>

>><br>

>><br>

>> - Mike<br>

><br>

><br>

><br>

> Windows/x64 ABI is pretty well documented.<br>

><br>

><br>

> - The parameter passing is probably not the same as any other system.<br>

> (Unless people are using LLVM for UEFI development?)<br>

> Ignoring floating point, the first four integer parameters<br>

> are in rcx, rdx, r8, r9. The rest are on the stack.<br>

><br>

><br>

> - The exception handling might *resemble* other systems, but<br>

> surely has unique details.<br>

><br>

> - There is absolutely an unremovable dependency on a linker;<br>

> it doesn't have to be the Microsoft linker, I believe GNU ld<br>

> already implements this.<br>
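<br>
A minimal Python sketch of the integer parameter rule in the first bullet above. The function name is hypothetical, and the stack offsets assume the caller reserves the 4*8 bytes of scratch space below the stack arguments, so the fifth argument sits at rsp+0x20 just before the call instruction. Floating point arguments are ignored, as in the text.<br>
<pre>
```python
INT_ARG_REGS = ["rcx", "rdx", "r8", "r9"]
HOME_SPACE = 32  # 4 * 8 bytes of scratch space reserved by the caller


def integer_arg_locations(n_args):
    """Where the caller places each integer argument just before the call."""
    locations = []
    for i in range(n_args):
        if len(INT_ARG_REGS) > i:
            # The first four integer arguments travel in registers.
            locations.append(INT_ARG_REGS[i])
        else:
            # The rest go on the stack, above the scratch space.
            locations.append("[rsp+%#x]" % (HOME_SPACE + 8 * (i - 4)))
    return locations
```
</pre>
For six arguments this gives rcx, rdx, r8, r9, [rsp+0x20], [rsp+0x28].<br>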

><br>

> The documentation should be used.<br>

><br>

> I can summarize and such, but it is documented.<br>

><br>

> Roughly, ignoring parameter passing and focusing only on exception<br>

> handling,<br>

> it goes like this:<br>

><br>

><br>

> - At any point in any program, "the stack" must be "unwindable".<br>

> I've never seen this clearly described.<br>

> It boils down to really "non volatile registers must be restorable"<br>

> by "a runtime" via a documented/standardized metadata, such as to<br>

> appear as if control was returned to any function on the call stack,<br>

> w/o running any generated code in any of the functions between<br>

> the current stack location and the resumed-to location.<br>

><br>

><br>

> The stack pointer is often called out specially, but in fact<br>

> it is just another non volatile register and not really a<br>

> special case.<br>

><br>

><br>

> So then some details:<br>

> a "leaf function" is a function that does not change any non<br>

> volatile registers,<br>

> including the stack pointer. Leaf functions can do pretty much<br>

> anything,<br>

> but they must not change any non volatile registers -- which is<br>

> a severe<br>

> restriction. Having locals essentially makes you non-leaf -- even if you<br>

> don't call anything. A leaf function is *not* a function that<br>

> makes no calls,<br>

> but calls do make a function a non-leaf, as it changes the stack<br>

> pointer.<br>

><br>

><br>

> The slight exception here is that all functions, including<br>

> leaves, do have<br>

> 4*8 bytes of scratch space in the stack available to them -- so local<br>

> variables can be had, in that space and in volatile registers.<br>

><br>

><br>

> The stack is walked from a leaf function merely by reading from rsp.<br>

> A leaf function can make a syscall, so they aren't necessarily at<br>

> the bottom of the stack.<br>

><br>

><br>

> non-leaf functions are the interesting ones.<br>

> They can change rsp, including such as via a call, and can change<br>

> non-volatile<br>

> registers, but all such changes (or rather, the saving of said<br>

> registers) must<br>

> be described by metadata, and the metadata<br>

> must be findable -- via looking up a code address on the stack.<br>

><br>

><br>

> Roughly speaking, all dlls have "pdata" -- procedure data.<br>

> There are 3 UINT32s per non-leaf function.<br>

> These are offsets into the image. Images are limited to 4GB in size.<br>

> They are to the start of the function, end of the function, and<br>

> to additional metadata.<br>

> The additional metadata is called "xdata" or exception data.<br>

> The offset to the metadata may be absent or 0, but that should be<br>

> rare/nonexistent<br>

> in practice -- it is for revealing leaf functions to static<br>

> analysis for example.<br>
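<br>
A minimal Python sketch of the pdata lookup described above, assuming the table is an array of 12-byte entries (three little-endian UINT32 image offsets: function start, function end, xdata) sorted by start offset. The names parse_pdata and lookup are hypothetical, and a miss is treated here as "leaf function".<br>
<pre>
```python
import bisect

ENTRY_SIZE = 12  # three little-endian UINT32s per entry


def parse_pdata(blob):
    """Split a raw pdata blob into (begin_rva, end_rva, unwind_rva) tuples."""
    entries = []
    for off in range(0, len(blob) - ENTRY_SIZE + 1, ENTRY_SIZE):
        fields = tuple(
            int.from_bytes(blob[off + i:off + i + 4], "little")
            for i in (0, 4, 8)
        )
        entries.append(fields)
    return entries


def lookup(entries, rva):
    """Return the entry covering rva, or None (treated as a leaf function)."""
    begins = [e[0] for e in entries]  # assumed sorted by begin offset
    i = bisect.bisect_right(begins, rva) - 1
    # The end offset is exclusive: it is the first byte past the function.
    if i >= 0 and entries[i][1] > rva:
        return entries[i]
    return None
```
</pre>
The code address taken from the stack is first converted to an image offset by subtracting the image base; that step is left out here.<br>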

><br>

><br>

> The "xdata" is then what describes how to restore non volatile<br>

> registers,<br>

> such as the order to pop them, or what offset they were saved at to the<br>

> frame pointer or stack pointer (and which register if any is the<br>

> frame pointer -- it doesn't have to be rbp,<br>

> and most functions don't have one.)<br>
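<br>
A minimal Python sketch of decoding one xdata record, assuming the documented UNWIND_INFO layout (version/flags byte, prologue size, count of 2-byte code slots, frame register byte, then the unwind codes). Only four of the unwind operations are handled, and the function name and the returned dictionary shape are hypothetical.<br>
<pre>
```python
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

UWOP_PUSH_NONVOL = 0
UWOP_ALLOC_LARGE = 1
UWOP_ALLOC_SMALL = 2
UWOP_SET_FPREG = 3


def decode_unwind_info(xdata):
    """Decode the header and a subset of unwind codes from one xdata record."""
    flags, version = divmod(xdata[0], 8)            # low 3 bits: version
    prolog_size = xdata[1]
    count = xdata[2]                                # number of 2-byte slots
    frame_offset, frame_reg = divmod(xdata[3], 16)  # low 4 bits: register
    ops = []
    slot = 0
    while count > slot:
        pos = 4 + 2 * slot
        code_offset = xdata[pos]                    # offset just past the instruction
        op_info, op = divmod(xdata[pos + 1], 16)    # low 4 bits: operation
        if op == UWOP_PUSH_NONVOL:
            ops.append((code_offset, "push", REGS[op_info]))
            slot += 1
        elif op == UWOP_ALLOC_SMALL:
            ops.append((code_offset, "alloc", op_info * 8 + 8))
            slot += 1
        elif op == UWOP_ALLOC_LARGE and op_info == 0:
            # The next slot holds the allocation size divided by 8.
            size = int.from_bytes(xdata[pos + 2:pos + 4], "little") * 8
            ops.append((code_offset, "alloc", size))
            slot += 2
        elif op == UWOP_SET_FPREG:
            ops.append((code_offset, "set_fp", REGS[frame_reg]))
            slot += 1
        else:
            raise NotImplementedError("unwind op %d not handled in this sketch" % op)
    return {"version": version, "flags": flags, "prolog_size": prolog_size,
            "frame_reg": frame_reg, "frame_offset": frame_offset * 16, "ops": ops}
```
</pre>
For a prologue of push rbp; push rbx; sub rsp, 0x28 the record would be the bytes 01 06 03 00 06 42 02 30 01 50, which decodes to an alloc of 40 bytes followed by the pushes of rbx and rbp (the codes are stored in reverse order of execution).<br>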

><br>

><br>

> There are restrictions on code generation -- rsp changes and non<br>

> volatile saves<br>

> must be describable with this metadata. There is a notion of the<br>

> end of the prologue,<br>

> at this point all non volatiles that will be changed have been<br>

> saved, and rsp changes<br>

> are done. This is misleading though in that almost arbitrary code<br>

> can be interleaved<br>

> within the prologue, i.e. changes to volatile registers.<br>

><br>

><br>

> As well, as a background, generally Windows/x64 functions don't<br>

> change rsp,<br>

> except in their prologue and the call instruction.<br>

> They are not "pushy/poppy". However if a function uses _alloca, that<br>

> is an exception. Such functions must have a frame pointer,<br>

> such as rbp,<br>

> though it doesn't have to be rbp and often is not.<br>

><br>

><br>

> There is also a notion of chaining the data. This is useful when<br>

> a function has "early out" paths that only change some non volatiles.<br>

><br>

><br>

> Also there is allowance for discontiguous functions.<br>

><br>

><br>

> Also there is no metadata for epilogues. If an exception occurs<br>

> in an epilogue,<br>

> the runtime actually looks at the code being run, detects it is an<br>

> epilogue<br>

> and simulates it. As such, epilogue code generation is constrained.<br>

> (and breakpoints within epilogues mess things up!)<br>

><br>

> There is metadata for epilogues (UWOP_EPILOG) but it is only available<br>

> on Windows 8.1 and newer.<br>

><br>

><br>

><br>

> To repeat -- the unwindability is from any single instruction, be it<br>

> in the<br>

> middle of a prologue, middle of an epilogue, or in the body of a<br>

> function<br>

> outside of prologue/epilogue.<br>
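<br>
A minimal Python sketch of unwinding one frame from an arbitrary instruction, including the middle of a prologue. It is a simplified model, not the real runtime: the stack is a dictionary from addresses to 8-byte values, only pushes and fixed-size allocations are handled, and the function name and tuple format are hypothetical. Each operation records the code offset just past its instruction, so it is undone only if the current offset has reached it.<br>
<pre>
```python
def virtual_unwind(ops, ip_offset, rsp, stack, regs):
    """Undo the prologue effects that have already executed at ip_offset.

    ops is a list of (code_offset, kind, arg) in descending code_offset order.
    Returns (return_address, caller_rsp, restored_regs).
    """
    regs = dict(regs)
    for code_offset, kind, arg in ops:
        if code_offset > ip_offset:
            continue                  # this instruction has not run yet
        if kind == "alloc":
            rsp += arg                # undo 'sub rsp, arg'
        elif kind == "push":
            regs[arg] = stack[rsp]    # reload the saved non-volatile
            rsp += 8                  # undo the push
    return_address = stack[rsp]       # rsp is back at its function-entry value
    return return_address, rsp + 8, regs


# Prologue: push rbp (ends at offset 1); push rbx (2); sub rsp, 40 (6).
ops = [(6, "alloc", 40), (2, "push", "rbx"), (1, "push", "rbp")]
stack = {0x1000: 0x401234, 0xFF8: 0xAAAA, 0xFF0: 0xBBBB}

# In the function body, and in the middle of the prologue after both pushes:
in_body = virtual_unwind(ops, 20, 0xFC8, stack, {})
mid_prologue = virtual_unwind(ops, 2, 0xFF0, stack, {})
```
</pre>
Both calls recover the same return address, caller rsp, and saved rbx/rbp, without running any code of the function being unwound.<br>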

><br>

><br>

> This unwindability serves both exception dispatch and debugger<br>

> stack walking,<br>

> and other things, like sampling profiler stack walking, or "leak<br>

> tracking<br>

> stack walking" -- stack walking is always possible, modulo bugs.<br>

> The most common bugs are probably in hand-written assembly, since<br>

> assembly programmers have to do basically all the work themselves.<br>

><br>

><br>

> There is provision for providing the pdata at runtime for JITed code.<br>

><br>

><br>

> The linker has to combine all the pdata and place a pointer<br>

> (offset) to it<br>

> in a documented place in the PE, similar to how imports and<br>

> exports and base<br>

> relocations are recorded.<br>

><br>

><br>

> Anyway, see the documentation.<br>

><br>

><br>

> - Jay<br>

> _______________________________________________<br>

> LLVM Developers mailing list<br>

</div></div>> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><mailto:<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

><br>

                                          </blockquote></div><br></div></div>