[llvm-dev] Status of stack walking in LLVM on Win64?

Sun Jul 3 23:22:46 PDT 2016

 > There is metadata for epilogues (UWOP_EPILOG), but it is only available on Windows 8.1 and newer.

I'm aware of this.
I believe it exists so that sampling profilers can walk the kernel stack, including through paged-out code -- i.e. the epilogue data is not paged, while the related epilogue code might be.
Do you see it used in user mode (where the pdata/xdata/code are all equally pageable)?
It would also allow e.g. breakpoints in epilogues, but that doesn't seem to be a consideration.
  Perhaps debuggers are supposed to detect epilogues and use hardware breakpoints instead?

P.S. While the documentation is good, I think the basic point of what the goal is -- restoration of non-volatiles from arbitrary points, with the clarification/emphasis that rsp is a slightly special non-volatile -- is not clearly documented.
Everything else follows pretty directly from this motivation, imho.

For example, this is why the upper halves of the ymm registers are all volatile -- the xdata design precedes their existence and therefore cannot describe their preservation/restoration.

 - Jay

________________________________
> From: david.majnemer at gmail.com 
> Date: Sun, 3 Jul 2016 23:05:14 -0700 
> Subject: Re: [llvm-dev] Status of stack walking in LLVM on Win64? 
> To: jay.krell at cornell.edu 
> CC: llvm-dev at lists.llvm.org 
> 
> 
> 
> On Sun, Jul 3, 2016 at 10:34 PM, Jay K via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Date: Sun, 3 Jul 2016 17:49:50 -0700
>> From: Michael Lewis via llvm-dev <llvm-dev at lists.llvm.org>
>> To: Hayden Livingston <halivingston at gmail.com>
>> Cc: llvm-dev <llvm-dev at lists.llvm.org>
>> Subject: Re: [llvm-dev] Status of stack walking in LLVM on Win64?
>>
>> On Sun, Jul 3, 2016 at 2:17 PM, Hayden Livingston <halivingston at gmail.com> wrote:
>> 
>>> For JITs it would appear that there is a patch needed for some kind of 
>>> relocations. 
>>> 
>>> https://llvm.org/bugs/show_bug.cgi?id=24233 
>>> 
>>> Is the patch really needed? What does it do? I'm not an expert here so 
>>> asking. 
>>> 
>> 
>> 
>> I'm not really interested in the JIT case as I said originally, so I can't 
>> answer that question. 
>> 
>> 
>> 
>>> 
>>> On Sun, Jul 3, 2016 at 2:48 AM, David Majnemer via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>> I can confirm that LLVM emits correct data when used in an AoT
>>>> configuration for x64; exception handling would be totally broken
>>>> without it.
>>>> 
>>> 
>> 
>> 
>> Two points of clarification: 
>> 
>> - Are you talking about Win64 or just x64 in general (i.e. *nix/MacOS)? 
>> Again given the presence of bugs going back to 2015 (including one linked 
>> in this thread) and other scant data from the list, I really can't tell 
>> what the expected state of this functionality is on Win64. 
>> 
>> - Are you referring to data generated by LLVM that is embedded in COFF 
>> object files and then placed in the binary image by the linker? This data 
>> is at a minimum relocated by link.exe on Windows as near as I can tell. I 
>> do not want a dependency on link.exe. I can handle doing my own relocations 
>> prior to emitting the final image, but I want to know if there's a turnkey 
>> implementation of this already or if I have to roll my own here. 
>> 
>> Thanks, 
>> 
>> 
>> 
>> - Mike 
> 
> 
> 
> The Windows/x64 ABI is pretty well documented.
> 
> 
> - The parameter passing is probably not the same as any other system. 
> (Unless people are using LLVM for UEFI development?) 
> Ignoring floating point, the first four integer parameters 
> are in rcx, rdx, r8, r9. The rest are on the stack. 
> 
> 
> - The exception handling might *resemble* other systems, but 
> surely has unique details. 
> 
> - There is absolutely an unremovable dependency on a linker;
> it doesn't have to be the Microsoft linker, I believe GNU ld 
> already implements this. 
> 
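The register assignment in the first bullet above can be modelled in a few lines. This is only an illustrative sketch: the function name and constants are made up for the example, floating point is ignored as in the text, and the stack offsets are my reading of the documented ABI (return address at [rsp], then the 4*8 bytes of scratch space, then the fifth argument), not something stated in this thread.

```python
# Illustrative model only; names are hypothetical, floating point is ignored.
INT_ARG_REGS = ("rcx", "rdx", "r8", "r9")
HOME_SPACE = 4 * 8   # scratch space the caller always reserves
RETURN_ADDR = 8      # pushed by the call instruction


def int_arg_location(index):
    """Where the callee finds integer argument `index` (0-based) at entry."""
    if index < len(INT_ARG_REGS):
        return INT_ARG_REGS[index]
    # Stack arguments start just above the return address and scratch space.
    offset = RETURN_ADDR + HOME_SPACE + 8 * (index - len(INT_ARG_REGS))
    return f"[rsp+{offset:#x}]"
```

For example, int_arg_location(1) gives "rdx" and int_arg_location(4) gives "[rsp+0x28]".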
> The documentation should be used. 
> 
> I can summarize and such, but it is documented. 
> 
> Roughly, ignoring parameter passing and focusing only on exception 
> handling, 
> it goes like this: 
> 
> 
> - At any point in any program, "the stack" must be "unwindable". 
> I've never seen this clearly described. 
> It really boils down to "non-volatile registers must be restorable"
> by "a runtime" via documented/standardized metadata, so as to
> appear as if control was returned to any function on the call stack,
> w/o running any generated code in any of the functions between
> the current stack location and the resumed-to location.
> 
> 
> The stack pointer is often called out specially, but in fact 
> it is just another non volatile register and not really a 
> special case. 
> 
> 
> So then some details:
> A "leaf function" is a function that does not change any non-volatile
> registers, including the stack pointer. Leaf functions can do pretty much
> anything, but they must not change any non-volatile registers, which is
> a severe restriction. Having locals essentially makes you non-leaf, even
> if you don't call anything. A leaf function is *not* merely a function
> that makes no calls, but making a call does make a function non-leaf,
> because the call changes the stack pointer.
> 
> 
> The slight exception here is that all functions, including leaves, do
> have 4*8 bytes of scratch space in the stack available to them -- so
> local variables can be had, in that space and in volatile registers.
> 
> 
> The stack is walked from a leaf function merely by reading the return
> address from rsp. A leaf function can make a syscall, so it isn't
> necessarily at the bottom of the stack.
> 
> 
> Non-leaf functions are the interesting ones.
> They can change rsp, including via a call, and can change non-volatile
> registers, but all such changes (or rather, the saving of said
> registers) must be described by metadata, and the metadata must be
> findable -- via looking up a code address on the stack.
> 
> 
> Roughly speaking, all dlls have "pdata" -- procedure data.
> There are 3 UINT32s per non-leaf function.
> These are offsets into the image. Images are limited to 4GB in size.
> They point to the start of the function, the end of the function, and
> additional metadata.
> The additional metadata is called "xdata" or exception data.
> The offset to the metadata may be absent or 0, but that should be
> rare/nonexistent in practice -- it is for revealing leaf functions to
> static analysis, for example.
> 
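As a rough sketch of the lookup just described: the table is an array of three little-endian UINT32 offsets per function, and finding the entry for a code address is a range search. The helper names are hypothetical, and I am assuming the table is sorted by start offset and that the end offset is exclusive. The last helper shows the leaf case from earlier, where no entry is found and the return address is simply read from rsp.

```python
import struct
from bisect import bisect_right


def parse_pdata(blob):
    """Split raw pdata bytes into (begin, end, unwind_info) offset triples."""
    return [struct.unpack_from("<3I", blob, i)
            for i in range(0, len(blob) - 11, 12)]


def find_entry(entries, rva):
    """Binary-search the sorted table for the function containing `rva`."""
    begins = [e[0] for e in entries]
    i = bisect_right(begins, rva) - 1
    if i >= 0 and rva < entries[i][1]:
        return entries[i]
    return None  # no entry: treat the code at `rva` as a leaf function


def unwind_leaf(rsp, read_u64):
    """Leaf step: the return address is at [rsp]; popping it gives the caller's rsp."""
    return read_u64(rsp), rsp + 8
```

Here `read_u64` stands in for whatever reads target memory (a debugger API, a dump file, or the current process).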
> 
> The "xdata" is then what describes how to restore non-volatile
> registers, such as the order in which to pop them, or at what offset
> from the frame pointer or stack pointer they were saved (and which
> register, if any, is the frame pointer -- it doesn't have to be rbp,
> and most functions don't have one).
> 
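To make the xdata idea concrete, here is a minimal sketch that undoes a fully executed prologue by replaying unwind codes. It handles only two operations (a pushed non-volatile and a small stack allocation) and ignores frame pointers, chaining, and handlers. The byte layout and opcode values follow my reading of the documented UNWIND_INFO/UNWIND_CODE format; the function name and the dict-based memory model are hypothetical.

```python
import struct

UWOP_PUSH_NONVOL = 0   # a non-volatile register was pushed
UWOP_ALLOC_SMALL = 2   # rsp was lowered by info * 8 + 8 bytes
REGS = ("rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15")


def unwind_frame(xdata, stack, rsp):
    """Undo a completed prologue.

    `stack` maps addresses to 64-bit values (hypothetical memory model).
    Returns (restored_regs, return_address, caller_rsp).
    """
    # Header: version/flags, prologue size, code count, frame register info.
    _ver_flags, _prolog_size, count, _frame = struct.unpack_from("<4B", xdata, 0)
    restored = {}
    # Codes are stored last-prologue-instruction first, so replaying them
    # in array order undoes the prologue in reverse.
    for i in range(count):
        _code_offset, op_byte = struct.unpack_from("<2B", xdata, 4 + 2 * i)
        op, info = op_byte & 0xF, op_byte >> 4
        if op == UWOP_PUSH_NONVOL:
            restored[REGS[info]] = stack[rsp]
            rsp += 8
        elif op == UWOP_ALLOC_SMALL:
            rsp += info * 8 + 8
        else:
            raise NotImplementedError(f"unwind op {op}")
    # What remains on top of the stack is the return address.
    return restored, stack[rsp], rsp + 8
```

For a prologue of "push rbx; sub rsp, 0x20", the xdata would carry an ALLOC_SMALL code with info 3 followed by a PUSH_NONVOL code naming rbx, and the sketch restores rbx and the caller's rsp without running any of the function's own code.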
> 
> There are restrictions on code generation -- rsp changes and
> non-volatile saves must be describable with this metadata. There is a
> notion of the end of the prologue; at this point all non-volatiles that
> will be changed have been saved, and rsp changes are done. This is
> misleading though, in that almost arbitrary code can be interleaved
> within the prologue, i.e. changes to volatile registers.
> 
> 
> As well, as background, Windows/x64 functions generally don't change
> rsp except in their prologue and via the call instruction.
> They are not "pushy/poppy". A function that uses _alloca breaks that
> rule, however. Such functions must have a frame pointer, such as rbp,
> though it doesn't have to be rbp and often is not.
> 
> 
> There is also a notion of chaining the data. This is useful when 
> a function has "early out" paths that only change some non volatiles. 
> 
> 
> Also there is allowance for discontiguous functions. 
> 
> 
> Also there is no metadata for epilogues. If an exception occurs in an
> epilogue, the runtime actually looks at the code being run, detects
> that it is an epilogue, and simulates it. As such, epilogue code
> generation is constrained.
> (and breakpoints within epilogues mess things up!)
> 
> There is metadata for epilogues (UWOP_EPILOG), but it is only available
> on Windows 8.1 and newer.
> 
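The "detect the epilogue and simulate it" step can be sketched as a tiny instruction recogniser. This is a simplified illustration, not the actual runtime logic: it accepts only "add rsp, imm" followed by 64-bit pops and a plain ret, and ignores the lea and jmp forms. The function name is hypothetical, and the instruction encodings are standard x86-64 ones that I am supplying, not something taken from this thread.

```python
import struct


def simulate_epilogue(code, pc, rsp, read_u64):
    """Simulate the epilogue tail at code[pc:], if it is one.

    Returns (restored_regs, return_address, caller_rsp) with registers keyed
    by number, or None if the bytes are not in the subset recognised here.
    """
    restored = {}
    # Optional deallocation: add rsp, imm8 (48 83 C4 ib) or imm32 (48 81 C4 id).
    if code[pc:pc + 3] == b"\x48\x83\xc4" and len(code) >= pc + 4:
        rsp += struct.unpack_from("<b", code, pc + 3)[0]
        pc += 4
    elif code[pc:pc + 3] == b"\x48\x81\xc4" and len(code) >= pc + 7:
        rsp += struct.unpack_from("<i", code, pc + 3)[0]
        pc += 7
    # Zero or more pops: 58+r, or 41 58+r for r8..r15.
    while pc < len(code):
        rex_b = code[pc] == 0x41
        op_at = pc + 1 if rex_b else pc
        if op_at < len(code) and 0x58 <= code[op_at] <= 0x5F:
            reg = (code[op_at] - 0x58) + (8 if rex_b else 0)
            restored[reg] = read_u64(rsp)
            rsp += 8
            pc = op_at + 1
        else:
            break
    # The sequence must end in ret (C3) to count as an epilogue.
    if pc < len(code) and code[pc] == 0xC3:
        return restored, read_u64(rsp), rsp + 8
    return None
```

Because the unwinder has to pattern-match like this, an instruction replaced by a breakpoint in the middle of the sequence would no longer be recognised, which is the breakpoint problem mentioned above.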
> 
> 
> To repeat -- the unwindability is from any single instruction, be it
> in the middle of a prologue, the middle of an epilogue, or in the body
> of a function outside of prologue/epilogue.
> 
> 
> This unwindability serves both exception dispatch and debugger stack
> walking, and other things, like sampling profiler stack walking, or
> "leak tracking stack walking" -- stack walking is always possible,
> modulo bugs. The most common bugs are probably in hand-written
> assembly, since assembly programmers have to do basically all of the
> work themselves.
> 
> 
> There is provision for providing the pdata at runtime for JITed code. 
> 
> 
> The linker has to combine all the pdata and place a pointer (offset)
> to it in a documented place in the PE, similar to how imports and
> exports and base relocations are recorded.
> 
> 
> Anyway, see the documentation. 
> 
> 
> - Jay 
> _______________________________________________ 
> LLVM Developers mailing list 
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev 
>