[LLVMdev] Preserving accurate stack traces with optimization?
David Blaikie
dblaikie at gmail.com
Wed Oct 30 11:21:26 PDT 2013
Actually CCing Eric.
On Wed, Oct 30, 2013 at 11:00 AM, Quentin Colombet <qcolombet at apple.com>wrote:
> Philip,
>
> Thanks for the clarification.
>
> As far as I can tell, there is currently no way to preserve a full and
> accurate stack trace while utilizing most of LLVM’s optimization abilities.
>
> The work on debug information may help you get the information you need,
> but I do not think we will provide information on stack frames that have
> been removed via inlining or tail call.
>
In theory, at -gmlt we should emit enough debug info to give you accurate
stack traces including inlined frames. Tail calls I assume we can't do
anything about.
> Moreover, if at some point you also need the values of the arguments of a
> removed stack frame, this seems heroic to be able to provide such
> information.
>
Also, in theory, we should be able to describe the locations of function
arguments/parameters to that inlined call, except for those optimized away
entirely - but that's probably not quite as well tested/implemented at this
point.
> This is my understanding of what we have currently, folks working on the
> debug support may give you more inputs on that (CC’ed Eric).
> As for the sanitizer, I have no idea what stack trace they are reporting,
> I let them comment on that.
>
I believe they use llvm-symbolizer which uses the debug info to get inlined
stack frames. That's the main use case for -gmlt.
>
> ** To Eric **
> Could you comment on the way we are generating stack frame information and
> in particular how inlining is handled, i.e., does an inlined function
> showed up in the stack frame information (seems unlikely, but who knows :)).
>
> -Quentin
>
> On Oct 30, 2013, at 10:24 AM, Philip Reames <listmail at philipreames.com>
> wrote:
>
> On 10/30/13 9:56 AM, Quentin Colombet wrote:
>
> Hi Philip,
>
> Could you define what is an accurate stack trace for your project?
> In other words, what do you mean by full and accurate stack frame?
>
> Without this definition, this is difficult to give you any feedback. In
> particular, I do not see what it means when we use inlining.
>
> Sure. Just to note, I *think* your example was exactly what we're looking
> for. I got a bit confused about your notation, so I'm going to start from
> scratch.
>
> By a "full and accurate stack trace" in the face of inlining, I mean the
> exact stack trace you would get without any inlining (i.e. in a unoptimized
> build.) To put this another way, I need to be able to distinguish the path
> by which a function was inlined. Consider the following example (in
> approximate C):
>
> void a() {
> if( randomly_true ) print_stack_trace();
> }
> void b() {
> a();
> }
> void c() {
> a();
> }
> void main() {
> b();
> c();
> }
>
> In our environment, we need to be able to distinguish the traces
> "a;b;main" from "a;c;main" reliably. We need this regardless of what
> decisions the optimizer might make about inlining (or other optimizations
> for that matter).
>
> For another example, "a" might be a routine which requires privileges to
> execute. "b" might a routine which adds privileges. "c" might be an
> untrusted routine. Calling "a" from "b" will succeed. Calling "a" from
> "c" will generate an exception. (We can handle all the details around when
> to throw exceptions if the information about stack traces is accurate and
> trustworthy.)
>
> Side note: The permission question above can't be addressed statically.
> The call to the privileged routine doesn't have to be direct. "a" could be
> a series of frames "a1...aN" where "aN" is actually the privileged one.
> There can also be virtual (or other runtime dispatch) calls in that path
> which prevent static analysis.
>
> Does that help clarify what we're looking for?
>
> Philip
>
>
> E.g., what do you expect from code like this:
> static void fct1(…) {
> ...
> }
>
> static void fct2(…) {
> …
> fct1(…)
> ...
> }
>
> void fct3(…) {
> fct1(...)
> …
> fct2(…)
> …
> }
>
> Assuming everything is inlined in fct3, you get:
> void fct3(…) {
> ….
> 1. fct1_inst1… fct1_instN
> ….
> 2. fct2_inst1… fct2_instK
> 3. fct1_inst1… fct1_instN
> 4. fct2_instzK+1… fct2_instN
> ...
> }
>
> Does it mean you what something like this each point of interest for you
> stack frame:
> 1.
> #0 fct1
> #1 fct3
>
> 2.
> #0 fct2
> #1 fct3
>
> 3.
> #0 fct1
> #1 fct2
> #2 fct3
>
> 4.
> #0 fct2
> #1 fct3
>
> Cheers,
> -Quentin
>
> On Oct 28, 2013, at 2:56 PM, Philip Reames <listmail at philipreames.com>
> wrote:
>
> Is there a known way to preserve a full and accurate stack trace while
> utilizing most of LLVM's optimization abilities?
>
> We are investigating using LLVM as a JIT for a language which requires the
> ability to generate an accurate stack trace from any arbitrary point(1)
> during the execution. I know that we can make this work by doing inlining
> externally, manually recording virtual frames, and disabling optimizations
> such as tail call optimizations. To me, this seems like an unpleasant hack
> that would likely inhibit much of LLVM's built in optimizing ability. I
> suspect that if we ended up having to pursue this strategy, it would likely
> greatly diminish the benefit we could get by moving to an LLVM backend. (2)
>
> Currently, I am aware of two lines of related work. First, I know that
> there has been some work into enabling full speed debug builds (-g -O3) for
> Clang which may be related. Second, I know that the various sanitizer
> tools include stack traces in their reporting.
>
> What I have not been able to establish is the intended semantics of these
> approaches. Is the intent that a stack trace will always be preserved? Or
> simply that a best effort will be made to preserve the stack trace? Since
> for us the need to preserve a full stack trace is a matter of correctness,
> we couldn't use a mechanism which only provided best effort semantics.
>
> Are there other lines of related work that I have missed? Are there any
> other language implementations out there that have already solved this
> problem? I would welcome references to existing implementations or
> suggestions on how to approach this problem.
>
> Philip
>
> p.s. I know that there are a number of possible approaches to identifying
> when a bit of code doesn't actually need a full stack trace and optimizing
> these more aggressively. We're considering a number of these approaches,
> but I am mostly interested in identifying a reasonable high performance
> base implementation at this time. (Feel free to comment if you think this
> is the wrong approach.)
>
> (1) Technically, the semantics are slightly more limited then I've
> described. The primary usage is for exceptions, security checking, and a
> couple of rarely used routines in the standard library.
> (2) I haven't actually measured this yet. If anyone feels my intuition is
> likely off here, let me know and I'll invest the time to actually do so.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131030/621b9e53/attachment.html>
More information about the llvm-dev
mailing list