[LLVMdev] Preserving accurate stack traces with optimization?

Philip Reames listmail at philipreames.com
Mon Oct 28 14:56:13 PDT 2013


Is there a known way to preserve a full and accurate stack trace while 
utilizing most of LLVM's optimization abilities?

We are investigating using LLVM as a JIT for a language which requires 
the ability to generate an accurate stack trace from any point(1) during 
execution.  I know that we can make this work by doing inlining 
externally, manually recording virtual frames, and disabling 
optimizations such as tail call optimization.  To me, this seems like an 
unpleasant hack that would inhibit much of LLVM's built-in optimization 
ability.  I suspect that if we ended up having to pursue this strategy, 
it would greatly diminish the benefit we could get by moving to an LLVM 
backend. (2)
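
For concreteness, here is a rough sketch of the kind of "virtual frame" 
bookkeeping I have in mind.  The helper names (push_virtual_frame and 
friends) are made up for illustration and are not part of any existing 
LLVM or runtime API; the idea is that the JIT'd code would be 
instrumented to call them around every source-level call that survives 
external inlining:

  // Hypothetical runtime-maintained shadow stack of "virtual frames".
  // The JIT'd code calls push/pop around each source-level call, so a
  // trace can be produced at any point even after LLVM has inlined or
  // otherwise rearranged the machine-level frames.
  #include <cstdint>
  #include <cstdio>
  #include <vector>

  struct VirtualFrame {
    const char *function;      // source-level function name
    uint32_t bytecode_offset;  // current position within that function
  };

  // One shadow stack per guest-language thread.
  static thread_local std::vector<VirtualFrame> g_shadow_stack;

  extern "C" void push_virtual_frame(const char *fn, uint32_t off) {
    g_shadow_stack.push_back({fn, off});
  }

  extern "C" void pop_virtual_frame() {
    g_shadow_stack.pop_back();
  }

  extern "C" void update_virtual_pc(uint32_t off) {
    // Keep the top frame's position current; assumes the stack is non-empty.
    g_shadow_stack.back().bytecode_offset = off;
  }

  // Walk the shadow stack instead of the machine stack; the result is
  // exact regardless of inlining or tail call elimination in the JIT'd code.
  extern "C" void print_stack_trace() {
    for (auto it = g_shadow_stack.rbegin(); it != g_shadow_stack.rend(); ++it)
      std::fprintf(stderr, "  at %s (+%u)\n", it->function,
                   (unsigned)it->bytecode_offset);
  }

Every call and position update pays for that bookkeeping whether or not 
a trace is ever requested, which is exactly the sort of overhead I would 
like to avoid.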

Currently, I am aware of two lines of related work.  First, I know that 
there has been some work on enabling full-speed debug builds (-g -O3) 
for Clang, which may be related.  Second, I know that the various 
sanitizer tools include stack traces in their reporting.

What I have not been able to establish is the intended semantics of 
these approaches.  Is the intent that a stack trace will always be 
preserved, or simply that a best effort will be made to preserve it?  
Since preserving a full stack trace is a matter of correctness for us, 
we couldn't use a mechanism which only provides best-effort semantics.

Are there other lines of related work that I have missed?  Are there any 
other language implementations out there that have already solved this 
problem?  I would welcome references to existing implementations or 
suggestions on how to approach this problem.

Philip

p.s. I know that there are a number of possible approaches to 
identifying when a piece of code doesn't actually need a full stack 
trace and optimizing it more aggressively.  We're considering several of 
these, but at this time I am mostly interested in identifying a 
reasonable, high-performance base implementation.  (Feel free to 
comment if you think this is the wrong approach.)

(1) Technically, the semantics are slightly more limited than I've 
described.  The primary usage is for exceptions, security checking, and 
a couple of rarely used routines in the standard library.
(2) I haven't actually measured this yet.  If anyone feels my intuition 
is likely off here, let me know and I'll invest the time to measure it.


