<div dir="ltr">Actually CCing Eric.<div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Oct 30, 2013 at 11:00 AM, Quentin Colombet <span dir="ltr"><<a href="mailto:qcolombet@apple.com" target="_blank">qcolombet@apple.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>Philip,</div><div><br></div><div>Thanks for the clarification.</div><div><br></div>

<div>As far as I can tell, there is currently no way to preserve a full and accurate stack trace while utilizing most of LLVM's optimization abilities.</div><div><br></div><div>The work on debug information may help you get the information you need, but I do not think we will provide information on stack frames that have been removed via inlining or tail call.</div>

</div></blockquote><div><br></div><div>In theory, at -gmlt we should emit enough debug info to give you accurate stack traces including inlined frames. Tail calls I assume we can't do anything about.</div><div>&nbsp;</div>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>Moreover, if at some point you also need the values of the arguments of a removed stack frame, providing such information seems heroic.</div>

</div></blockquote><div><br></div><div>Also, in theory, we should be able to describe the locations of function arguments/parameters to that inlined call, except for those optimized away entirely - but that's probably not quite as well tested/implemented at this point.</div>

<div>&nbsp;</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>This is my understanding of what we have currently; folks working on the debug support may give you more input on that (CC'ed Eric).<br>

</div><div>As for the sanitizers, I have no idea what stack traces they report; I will let them comment on that.</div></div></blockquote><div><br></div><div>I believe they use llvm-symbolizer, which uses the debug info to get inlined stack frames. That's the main use case for -gmlt.</div>
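<div><br></div><div>As a rough illustration of how inlined frames can be recovered from debug info: for each address range of the emitted code, the debug info can record the chain of functions that were inlined to produce it. The sketch below is a toy model of that lookup only. It is not llvm-symbolizer's API or the real debug info encoding, and the InlineRange type, the symbolize function and the addresses are all hypothetical.</div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model: one address range of a physical function, plus the
 * chain of functions that were inlined to produce the code in that range,
 * innermost first. Real debug info encodes this differently. */
#define MAX_CHAIN 4

typedef struct {
    uintptr_t lo, hi;              /* half-open range [lo, hi) */
    const char *chain[MAX_CHAIN];  /* innermost frame first, NULL-terminated */
} InlineRange;

/* Made-up layout for main() after b() and c() (and a() inside each) were inlined. */
static const InlineRange table[] = {
    { 0x1000, 0x1010, { "a", "b", "main", NULL } },
    { 0x1010, 0x1020, { "a", "c", "main", NULL } },
    { 0x1020, 0x1030, { "main", NULL } },
};

/* Copy the logical frames for pc into out (capacity cap); return the frame
 * count, or 0 if pc is not covered by the table. */
size_t symbolize(uintptr_t pc, const char **out, size_t cap) {
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (pc >= table[i].lo && pc < table[i].hi) {
            size_t n = 0;
            while (n < MAX_CHAIN && n < cap && table[i].chain[n] != NULL) {
                out[n] = table[i].chain[n];
                n++;
            }
            return n;
        }
    }
    return 0;
}
```

<div>A single physical frame for main thus expands into several logical frames, and the same inlined callee maps to different chains depending on the address it was inlined at.</div>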

<div>&nbsp;</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><br></div><div>** To Eric **</div><div>Could you comment on the way we are generating stack frame information and, in particular, how inlining is handled, i.e., does an inlined function show up in the stack frame information (seems unlikely, but who knows :))?</div>

<span class="HOEnZb"><font color="#888888"><div><br></div><div>-Quentin</div></font></span><div><div class="h5"><div><br></div>On Oct 30, 2013, at 10:24 AM, Philip Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>> wrote:<br>

<div><br><blockquote type="cite">

  <div bgcolor="#FFFFFF" text="#000000">

    <div>On 10/30/13 9:56 AM, Quentin Colombet

      wrote:<br>

    </div>

    <blockquote type="cite">

      Hi Philip,

      <div><br>

      </div>

      <div>Could you define what is an accurate stack trace for your

        project?</div>

      <div>In other words, what do you mean by full and accurate stack

        frame?</div>

      <div><br>

      </div>

      <div>Without this definition, it is difficult to give you any

        feedback. In particular, I do not see what it means when we use

        inlining.</div>

    </blockquote>

    Sure. Just to note, I *think* your example was exactly what we're

    looking for. I got a bit confused about your notation, so I'm going

    to start from scratch.<br>

    <br>

    By a "full and accurate stack trace" in the face of inlining, I mean

    the exact stack trace you would get without any inlining (i.e. in an

    unoptimized build). To put this another way, I need to be able to

    distinguish the path by which a function was inlined. Consider the

    following example (in approximate C):<br>

    <br>

    void a() {<br>

    &nbsp; if (randomly_true) print_stack_trace();<br>

    }<br>

    void b() {<br>

    &nbsp; a();<br>

    }<br>

    void c() {<br>

    &nbsp; a();<br>

    }<br>

    void main() {<br>

    &nbsp; b();<br>

    &nbsp; c();<br>

    }<br>

    <br>

    In our environment, we need to be able to distinguish the traces

    "a;b;main" from "a;c;main" reliably. We need this regardless of

    what decisions the optimizer might make about inlining (or other

    optimizations for that matter).<br>
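    <br>
    As a rough sketch of that requirement, the manual virtual-frame recording mentioned in the original message could look like the following. The names frame_enter, frame_exit, format_trace and last_trace are hypothetical, and this only illustrates the bookkeeping; it is not something LLVM provides:<br>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical shadow stack: each function records a virtual frame on entry
 * and removes it on exit, so the logical trace survives inlining of the
 * real frames. */
#define MAX_DEPTH 64

static const char *shadow[MAX_DEPTH];
static size_t depth = 0;

void frame_enter(const char *name) {
    assert(depth < MAX_DEPTH);
    shadow[depth++] = name;
}

void frame_exit(void) {
    assert(depth > 0);
    depth--;
}

/* Write the current trace, innermost first, as "a;b;main" into buf. */
void format_trace(char *buf, size_t cap) {
    size_t used = 0;
    if (cap > 0) buf[0] = '\0';
    for (size_t i = depth; i > 0 && used < cap; i--) {
        int n = snprintf(buf + used, cap - used, "%s%s",
                         shadow[i - 1], i > 1 ? ";" : "");
        if (n < 0) break;
        used += (size_t)n;
    }
}

/* Stand-in for print_stack_trace(): keeps the most recent trace for inspection. */
static char last_trace[128];

void a(void) { frame_enter("a"); format_trace(last_trace, sizeof last_trace); frame_exit(); }
void b(void) { frame_enter("b"); a(); frame_exit(); }
void c(void) { frame_enter("c"); a(); frame_exit(); }
```

    With "main" recorded as the outermost frame, calling b() leaves "a;b;main" in last_trace and calling c() leaves "a;c;main", whether or not the compiler inlines a(), b() or c(). The price is extra bookkeeping on every call and return.<br>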

    <br>

    For another example, "a" might be a routine which requires

    privileges to execute. "b" might be a routine which adds privileges.

    "c" might be an untrusted routine. Calling "a" from "b" will

    succeed. Calling "a" from "c" will generate an exception. (We can

    handle all the details around when to throw exceptions if the

    information about stack traces is accurate and trustworthy.)<br>
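    <br>
    A minimal sketch of such a check, assuming the trace is already available as an array of frames, innermost first. The Frame type, the FrameKind markers and the "nearest marked frame decides" policy are assumptions made for illustration:<br>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical frame classification for the privilege example. */
typedef enum { FRAME_PLAIN, FRAME_GRANTS, FRAME_UNTRUSTED } FrameKind;

typedef struct {
    const char *name;
    FrameKind kind;
} Frame;

/* Walk the trace from the innermost frame outward. Assumed policy: the
 * nearest frame that either grants privileges or is untrusted decides the
 * outcome; a trace with no such frame is denied. */
bool may_call_privileged(const Frame *trace, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (trace[i].kind == FRAME_GRANTS) return true;
        if (trace[i].kind == FRAME_UNTRUSTED) return false;
    }
    return false;
}
```

    If inlining silently dropped the "b" or "c" frame from the trace, this walk would give the wrong answer, which is why a best-effort trace is not enough here.<br>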

    <br>

    Side note: The permission question above can't be addressed

    statically. The call to the privileged routine doesn't have to be

    direct. "a" could be a series of frames "a1...aN" where "aN" is

    actually the privileged one. There can also be virtual (or other

    runtime dispatch) calls in that path which prevent static analysis.<br>

    <br>

    Does that help clarify what we're looking for?<br>

    <br>

    Philip<br>

    <br>

    <br>

    <blockquote type="cite">

      <div>E.g., what do you expect from code like this:</div>

      <div>static void fct1(&hellip;) {</div>

      <div>&nbsp; ...</div>

      <div>}</div>

      <div><br>

      </div>

      <div>static void fct2(&hellip;) {</div>

      <div>&nbsp; &hellip;</div>

      <div>&nbsp; fct1(&hellip;)</div>

      <div>&nbsp; ...</div>

      <div>}</div>

      <div><br>

      </div>

      <div>void fct3(&hellip;) {</div>

      <div>&nbsp; fct1(...)</div>

      <div>&nbsp; &hellip;</div>

      <div>&nbsp; fct2(&hellip;)</div>

      <div>&nbsp; &hellip;</div>

      <div>}</div>

      <div><br>

      </div>

      <div>Assuming everything is inlined in fct3, you get:</div>

      <div>void fct3(&hellip;) {</div>

      <div>&nbsp; &nbsp;&hellip;</div>

      <div>1. &nbsp; fct1_inst1&hellip; fct1_instN</div>

      <div>&nbsp; &nbsp;&hellip;</div>

      <div>2. &nbsp; fct2_inst1&hellip; fct2_instK</div>

      <div>3. &nbsp; fct1_inst1&hellip; fct1_instN</div>

      <div>4. &nbsp; fct2_instK+1&hellip; fct2_instN</div>

      <div>&nbsp; &nbsp;...</div>

      <div>}</div>

      <div><br>

      </div>

      <div>Does it mean you want something like this at each point of

        interest for your stack frame:</div>

      <div>1.</div>

      <div>#0 fct1</div>

      <div>#1 fct3</div>

      <div><br>

      </div>

      <div>2.</div>

      <div>

        <div>#0 fct2</div>

        <div>#1 fct3</div>

      </div>

      <div><br>

      </div>

      <div>

        <div>3.</div>

        <div>

          <div>#0 fct1</div>

          <div>#1 fct2</div>

          <div>#2 fct3</div>

        </div>

      </div>

      <div><br>

      </div>

      <div>

        <div>

          <div>4.</div>

          <div>

            <div>#0 fct2</div>

          </div>

        </div>

      </div>

      <div>#1 fct3</div>

      <div><br>

      </div>

      <div>Cheers,<br>

        <div>

          <div style="font-family:Helvetica;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;word-wrap:break-word">

-Quentin</div>

        </div>

        <br>

        <div>

          <div>On Oct 28, 2013, at 2:56 PM, Philip Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>>

            wrote:</div>

          <br>

          <blockquote type="cite">Is there a known way to preserve a

            full and accurate stack trace while utilizing most of LLVM's

            optimization abilities?<br>

            <br>

            We are investigating using LLVM as a JIT for a language

            which requires the ability to generate an accurate stack

            trace from any arbitrary point(1) during the execution. I

            know that we can make this work by doing inlining

            externally, manually recording virtual frames, and disabling

            optimizations such as tail call optimizations. To me, this

            seems like an unpleasant hack that would likely inhibit much

            of LLVM's built-in optimizing ability. I suspect that if we

            ended up having to pursue this strategy, it would likely

            greatly diminish the benefit we could get by moving to an

            LLVM backend. (2)<br>

            <br>

            Currently, I am aware of two lines of related work. First,

            I know that there has been some work into enabling full

            speed debug builds (-g -O3) for Clang which may be related.

            Second, I know that the various sanitizer tools include

            stack traces in their reporting.<br>

            <br>

            What I have not been able to establish is the intended

            semantics of these approaches. Is the intent that a stack

            trace will always be preserved? Or simply that a best

            effort will be made to preserve the stack trace? Since for

            us the need to preserve a full stack trace is a matter of

            correctness, we couldn't use a mechanism which only provided

            best effort semantics.<br>

            <br>

            Are there other lines of related work that I have missed?

            Are there any other language implementations out there that

            have already solved this problem? I would welcome

            references to existing implementations or suggestions on how

            to approach this problem.<br>

            <br>

            Philip<br>

            <br>

            p.s. I know that there are a number of possible approaches

            to identifying when a bit of code doesn't actually need a

            full stack trace and optimizing these more aggressively.

            We're considering a number of these approaches, but I am

            mostly interested in identifying a reasonable high

            performance base implementation at this time. (Feel free to

            comment if you think this is the wrong approach.)<br>

            <br>

            (1) Technically, the semantics are slightly more limited

            than I've described. The primary usage is for exceptions,

            security checking, and a couple of rarely used routines in

            the standard library.<br>

            (2) I haven't actually measured this yet. If anyone feels

            my intuition is likely off here, let me know and I'll invest

            the time to actually do so.<br>

            _______________________________________________<br>

            LLVM Developers mailing list<br>

            <a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>

            &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="http://llvm.cs.uiuc.edu/" target="_blank">http://llvm.cs.uiuc.edu</a><br>

            <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>

          </blockquote>

        </div>

        <br>

      </div>

    </blockquote>

    <br>

  </div>

</blockquote></div><br></div></div></div><br>

<br></blockquote></div><br></div></div>