<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    <div class="moz-cite-prefix">On 05/02/2014 04:37 PM, Kevin

      Modzelewski wrote:<br>

    </div>

    <blockquote

cite="mid:CAO=oM6sA5L8VqwDas8AKTgj1V10KzFA6kakXaF4MSBgSune14A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>That's definitely good confirmation to hear that the

          test+branch for every call does in fact add noticeable

          overhead -- thanks for the datapoints.<br>

        </div>

        <div><br>

        </div>

        <div style="">What I'm taking away from this is that even within

          the space of "unwind-based exception handling using DWARF CFI

          side-tables", there is a fair amount of room for different

          approaches with different tradeoffs, and also potentially room

          for a custom-tailored unwinder to beat libgcc.  That's

          definitely good to know, and you guys have encouraged me to

          peel back the magic one more layer and try to implement my own

          unwinder :)</div>

      </div>

    </blockquote>

    Fair warning, I have absolutely no idea if our current

    implementation is actually a good idea or not.  We need to get back

    to that and actually benchmark the various options.  :)  We've been

    experimenting wildly, but without much rigour.  We've been mainly

    focused on identifying the possible options within LLVM.  <br>

    <br>

    <blockquote

cite="mid:CAO=oM6sA5L8VqwDas8AKTgj1V10KzFA6kakXaF4MSBgSune14A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div style=""><br>

        </div>

        <div style="">As for switching between unwind-based exceptions

          and checked-status-code exceptions, I'm not quite sure I buy

          that that can completely be done by the catching function,

          since the throwing function also needs to use the matching

          mechanism.  <br>

        </div>

      </div>

    </blockquote>

    I think what we do at the moment is *always* set the 'pending

    exception' flag, even if we're going to use unwind-table-based

    dispatch.  As a result, any frame can decide to use either

    mechanism.  I'll point out though that this is purely an accident of

    implementation.  We didn't purposely design it this way.  :)<br>
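    <br>
    A rough Python model of that idea, not the actual implementation.<br>
    The names (ThreadState, Unwind, throw, take_pending) are made up for<br>
    illustration, and how control gets back to each frame is left out. The<br>
    only point is that both kinds of frame read the exception from the same<br>
    place:<br>

```python
class Unwind(Exception):
    """Hypothetical stand-in for unwind-table-based dispatch."""


class ThreadState:
    def __init__(self):
        self.pending = None  # the 'pending exception' word


def throw(ts, exc, unwind):
    ts.pending = exc  # always recorded, whichever path is taken
    if unwind:
        raise Unwind()
    return None  # status-code path: the caller must test ts.pending


def take_pending(ts):
    # Read and clear the pending exception word.
    exc, ts.pending = ts.pending, None
    return exc


def frame_with_landing_pad(ts):
    try:
        throw(ts, "StopIteration", unwind=True)
    except Unwind:
        return take_pending(ts)  # the handler reads the shared word
    return None


def frame_with_check(ts):
    throw(ts, "StopIteration", unwind=False)
    if ts.pending is not None:  # explicit test+branch after the call
        return take_pending(ts)
    return None
```

    Both frame_with_landing_pad and frame_with_check end up with the same<br>
    exception value, which is why either one can sit above the same thrower.<br>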

    <br>

    <br>

    <blockquote

cite="mid:CAO=oM6sA5L8VqwDas8AKTgj1V10KzFA6kakXaF4MSBgSune14A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div style="">I think if you truly want to do this, you need to

          compile separate variants of whatever functions you might call

          (including whatever functions they might call), one for each

          exception mechanism you want to use.  I'm thinking about doing

          this, but only for certain built-in functions that are

          expected to throw a lot.  Another option I'm thinking of is to

          inline those particular functions and then create an

          optimization pass that will know that py_throw always throws,

          and stitch up the CFG appropriately.  Anyway, lots to chew on,

          thanks everyone for the responses!</div>

      </div>

    </blockquote>

    I'll just mention that you really really want to translate

    throw/catch pairs in the same function into a direct jump where

    possible.  :)  In fact, LLVM should be doing this for you during

    inlining if you structure your IR properly.  Are you not seeing this

    in practice?<br>
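    <br>
    To illustrate the rewrite I mean, here is a toy sketch in Python. It<br>
    uses a made-up dictionary IR, not LLVM's data structures or its<br>
    inliner; the terminator names ("throw", "br", "ret") and the function<br>
    name fold_local_throws are assumptions for the example only:<br>

```python
def fold_local_throws(blocks):
    """Rewrite throws that have a handler in the same function.

    blocks maps a label to (terminator, operand).
    ("throw", pad) means the throw is caught by block `pad` in this function.
    ("throw", None) means there is no local handler, so it must really unwind.
    """
    out = {}
    for label, (term, operand) in blocks.items():
        if term == "throw" and operand is not None:
            out[label] = ("br", operand)  # direct jump to the handler block
        else:
            out[label] = (term, operand)  # left untouched
    return out
```

    A block ending in ("throw", "handler") comes out as ("br", "handler"),<br>
    while a throw with no local handler is left alone.<br>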

    <br>

    <blockquote

cite="mid:CAO=oM6sA5L8VqwDas8AKTgj1V10KzFA6kakXaF4MSBgSune14A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div style=""><br>

        </div>

        <div style=""><br>

        </div>

        <div style=""><br>

        </div>

        <div style="">Aside about Python exceptions: Python has

          interesting for loops, which are always for-each loops and

          implement the termination condition using exceptions:</div>

        <div style=""><br>

        </div>

        <div style="">PyObject *iterator; // what we're iterating over</div>

        <div style="">while (true) {</div>

        <div style="">    PyObject* i;</div>

        <div style="">    try {</div>

        <div style="">        i = iterator.next();</div>

        <div style="">    } except (StopIteration) {</div>

        <div style="">        break;</div>

        <div style="">    }</div>

        <div style="">    // do stuff</div>

        <div style="">}</div>

        <div style=""><br>

        </div>

        <div style="">Percentage-wise, throwing the StopIteration might

          be rare, but I would wager that most loops get terminated this

          way (as opposed to a "break" statement) so it's certainly not

          never; I think this means the exception gets thrown enough

          that it's better to handle the exception in-line rather than

          do a deopt-on-throw.  Microbenchmarks suggest that for-loop

          overhead is important enough that it's further worth trying to

          avoid any exception-related unwinding entirely, but I'm not

          sure how true that is for larger programs (probably somewhat

          true).</div>

      </div>

    </blockquote>

    For this case in particular, you probably want to avoid throwing

    exceptions at all.  If you inline the next() function to expose the

    throw, you should be able to convert the "throw; catch;" into a

    branch to the exit block.  This will really really help your

    performance as compared to just about any other option.<br>
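    <br>
    As a source-level picture of what the inlined loop should look like,<br>
    here is a small Python sketch. It assumes the iterable is a list, and<br>
    the hand-written index check stands in for an inlined next(); both<br>
    function names are hypothetical:<br>

```python
def sum_with_exception(items):
    # Termination is signalled by StopIteration, as in the quoted loop.
    it = iter(items)
    total = 0
    while True:
        try:
            i = next(it)
        except StopIteration:
            break
        total += i
    return total


def sum_with_branch(items):
    # next() for a list iterator written out by hand ("inlined").
    index = 0
    total = 0
    while True:
        if index == len(items):  # where the throw used to be
            break  # plain branch to the exit block, no unwinding
        i = items[index]
        index += 1
        total += i
    return total
```

    The two functions compute the same result; the second never creates<br>
    or catches an exception to leave the loop.<br>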

    <br>

    Philip<br>

    <blockquote

cite="mid:CAO=oM6sA5L8VqwDas8AKTgj1V10KzFA6kakXaF4MSBgSune14A@mail.gmail.com"

      type="cite">

      <div dir="ltr"><br>

        <div class="gmail_extra">

          <div class="gmail_quote">On Fri, May 2, 2014 at 12:43 PM,

            Sanjoy Das <span dir="ltr">&lt;<a moz-do-not-send="true"

                href="mailto:sanjoy@azulsystems.com" target="_blank">sanjoy@azulsystems.com</a>&gt;</span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi

              Kevin,<br>

              <br>

              To elaborate on Philip's point, depending on the state

              Pyston's<br>

              runtime already is in, you may have the choice of using a

              hybrid of a<br>

              "pending exception" word in your runtime thread structure,

              and an<br>

              implicit alternate ("exceptional") return address for

              calls into<br>

              functions that may throw.  This lets you elide the check

              on the<br>

              pending exception word after calls by turning them into

              invokes that<br>

              unwind into a landingpad containing a generic exception

              handler.  This<br>

              generic exception handler then checks the type of the

              pending<br>

              exception word and handles the exception (which may

              involve rethrowing<br>

              to the caller if the current frame doesn't have a catch

              handler).<br>

              <br>

              Instead of relying on libgcc to unwind when you throw you

              can then<br>

              parse the [call PC, generic exception handling PC] pairs

              from the<br>

              .eh_frame section, and when throwing to your caller, look

              up the<br>

              generic exception handling PC (using the call PC pushed on

              the stack)<br>

              and "return" to that instead.  Rethrow is similar.<br>

              <br>

              This scheme has the disadvantage of "returning" through

              every active<br>

              frame on an exception throw, even if a particular frame

              never had an<br>

              exception handler and could've been skipped safely.

               However, this<br>

              scheme allows you to easily switch to one of two other

              implementations<br>

              based on profiling data on a per-callsite basis:<br>

              <br>

               1. high exception volume -- if an invoke has seen too

              many exception<br>

                  throws, recompile by replacing the invoke with a call

              followed by<br>

                  a test of "pending exception" and branch.  The logic

              to generate<br>

                  the branch target should largely be the same as logic

              to generate<br>

                  the landing pad block.<br>

              <br>

               2. low exception volume -- keep the invoke, but put a

              deoptimization<br>

                  trap in the landing pad block.<br>

              <br>

              We did some rough benchmarking, and using such implicit

              exceptions<br>

              (i.e. not explicitly checking the pending exception word)

              reduces<br>

              non-throwing call overhead by 20-25%.  I don't have any

              numbers on how<br>

              it affects the performance of exceptional control flow

              though.<br>

              <span class=""><font color="#888888"><br>

                  -- Sanjoy<br>

                  <br>

                </font></span></blockquote>

          </div>

          <br>

        </div>

      </div>

    </blockquote>
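    <br>
    The [call PC, generic exception handling PC] lookup from the quoted<br>
    message can be modelled roughly as below. This is only a sketch: the<br>
    table is a plain dictionary rather than something parsed from<br>
    .eh_frame, the addresses in the usage note are invented, and<br>
    throw_to_caller and has_catch are hypothetical names:<br>

```python
def throw_to_caller(stack, handler_table, has_catch):
    """Walk outwards one frame at a time, 'returning' to each handler PC.

    stack: call PCs pushed on the stack, innermost last.
    handler_table: maps a call PC to its generic exception handling PC.
    has_catch: handler PCs whose frame actually has a catch handler.
    """
    visited = []
    for call_pc in reversed(stack):
        handler_pc = handler_table[call_pc]
        visited.append(handler_pc)  # every active frame is entered
        if handler_pc in has_catch:
            return handler_pc, visited
        # No catch handler in this frame: rethrow to the next caller.
    return None, visited
```

    With call PCs 0x100, 0x200, 0x300 mapped to handlers 0x180, 0x280,<br>
    0x380 and only 0x180 catching, the walk passes through all three<br>
    handlers, which is the "returning through every active frame" cost<br>
    mentioned in the quoted message.<br>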

    <br>
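    The per-callsite choice between the two recompilation options in the<br>
    quoted message could be driven by something like the following. The<br>
    thresholds (high, low) and the strategy labels are pure assumptions<br>
    for illustration; nothing in this thread says what values make sense:<br>

```python
def choose_strategy(calls, throws, high=0.05, low=0.001):
    """Pick an exception mechanism for one callsite from profile counts."""
    if calls == 0:
        return "invoke+generic-handler"  # no data yet: keep the default
    rate = throws / calls
    if rate >= high:
        # High exception volume: plain call, then test 'pending exception'.
        return "call+test-pending"
    if low >= rate:
        # Low exception volume: keep the invoke, deoptimize in the landing pad.
        return "invoke+deopt-trap"
    return "invoke+generic-handler"
```

    For example, a callsite that throws on half of its calls would be<br>
    recompiled to the call-and-test form, while one that almost never<br>
    throws would get the deoptimization trap.<br>
    <br>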

  </body>

</html>