<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
<div class="moz-cite-prefix">On 05/02/2014 04:37 PM, Kevin
Modzelewski wrote:<br>
</div>
<blockquote
cite="mid:CAO=oM6sA5L8VqwDas8AKTgj1V10KzFA6kakXaF4MSBgSune14A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>That's definitely good confirmation to hear that the
test+branch for every call does in fact add noticeable
overhead -- thanks for the datapoints.<br>
</div>
<div><br>
</div>
<div style="">What I'm taking away from this is that even within
the space of "unwind-based exception handling using DWARF CFI
side-tables", there is a fair amount of room for different
approaches with different tradeoffs, and also potentially room
for a custom-tailored unwinder to beat libgcc. That's
definitely good to know, and you guys have encouraged me to
peel back the magic one more layer and try to implement my own
unwinder :)</div>
</div>
</blockquote>
Fair warning: I have absolutely no idea whether our current
implementation is actually a good idea. We need to get back
to that and actually benchmark the various options. :) We've been
experimenting wildly, but without much rigour; so far we've mainly
focused on identifying the possible options within LLVM. <br>
<br>
<blockquote
cite="mid:CAO=oM6sA5L8VqwDas8AKTgj1V10KzFA6kakXaF4MSBgSune14A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div style=""><br>
</div>
<div style="">As for switching between unwind-based exceptions
and checked-status-code exceptions, I'm not quite sure I buy
that that can completely be done by the catching function,
since the throwing function also needs to use the matching
mechanism. <br>
</div>
</div>
</blockquote>
I think what we do at the moment is *always* set the 'pending
exception' flag, even if we're going to use the unwind-table-based
dispatching. As a result, any frame can decide to use either
mechanism. I'll point out, though, that this is purely an accident of
implementation; we didn't purposely design it this way. :)<br>
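<br>
A toy model of that arrangement, written in Python purely for illustration. ThreadState, Unwind, may_throw and the caller_has_landing_pad argument are all hypothetical names, and the argument only stands in for however the runtime decides between a plain return and unwinding. It is a sketch of the idea under those assumptions, not a description of the real implementation.<br>
<pre>

```python
class ThreadState:
    """Hypothetical per-thread runtime state holding the pending exception."""
    def __init__(self):
        self.pending = None

class Unwind(Exception):
    """Stands in for table-based unwinding into a landing pad."""

def may_throw(ts, fail, caller_has_landing_pad):
    if not fail:
        return 42
    ts.pending = ValueError("boom")   # the flag is always set first
    if caller_has_landing_pad:
        raise Unwind()                # dispatch through the unwinder
    return None                       # plain return; the caller tests the flag

def handle(ts):
    """Shared handler logic: consume and clear the pending exception."""
    exc, ts.pending = ts.pending, None
    return "handled " + type(exc).__name__

def caller_checked(ts, fail):
    result = may_throw(ts, fail, caller_has_landing_pad=False)
    if ts.pending is not None:        # explicit test and branch after the call
        return handle(ts)
    return result

def caller_unwind(ts, fail):
    try:
        return may_throw(ts, fail, caller_has_landing_pad=True)
    except Unwind:                    # no test on the non-throwing path
        return handle(ts)
```

</pre>
Because the flag is set on both paths, the handler body is the same in either caller; only the way control reaches it differs.<br>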
<br>
<br>
<blockquote
cite="mid:CAO=oM6sA5L8VqwDas8AKTgj1V10KzFA6kakXaF4MSBgSune14A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div style="">I think if you truly want to do this, you need to
compile separate variants of whatever functions you might call
(including whatever functions they might call), one for each
exception mechanism you want to use. I'm thinking about doing
this, but only for certain built-in functions that are
expected to throw a lot. Another option I'm thinking of is to
inline those particular functions and then create an
optimization pass that will know that py_throw always throws,
and stitch up the CFG appropriately. Anyway, lots to chew on,
thanks everyone for the responses!</div>
</div>
</blockquote>
I'll just mention that you really, really want to translate
throw/catch pairs in the same function into a direct jump where
possible. :) In fact, LLVM should be doing this for you during
inlining if you structure your IR properly. Are you not seeing this
in practice?<br>
<br>
<blockquote
cite="mid:CAO=oM6sA5L8VqwDas8AKTgj1V10KzFA6kakXaF4MSBgSune14A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div style=""><br>
</div>
<div style="">Aside about Python exceptions: Python has
interesting for loops, which are always for-each loops and
implement the termination condition using exceptions:</div>
<div style=""><br>
</div>
<div style="">PyObject *iterator; // what we're iterating over</div>
<div style="">while (true) {</div>
<div style="">&nbsp; PyObject *i;</div>
<div style="">&nbsp; try {</div>
<div style="">&nbsp; &nbsp; i = iterator-&gt;next();</div>
<div style="">&nbsp; } catch (StopIteration&amp;) {</div>
<div style="">&nbsp; &nbsp; break; // iterator exhausted</div>
<div style="">&nbsp; }</div>
<div style="">&nbsp; // do stuff</div>
<div style="">}</div>
<div style=""><br>
</div>
<div style="">Percentage-wise, throwing the StopIteration might
be rare, but I would wager that most loops get terminated this
way (as opposed to a "break" statement) so it's certainly not
never; I think this means the exception gets thrown enough
that it's better to handle the exception in-line rather than
do a deopt-on-throw. Microbenchmarks suggest that for-loop
overhead is important enough that it's further worth trying to
avoid any exception-related unwinding entirely, but I'm not
sure how true that is for larger programs (probably somewhat
true).</div>
</div>
</blockquote>
For this case in particular, you probably want to avoid throwing
exceptions at all. If you inline the next() function to expose the
throw, you should be able to convert the "throw; catch;" into a
branch to the exit block. This will really, really help your
performance compared to just about any other option.<br>
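<br>
At the source level, the shape of that rewrite looks roughly like the two loops below. This is only an analogy in Python: the real transformation would happen on the compiler's IR after inlining, and the function names and the _DONE sentinel are made up for the example. The second loop uses the built-in two-argument next(), which returns the default instead of raising.<br>
<pre>

```python
_DONE = object()   # hypothetical sentinel meaning "iterator exhausted"

def sum_with_exception(items):
    """Loop exit signalled by catching StopIteration."""
    it = iter(items)
    total = 0
    while True:
        try:
            value = next(it)
        except StopIteration:
            break
        total += value
    return total

def sum_with_branch(items):
    """Same loop, with the throw/catch pair replaced by a test and branch."""
    it = iter(items)
    total = 0
    while True:
        value = next(it, _DONE)
        if value is _DONE:      # plain branch to the exit block
            break
        total += value
    return total
```

</pre>
Both functions compute the same result; the second never raises or catches anything in the loop body.<br>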
<br>
Philip<br>
<blockquote
cite="mid:CAO=oM6sA5L8VqwDas8AKTgj1V10KzFA6kakXaF4MSBgSune14A@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div class="gmail_extra">
<div class="gmail_quote">On Fri, May 2, 2014 at 12:43 PM,
Sanjoy Das <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:sanjoy@azulsystems.com" target="_blank">sanjoy@azulsystems.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi
Kevin,<br>
<br>
To elaborate on Philip's point, depending on the state
Pyston's<br>
runtime already is in, you may have the choice of using a
hybrid of a<br>
"pending exception" word in your runtime thread structure,
and an<br>
implicit alternate ("exceptional") return address for
calls into<br>
functions that may throw. This lets you elide the check
on the<br>
pending exception word after calls by turning them into
invokes that<br>
unwind into a landingpad containing a generic exception
handler. This<br>
generic exception handler then checks the type of the
pending<br>
exception word and handles the exception (which may
involve rethrowing<br>
to the caller if the current frame doesn't have a catch<br>
handler).<br>
<br>
Instead of relying on libgcc to unwind when you throw you
can then<br>
parse the [call PC, generic exception handling PC] pairs
from the<br>
.eh_frame section, and when throwing to your caller, look
up the<br>
generic exception handling PC (using the call PC pushed on
the stack)<br>
and "return" to that instead. Rethrow is similar.<br>
<br>
This scheme has the disadvantage of "returning" through
every active<br>
frame on an exception throw, even if a particular frame
never had an<br>
exception handler and could've been skipped safely.
However, this<br>
scheme allows you to easily switch to one of two other
implementations<br>
based on profiling data on a per-callsite basis:<br>
<br>
1. high exception volume -- if an invoke has seen too
many exception<br>
throws, recompile by replacing the invoke with a call
followed by<br>
a test of "pending exception" and branch. The logic
to generate<br>
the branch target should largely be the same as logic
to generate<br>
the landing pad block.<br>
<br>
2. low exception volume -- keep the invoke, but put a
deoptimization<br>
trap in the landing pad block.<br>
<br>
We did some rough benchmarking, and using such implicit
exceptions<br>
(i.e. not explicitly checking the pending exception word)
reduces<br>
non-throwing call overhead by 20-25%. I don't have any
numbers on how<br>
it affects the performance of exceptional control flow
though.<br>
<span class=""><font color="#888888"><br>
-- Sanjoy<br>
<br>
</font></span></blockquote>
</div>
<br>
</div>
</div>
</blockquote>
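<br>
The handler-PC lookup described in the quoted message above can be modelled in a few lines of Python. Everything here is an assumption made for illustration: the table is a plain dictionary rather than something parsed from .eh_frame, the PCs are made-up numbers, and throw_through_frames and catching_handlers are hypothetical names.<br>
<pre>

```python
def build_handler_table(pairs):
    """Map each call PC to its generic exception-handling PC."""
    return dict(pairs)

def throw_through_frames(return_pcs, table, catching_handlers):
    """Walk the stack innermost-first, 'returning' to each frame's handler PC.

    return_pcs: call PCs pushed on the stack, outermost first.
    catching_handlers: handler PCs whose frames actually catch the exception.
    Returns the handler PC that caught (or None) and the handler PCs visited.
    """
    visited = []
    for call_pc in reversed(return_pcs):
        handler_pc = table[call_pc]       # lookup keyed by the pushed call PC
        visited.append(handler_pc)        # every active frame is entered
        if handler_pc in catching_handlers:
            return handler_pc, visited
        # otherwise the generic handler rethrows to its own caller
    return None, visited

# Three nested calls; only the outermost frame has a catch handler.
table = build_handler_table([(0x10, 0x100), (0x20, 0x200), (0x30, 0x300)])
caught, visited = throw_through_frames([0x10, 0x20, 0x30], table, {0x100})
```

</pre>
The visited list makes the stated disadvantage concrete: the two inner frames are entered even though neither has a handler of its own.<br>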
<br>
</body>
</html>